Modelling QTL on BTA06 in dairy cattle using random regression
The profile of work
The project concerns QTL detection for milk production traits, allowing the QTL effect to vary across lactation. The final model is a mixed model with random regression, using polynomial functions for the additive polygenic effect and for the QTL effect. The analysis is based on a real data set from Holstein dairy cattle with a dense microsatellite marker map on BTA06 (14 markers covering 63.5 cM).
In the first part of the project I analyse simulated data. Once the models are well defined, I will analyse the real data set, which I received from Professor Quin Zhang.
The project consists of the following four parts:
Building a database with a structure corresponding to the databases used in national genetic evaluation centres, only with a smaller amount of data. I use MySQL to create this database. The database makes it easy to edit the data and export them in various formats suitable for different statistical packages. The database consists of the following components:
- a table with original and new animal identification numbers; the new number is assigned by the user,
- a table with pedigree structure, i.e. three columns with the animal, sire and dam numbers (all of them the new numbers assigned by the user),
- a table with covariates, i.e. new animal numbers, sex, date of birth, test-day dates and farms,
- a table with trait values, i.e. new animal numbers, protein and fat contents, protein, fat and milk yields, and Somatic Cell Score (SCS) at each test day,
- a table with marker data, i.e. new animal numbers and genotypes of the markers on BTA06.
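The five tables listed above could be laid out roughly as follows. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and all table names, column names, the function export_pedigree and the coding of unknown parents as 0 are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema mirroring the five components listed above.
SCHEMA = """
CREATE TABLE animal_id (
    new_id      INTEGER PRIMARY KEY,      -- number assigned by the user
    original_id TEXT NOT NULL UNIQUE      -- identification as received
);
CREATE TABLE pedigree (
    animal INTEGER PRIMARY KEY REFERENCES animal_id(new_id),
    sire   INTEGER REFERENCES animal_id(new_id),  -- NULL if unknown
    dam    INTEGER REFERENCES animal_id(new_id)   -- NULL if unknown
);
CREATE TABLE covariate (
    animal     INTEGER REFERENCES animal_id(new_id),
    sex        TEXT,
    birth_date TEXT,
    test_date  TEXT,
    farm       TEXT,
    PRIMARY KEY (animal, test_date)
);
CREATE TABLE trait (
    animal      INTEGER REFERENCES animal_id(new_id),
    test_date   TEXT,
    protein_pct REAL, fat_pct REAL,
    protein_kg  REAL, fat_kg REAL, milk_kg REAL,
    scs         REAL,
    PRIMARY KEY (animal, test_date)
);
CREATE TABLE marker (
    animal  INTEGER REFERENCES animal_id(new_id),
    marker  TEXT,
    allele1 INTEGER, allele2 INTEGER,
    PRIMARY KEY (animal, marker)
);
"""


def build_schema(conn):
    """Create the five tables in an open database connection."""
    conn.executescript(SCHEMA)


def export_pedigree(conn):
    """Return (animal, sire, dam) rows; unknown parents are coded as 0 (assumed convention)."""
    rows = conn.execute(
        "SELECT animal, COALESCE(sire, 0), COALESCE(dam, 0) "
        "FROM pedigree ORDER BY animal"
    )
    return [tuple(r) for r in rows]
```

Keeping the renumbering in its own table means every other table refers only to the new numbers, so an export for a statistical package is a single query per table.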
Monte Carlo simulations, which make it possible to determine the properties of the statistical models used (power, type I error) and of the estimators (bias, consistency). The simulations include:
- simulating a matrix of allele frequencies for the markers and the QTL,
- simulating haplotypes for animals from the base generation,
- simulating haplotypes for offspring, based on the haplotypes of their parents,
- simulating trait values: on a whole-lactation basis and on a correlated test-day basis,
- statistical analysis of the simulated data.
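The first three simulation steps can be sketched as below. This is a minimal illustration under stated assumptions, not the procedure actually used in the project: allele frequencies are drawn at random and normalised, base haplotypes are sampled locus by locus (that is, assuming linkage equilibrium), and recombination between adjacent loci follows Haldane's mapping function. The function names and the evenly spaced map in the usage note are hypothetical; the QTL can be treated as one more locus on the map.

```python
import math
import random


def haldane(distance_cm):
    """Recombination rate for a map distance in cM (Haldane's mapping function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def simulate_frequencies(n_loci, n_alleles, rng):
    """Matrix of allele frequencies: one row per locus, each row sums to 1."""
    freqs = []
    for _ in range(n_loci):
        weights = [rng.random() for _ in range(n_alleles)]
        total = sum(weights)
        freqs.append([w / total for w in weights])
    return freqs


def base_haplotype(freqs, rng):
    """Haplotype of a base-generation animal, sampled locus by locus."""
    return [rng.choices(range(len(f)), weights=f)[0] for f in freqs]


def gamete(hap_a, hap_b, positions_cm, rng):
    """Haplotype passed to an offspring by a parent carrying hap_a and hap_b."""
    parental = (hap_a, hap_b)
    current = rng.randrange(2)          # strand copied at the first locus
    out = [parental[current][0]]
    for i in range(1, len(hap_a)):
        r = haldane(positions_cm[i] - positions_cm[i - 1])
        if rng.random() < r:            # crossover between loci i-1 and i
            current = 1 - current
        out.append(parental[current][i])
    return out
```

For example, with rng = random.Random(1) and positions = [i * 63.5 / 13 for i in range(14)] (14 evenly spaced loci, an assumed map), an offspring receives gamete(*sire, positions, rng) and gamete(*dam, positions, rng), where sire and dam are each a pair of base haplotypes.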
Calculating DYD (daughter yield deviation) and YD (yield deviation) and their precisions. These values will be used as the dependent variable in the analysis.
- YD is the yield deviation of a cow,
- DYD is defined as the weighted average of the YDs of a bull's daughters, adjusted for their dams' breeding values.
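As an illustration of the DYD definition above, the sketch below takes a weighted mean of the daughters' YDs after subtracting half of each dam's breeding value. The weights, the factor 0.5 and the function name are assumptions; scaling conventions for DYD differ between evaluation systems, so this shows only the general shape of the calculation.

```python
def daughter_yield_deviation(daughters):
    """Weighted mean of daughters' YDs adjusted for the dams' breeding values.

    `daughters` is a list of (yd, dam_ebv, weight) tuples, one per daughter.
    Half of the dam's breeding value is subtracted, since a dam transmits
    half of her additive genetic value (assumed form of the adjustment).
    """
    total_weight = sum(w for _, _, w in daughters)
    if total_weight <= 0:
        raise ValueError("sum of weights must be positive")
    adjusted = sum(w * (yd - 0.5 * dam_ebv) for yd, dam_ebv, w in daughters)
    return adjusted / total_weight
```

With two equally weighted daughters, one with YD 10 out of a dam with breeding value 4 and one with YD 6 out of a dam with breeding value 0, the adjusted deviations are 8 and 6, and the DYD is their mean.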
Estimating breeding values for individuals and QTL effects using the following models:
- model with a fixed QTL effect,
- model with a random QTL effect constant across lactation,
- model with random regression for polygenic and QTL effects.
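To show what the random regression model adds over a constant QTL effect, the sketch below builds polynomial covariates for a given day in milk and evaluates an effect curve from its regression coefficients. The text only says "polynomial functions"; normalised Legendre polynomials, the polynomial order and the lactation range of 5 to 305 days are assumptions chosen because they are common in test-day models, and the function names are hypothetical.

```python
import math


def legendre_covariates(dim, order=2, dim_min=5, dim_max=305):
    """Normalised Legendre polynomials of degree 0..order at a day in milk."""
    # Standardise days in milk to the interval [-1, 1].
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    # Recurrence: (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    p = [1.0, x]
    for n in range(1, order):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    # Scale each polynomial by sqrt((2k + 1) / 2) so it has unit norm.
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]


def effect_at(dim, coefficients, dim_min=5, dim_max=305):
    """Effect on a given day: sum of coefficient times covariate."""
    phi = legendre_covariates(dim, len(coefficients) - 1, dim_min, dim_max)
    return sum(c * f for c, f in zip(coefficients, phi))
```

A QTL whose only non-zero coefficient is the intercept has the same effect on every day, which corresponds to the model with a QTL effect constant across lactation; non-zero higher-order coefficients let the effect change during lactation.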
update 2007