Application of SNP chips in the estimation of breeding values in dairy cattle

This project is carried out by Kacper Żukowski

Supervisor: Joanna Szyda

The profile of work

SNP microarrays (Single Nucleotide Polymorphism microarrays, SNP chips) are small, solid supports onto which sequences from thousands of different genes are immobilized, or attached, at fixed locations. The support itself is usually a glass microscope slide, about the size of two pinky fingers placed side by side, but it can also be a silicon chip or a nylon membrane. The DNA is printed, spotted, or synthesized directly onto the support. Placing many different genes/markers in separate spots makes it possible to determine tens of thousands of genotypes in a single experiment. Miniaturization has led to the production of chips that contain from several thousand to as many as five hundred thousand SNPs in one experiment. The most advanced investigations are in human genetics, where SNP microarrays are also used to profile somatic mutations in cancer.

Commercially available SNP chips for dairy cattle contain 50K SNPs. They are based on SNPs that were confirmed and published in 2004 with the first bovine physical map and in its supplement in 2006, and on polymorphisms identified in the Bovine HapMap project.

At present, no methods have been implemented for evaluating the breeding value of dairy cattle from SNP microarray data sets (the so-called genomic breeding value, GBV). The leading country in whole-genome studies towards GBV, which is a part of MAS (Marker Assisted Selection) in dairy cattle, is France, but similar investigations are also under way in Canada, the USA and the Netherlands.

The aim of the project

To develop a methodology and a package of computer programs for estimating GBV from SNP chip data.

The simulation of the data

Simulations include:

  • SNP genotypes of bulls and cows from a base generation,
  • SNP genotypes of young bulls and heifers, derived from their parents' genotypes,
  • trait values.
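The genotype part of such a simulation can be sketched as follows. This is only a minimal illustration, not the project's software: it assumes biallelic SNPs with alleles coded 0 and 1, treats all loci as unlinked (no recombination model), and uses hypothetical function names and allele frequencies.

```python
import random

def simulate_base_genotype(allele_freqs, rng):
    """Draw one base-generation animal: a pair of alleles (0 or 1) per SNP."""
    return [(int(rng.random() < p), int(rng.random() < p)) for p in allele_freqs]

def simulate_offspring(sire, dam, rng):
    """At every SNP the sire and the dam each pass on one of their two alleles.
    Loci are treated as unlinked, which is a simplifying assumption."""
    return [(rng.choice(s), rng.choice(d)) for s, d in zip(sire, dam)]

def allele_count(genotype):
    """Convert allele pairs to 0/1/2 counts of allele 1."""
    return [a + b for a, b in genotype]

rng = random.Random(2007)
freqs = [0.1, 0.5, 0.9, 0.3]   # hypothetical frequencies of allele 1
sire = simulate_base_genotype(freqs, rng)
dam = simulate_base_genotype(freqs, rng)
calf = simulate_offspring(sire, dam, rng)
print(allele_count(sire), allele_count(dam), allele_count(calf))
```

Repeating the offspring step over several generations, with parents chosen from the previous generation, would give the multi-generation pedigree described above.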

Parameters that can be modified in the simulations:

  • size of population,
  • number of generations,
  • frequency of SNP genotypes,
  • percentage of incorrect SNP genotypes,
  • type of trait under genetic evaluation (polygenic, with QTL, with epistasis),
  • level of inbreeding in the population.
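Two of these parameters, the percentage of incorrect genotypes and the type of trait, could enter a simulation roughly as shown below. The sketch assumes genotypes coded as 0/1/2 allele counts and a purely additive trait with a normal residual; the function names, the QTL positions and the effect sizes are hypothetical.

```python
import random

def add_genotyping_errors(counts, error_rate, rng):
    """With probability error_rate, replace a 0/1/2 genotype by a different value."""
    noisy = []
    for g in counts:
        if rng.random() < error_rate:
            g = rng.choice([x for x in (0, 1, 2) if x != g])
        noisy.append(g)
    return noisy

def simulate_trait(counts, qtl_effects, residual_sd, rng):
    """Additive trait: sum of (allele count x QTL effect) plus a normal residual.
    SNPs that are not listed in qtl_effects have no effect on the trait."""
    genetic = sum(g * qtl_effects.get(i, 0.0) for i, g in enumerate(counts))
    return genetic + rng.gauss(0.0, residual_sd)

rng = random.Random(1)
true_counts = [0, 1, 2, 1, 0, 2]
qtl_effects = {1: 0.5, 4: -0.3}   # hypothetical: SNPs 1 and 4 act as QTL
observed = add_genotyping_errors(true_counts, error_rate=0.02, rng=rng)
phenotype = simulate_trait(true_counts, qtl_effects, residual_sd=1.0, rng=rng)
print(observed, round(phenotype, 3))
```

The trait is generated from the true genotypes, while the analysis would see only the observed (possibly erroneous) ones. An epistatic trait would need an additional term for products of allele counts at pairs of loci.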

The analysis of the data

Before the real data are analysed, a database will be created using a multithreaded, multi-user SQL database management system. The database has to be adapted to the structure of the data received from SNP microarrays and contain the following information about each individual:

  • identification number of the individual (ID),
  • parent information (sire ID, dam ID),
  • sex,
  • birth year,
  • genotype data from SNP microarray,
  • breeding values estimated in the routine genetic evaluation,
  • yield deviations for all individuals (DYD for bulls, YD for cows).
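One possible layout for such a database is sketched below. The table and column names are hypothetical, and SQLite (through Python's built-in sqlite3 module) is used only as a stand-in so that the example runs without a server; the project itself calls for a multithreaded, multi-user SQL system. Storing one row per animal and SNP is an assumption that keeps the schema independent of the number of SNPs on the chip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animal (
    id              INTEGER PRIMARY KEY,
    sire_id         INTEGER REFERENCES animal(id),
    dam_id          INTEGER REFERENCES animal(id),
    sex             TEXT CHECK (sex IN ('M', 'F')),
    birth_year      INTEGER,
    breeding_value  REAL,   -- from the routine genetic evaluation
    yield_deviation REAL    -- DYD for bulls, YD for cows
);
CREATE TABLE genotype (
    animal_id    INTEGER NOT NULL REFERENCES animal(id),
    snp_id       INTEGER NOT NULL,
    allele_count INTEGER CHECK (allele_count IN (0, 1, 2)),  -- NULL = missing call
    PRIMARY KEY (animal_id, snp_id)
);
""")

# Hypothetical example records: one bull with three SNP genotypes.
conn.execute("INSERT INTO animal VALUES (1, NULL, NULL, 'M', 2001, 12.5, 310.0)")
conn.executemany("INSERT INTO genotype VALUES (?, ?, ?)",
                 [(1, 1, 0), (1, 2, 2), (1, 3, 1)])

rows = conn.execute(
    "SELECT snp_id, allele_count FROM genotype WHERE animal_id = ? ORDER BY snp_id",
    (1,),
).fetchall()
print(rows)
```

With several evaluated traits, the breeding values and yield deviations would more naturally go into a separate table keyed by animal and trait.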

Analysis of both simulated and real data sets

The simplest model will contain a single SNP genotype introduced as a fixed effect. Models with several SNPs will follow; in these models the effect of each SNP will be estimated separately.
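A single-SNP analysis of this kind can be illustrated with an ordinary least-squares regression of the trait on the genotype. The sketch below assumes additive coding (0/1/2 copies of one allele), which is only one way of introducing a genotype as a fixed effect, and it ignores all other effects that a full model would include; the data are made up.

```python
def snp_effect(counts, phenotypes):
    """Least-squares slope of the phenotype on the allele count (0/1/2) of one SNP."""
    n = len(counts)
    mean_g = sum(counts) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in counts)
    if sxx == 0:
        return 0.0   # monomorphic SNP: effect not estimable, reported as 0 here
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(counts, phenotypes))
    return sxy / sxx

# Rows are animals, columns are SNPs (hypothetical toy data).
genotypes = [[0, 2, 1], [1, 2, 0], [2, 2, 1], [1, 2, 2], [0, 2, 0]]
yield_dev = [1.0, 3.0, 5.0, 3.0, 1.0]

# Each SNP is analysed on its own, one column at a time.
effects = [snp_effect([row[j] for row in genotypes], yield_dev) for j in range(3)]
print(effects)
```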

Selection of SNPs for the final model is based on:

  • SNP effect on the variation of the analysed trait,
  • linkage disequilibrium with other SNPs.
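The second criterion concerns disequilibrium between pairs of SNPs. One common way to quantify it, assumed here for illustration, is the squared correlation (r²) between the allele counts at two SNPs. The greedy selection rule, the threshold and the function names below are hypothetical.

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between allele counts (0/1/2) at two SNPs."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    if saa == 0 or sbb == 0:
        return 0.0   # a monomorphic SNP carries no information on association
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return sab * sab / (saa * sbb)

def select_snps(columns, threshold):
    """Go through the SNPs in order and keep a SNP only if its r2 with every
    SNP kept so far does not exceed the threshold."""
    kept = []
    for j, col in enumerate(columns):
        if all(genotype_r2(col, columns[k]) <= threshold for k in kept):
            kept.append(j)
    return kept

# One list of allele counts per SNP, across six animals (hypothetical data).
snps = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],   # identical to SNP 0
    [2, 1, 0, 1, 2, 0],   # mirror image of SNP 0
    [0, 2, 1, 0, 1, 2],
]
kept = select_snps(snps, threshold=0.8)
print(kept)
```

In practice the SNPs would first be ordered by their estimated effect on the trait, so that the most informative SNP of each correlated group is the one retained.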

The final model will contain all selected polymorphisms. The selected SNPs will be used to build haplotypes for a model with a random SNP-haplotype effect, and for a non-parametric approach in which a random SNP effect is estimated using kernel functions or a "sliding window" approach.
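The "sliding window" idea can be illustrated by cutting a chromosome into overlapping runs of neighbouring SNPs and treating each run as one haplotype. The sketch assumes that the alleles are already phased (it is known which parent each allele came from); the window size, the step and the function name are hypothetical.

```python
def sliding_window_haplotypes(phased_alleles, window_size, step=1):
    """Return (start_index, haplotype_string) for each window along one
    phased chromosome, given as a list of 0/1 alleles in map order."""
    windows = []
    for start in range(0, len(phased_alleles) - window_size + 1, step):
        segment = phased_alleles[start:start + window_size]
        windows.append((start, "".join(str(a) for a in segment)))
    return windows

paternal = [0, 1, 1, 0, 1]   # hypothetical paternal alleles at five SNPs
windows = sliding_window_haplotypes(paternal, window_size=3)
print(windows)
```

Each distinct haplotype string within a window could then serve as one level of the random haplotype effect for that window.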

Summary

The use of SNP microarrays in the evaluation of dairy cattle traits will allow very early estimation of the traits that define the breeding value of bulls, on the basis of whole-genome studies. The resulting software will allow a considerable reduction of costs compared with the methods currently used to evaluate the breeding value of dairy cattle.

update 2007