Angus Nelore Jersey
Breed
 SNPs are chosen based on location – they are evenly spaced
throughout the genome. Results: Redundant coverage on large
groups of SNPs in LD (blue, light green); insufficient coverage
on smaller groups of SNPs in LD (red, light blue, dark red).
Array Plate indicated with blue.
Average sample call rate
Mendelian consistency
Reproducibility
99.62%
99.96%
99.94%
Development of a high-throughput high-density
SNP genotyping array for bovine
Ali Pirani, Julia Montgomery, Brant Wong, Yontao Lu, Yiping Zhan, Fiona
Brew, Christopher Davies, Anne Ferguson, and Teresa A. Webster
Affymetrix, Inc. Santa Clara, CA 95051 USA
Background: Simultaneous genotyping of many single nucleotide
polymorphisms (SNPs) has been made possible by the development of
array-based hybridization platforms. High-density genotyping
empowered the success of genome-wide association studies for
determination of genetic variation affecting complex traits. Medium-
density studies in agricultural organisms have been beneficial in marker-
assisted breeding, mapping quantitative trait loci, and other
applications. Now there is interest in expanding to higher-density
options to further refine and develop these techniques. The bovine
research community has undertaken a massive genome sequencing and
SNP screening effort to develop a comprehensive solution for a versatile,
high-throughput high-density bovine SNP genotyping array.
Experimental design: The Affymetrix Bovine Consortium, consisting of
academic researchers and breeding groups, has combined efforts to
sequence the genomes of 15 bovine breeds from Bos indicus and Bos
taurus on three next-generation sequencing platforms to produce an
extensive collection of SNPs and their genotypes. Screening arrays were
designed using a method to optimize physical coverage of a subset of
these SNPS across the sequenced genomes. A diverse set of 384
samples, representing taurine, indicine, tropical taurine, and Asian
breeds, was obtained from the HapMap collection and other
collaborators and screened on the screening arrays on the high-
throughput Affymetrix Axiom™ Genotyping Solution. 3 million high-
performance SNPs were validated.
Results: A multi-breed relevant subset of over 648,000 SNPs has been
selected for the Axiom™ Genome-Wide BOS 1 Array Plate. These SNPs
were selected using a genetic selection method to obtain greater than
90 percent genetic coverage of many beef and dairy cattle breeds.
Conclusion: Affymetrix has developed a comprehensive collection of
SNPs, which genetically cover the LD blocks of multiple breeds. This SNP
collection can be utilized for customized breed-specific array
development. Future breed-specific screens will add to this pool of SNPs
with validated genotyping performance. These high-density genotyping
assays will support marker assisted breeding and other applications in
cattle.
Breed segregation
Bos taurus
Mix breeds
The SNP screens
On the Affymetrix Axiom™
Genotyping Solution, SNP screening
was carried out for SNPs identified
by the sequencing effort. For SNPs
to validate, they must be robust,
measured by performance metrics,
such as genotype call rate, cluster
separation, reproducibility, etc., and
they must exhibit at least 2
examples of the minor allele.
Japanese Black
Jersey
Limousin
Nelore
Norwegian red
1.1M
1.2M
1.4M
2.3M
1M
0.24
0.23
0.21
0.22
0.39
Genetic coverage
2500000
2000000
1500000
Bos indicus
Figure 1: Principal component analysis shows the breeds segregate based on
subpopulations. Cladogram from Decker, J. E., et al. PNAS 106:18644-18649
(2009).
Genetic SNP selection
LD calculations were carried out based on genotypes obtained from our
SNP screens for 10 breeds. To choose SNPs for the Axiom genotyping panel
based on the pool of SNPs with validated genotyping performance, genetic
coverage for both common and rare SNPs are considered, with the
objective to genetically cover the 3 million validated SNPs. Additional
factors, including a genotyping performance metric, the number of features
each SNP requires on the chip, etc., are also considered. The SNP
selection/ranking was achieved in a greedy approach derived from the work
of Carlson, et al1. At each step, a single SNP that is considered to be the
most valuable given the genetic coverage already achieved in breeds of
interest is selected. The SNP selection goes on until maximal genetic
coverage for selected breeds is achieved or enough SNPs are selected. A
separate set of SNPs considered to be of important biological value is also
included in the Axiom array design.
Table 3: Test set performance. Validated SNPs are polymorphic and pass
performance criteria for all the breeds and samples tested. Additional SNPs may be
validated on a per-breed basis. These SNPs were used for sample performance and
coverage calculations.
1. Carlson C. S., Eberle M. A., Rieder M. J., Yi Q., Kruglyak L., Nickerson D. A. Selecting a maximally informative set of
single-nucleotide polymorphisms for association analyses using linkage disequilibrium. American Journal of Human
Genetics 74:106–20 (2004).
Lim-
ousin
88.4
Romag-
nola
92.5
Brah-
man
78.8
Gir
80.8
Hol- Fleck- Here-
stein vieh ford
98.1 98.8 87.4 96.9 91.2 95.4
Table 2: Genetic coverage for each breed (%).
1000000
500000
0
Genetic selection
Figure 2: Number of polymorphic
 SNPs are chosen because they accurately represent groups SNPs. SNPs genetically covered on
of SNPs in LD. Result: The fewest number of SNPs is used to the Axiom™ Genome-Wide BOS 1
cover the known genetic variation of a population.
Axiom™ Genome -Wide BOS 1 Array Plate performanc
Test set
Number of SNPs 648,855
Number of validated SNPs 618,345
Sample pass rate 99.94%
Genetic selection
 Each validated SNP is indicated by a bar; the horizontal line
represents the chromosome. SNPs in LD are the same color.
Physical selection
Ran screen with array plate format
20 breeds; ~400 samples
Process
Sequencing effort by the
Affymetrix Bovine Consortium
Filtered to SNPs that physically cover the genome
and were identified in multiple breeds
Axiom Bovine
Genomic
Database
~ 3M & growing
validated SNPs
Axiom™ Genome-Wide BOS 1 Array Plate
Breed screen information
Poly. Avg.
SNPs MAF
Afrikander 1.4M 0.27
Angus 1.4M 0.21
Ayrshire 0.96M 0.29
Blonde d'Aquitaine 0.99M 0.27
Boran 2.2M 0.26
Brahman 2.4M 0.23
Brown Swiss 0.99M 0.28
Simmental 1.4M 0.21
Gir 2.1M 0.24
Hanwoo 1.3M 0.25
Hereford 1.2M 0.22
Holstein 1.6M 0.21
Genetic
SNP
selection
Rouge des Prés 0.7M 0.33
Romagnola 1.6M 0.20
Tuli 1.4M 0.29
Table 1: Polymorphic SNPs per breed
in the 3M validated list. Average minor
allele frequency for unrelated samples.

Development of a high-throughput high-density SNP genotyping array for bovine

  • 1.
    Angus Nelore Jersey Breed SNPs are chosen based on location – they are evenly spaced throughout the genome. Results: Redundant coverage on large groups of SNPs in LD (blue, light green); insufficient coverage on smaller groups of SNPs in LD (red, light blue, dark red). Array Plate indicated with blue. Average sample call rate Mendelian consistency Reproducibility 99.62% 99.96% 99.94% Development of a high-throughput high-density SNP genotyping array for bovine Ali Pirani, Julia Montgomery, Brant Wong, Yontao Lu, Yiping Zhan, Fiona Brew, Christopher Davies, Anne Ferguson, and Teresa A. Webster Affymetrix, Inc. Santa Clara, CA 95051 USA Background: Simultaneous genotyping of many single nucleotide polymorphisms (SNPs) has been made possible by the development of array-based hybridization platforms. High-density genotyping empowered the success of genome-wide association studies for determination of genetic variation affecting complex traits. Medium- density studies in agricultural organisms have been beneficial in marker- assisted breeding, mapping quantitative trait loci, and other applications. Now there is interest in expanding to higher-density options to further refine and develop these techniques. The bovine research community has undertaken a massive genome sequencing and SNP screening effort to develop a comprehensive solution for a versatile, high-throughput high-density bovine SNP genotyping array. Experimental design: The Affymetrix Bovine Consortium, consisting of academic researchers and breeding groups, has combined efforts to sequence the genomes of 15 bovine breeds from Bos indicus and Bos taurus on three next-generation sequencing platforms to produce an extensive collection of SNPs and their genotypes. Screening arrays were designed using a method to optimize physical coverage of a subset of these SNPS across the sequenced genomes. A diverse set of 384 samples, representing taurine, indicine, tropical taurine, and Asian breeds, was obtained from the HapMap collection and other collaborators and screened on the screening arrays on the high- throughput Affymetrix Axiom™ Genotyping Solution. 3 million high- performance SNPs were validated. Results: A multi-breed relevant subset of over 648,000 SNPs has been selected for the Axiom™ Genome-Wide BOS 1 Array Plate. These SNPs were selected using a genetic selection method to obtain greater than 90 percent genetic coverage of many beef and dairy cattle breeds. Conclusion: Affymetrix has developed a comprehensive collection of SNPs, which genetically cover the LD blocks of multiple breeds. This SNP collection can be utilized for customized breed-specific array development. Future breed-specific screens will add to this pool of SNPs with validated genotyping performance. These high-density genotyping assays will support marker assisted breeding and other applications in cattle. Breed segregation Bos taurus Mix breeds The SNP screens On the Affymetrix Axiom™ Genotyping Solution, SNP screening was carried out for SNPs identified by the sequencing effort. For SNPs to validate, they must be robust, measured by performance metrics, such as genotype call rate, cluster separation, reproducibility, etc., and they must exhibit at least 2 examples of the minor allele. Japanese Black Jersey Limousin Nelore Norwegian red 1.1M 1.2M 1.4M 2.3M 1M 0.24 0.23 0.21 0.22 0.39 Genetic coverage 2500000 2000000 1500000 Bos indicus Figure 1: Principal component analysis shows the breeds segregate based on subpopulations. Cladogram from Decker, J. E., et al. PNAS 106:18644-18649 (2009). Genetic SNP selection LD calculations were carried out based on genotypes obtained from our SNP screens for 10 breeds. To choose SNPs for the Axiom genotyping panel based on the pool of SNPs with validated genotyping performance, genetic coverage for both common and rare SNPs are considered, with the objective to genetically cover the 3 million validated SNPs. Additional factors, including a genotyping performance metric, the number of features each SNP requires on the chip, etc., are also considered. The SNP selection/ranking was achieved in a greedy approach derived from the work of Carlson, et al1. At each step, a single SNP that is considered to be the most valuable given the genetic coverage already achieved in breeds of interest is selected. The SNP selection goes on until maximal genetic coverage for selected breeds is achieved or enough SNPs are selected. A separate set of SNPs considered to be of important biological value is also included in the Axiom array design. Table 3: Test set performance. Validated SNPs are polymorphic and pass performance criteria for all the breeds and samples tested. Additional SNPs may be validated on a per-breed basis. These SNPs were used for sample performance and coverage calculations. 1. Carlson C. S., Eberle M. A., Rieder M. J., Yi Q., Kruglyak L., Nickerson D. A. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. American Journal of Human Genetics 74:106–20 (2004). Lim- ousin 88.4 Romag- nola 92.5 Brah- man 78.8 Gir 80.8 Hol- Fleck- Here- stein vieh ford 98.1 98.8 87.4 96.9 91.2 95.4 Table 2: Genetic coverage for each breed (%). 1000000 500000 0 Genetic selection Figure 2: Number of polymorphic  SNPs are chosen because they accurately represent groups SNPs. SNPs genetically covered on of SNPs in LD. Result: The fewest number of SNPs is used to the Axiom™ Genome-Wide BOS 1 cover the known genetic variation of a population. Axiom™ Genome -Wide BOS 1 Array Plate performanc Test set Number of SNPs 648,855 Number of validated SNPs 618,345 Sample pass rate 99.94% Genetic selection  Each validated SNP is indicated by a bar; the horizontal line represents the chromosome. SNPs in LD are the same color. Physical selection Ran screen with array plate format 20 breeds; ~400 samples Process Sequencing effort by the Affymetrix Bovine Consortium Filtered to SNPs that physically cover the genome and were identified in multiple breeds Axiom Bovine Genomic Database ~ 3M & growing validated SNPs Axiom™ Genome-Wide BOS 1 Array Plate Breed screen information Poly. Avg. SNPs MAF Afrikander 1.4M 0.27 Angus 1.4M 0.21 Ayrshire 0.96M 0.29 Blonde d'Aquitaine 0.99M 0.27 Boran 2.2M 0.26 Brahman 2.4M 0.23 Brown Swiss 0.99M 0.28 Simmental 1.4M 0.21 Gir 2.1M 0.24 Hanwoo 1.3M 0.25 Hereford 1.2M 0.22 Holstein 1.6M 0.21 Genetic SNP selection Rouge des Prés 0.7M 0.33 Romagnola 1.6M 0.20 Tuli 1.4M 0.29 Table 1: Polymorphic SNPs per breed in the 3M validated list. Average minor allele frequency for unrelated samples.