SlideShare a Scribd company logo
Applica t i o n Brief
Abstract
High-density arrays that simultaneously genotype hundreds of thousands of single nucleotide polymorphisms (SNPs) can theoretically be
a powerful tool for population genetics studies, whether for learning about human history or natural selection. However, SNP arrays
designed for medical genetics have been limited in their utility for population genetics because the polymorphisms on the arrays were
discovered and selected for inclusion in a complicated manner that is difficult to model. This “ascertainment bias” has severely limited the
types of population genetic analyses that can be carried out with SNP arrays. Here we report on the first SNP array for human population
genetics studies, developed in collaboration with Harvard Medical School.
A total of 1.81 million candidate SNPs, all from genome locations covered by sequencing reads from Neanderthals, Denisovans, and
chimpanzees, were ascertained using a simple SNP discovery procedure first described by Keinan, et al., 2007.1 The most important
ascertainment involved using whole-genome shotgun sequencing data to discover differences between the two chromosomes carried by
individuals from 11 populations (San, Yoruba, Mbuti, French, Sardinian, Han, Cambodian, Mongolian, Karitiana, Papuan, and Bougainville).
This resulted in 13 panels of SNPs ascertained in simple ways that are appropriate for population genetic analysis (Figure 1). An Axiom®
genotyping screen was carried to validate 1.35 million SNPs on the Axiom platform (Table 2). A total of 542,399 SNPs were selected
that produced genotypes that passed rigorous quality thresholds appropriate for inclusion in a commercial array design. To facilitate
joint analysis with other data sets that have been collected on diverse populations, the final array contains an additional approximately
87,044 SNPs that overlap between the Affymetrix® Genome-Wide Human SNP Array 6.0 and the Illumina® 650Y Array.
We genotyped 943 unrelated samples from 53 populations in the CEPH-Human Genome Diversity Panel (Cann, et al., 20022), and have
made the data freely available in the CEPH-HGDP database (Table 1). In addition, we report on the genotyping of approximately 400
samples from 70 other diverse worldwide populations. Preliminary analyses demonstrate the promise of this SNP array for population
genetics. As examples, we use the data to provide a new line of evidence for gene flow from Neanderthals into modern humans (Figure 2).
1
A SNP array for human population genetics studies
Yontao Lu1
, Teri Genschoreck1
, Swapan Mallick2,3
, Amy Ollmann1
, Nick Patterson3
, Yiping Zhan1
,
Teresa Webster1
, David Reich2,3
1. Affymetrix, Inc., 3420 Central Expressway, Santa Clara, CA 95051
2. Harvard Medical School Department of Genetics, Boston, MA 02115
3. Broad Institute of Harvard and MIT, Cambridge, MA 02142
2
1.81M
candidate
SNPs
Sequencing
data from
individual N
SNPs
ascertained
from
individual N
1.35M
candidate
SNPs
to validate
B. Phase I SNP validation
6 replicates of
individual N
SNPs
passed
6 replicates of
a chimp
6 replicates of
a bonobo
600K
Phase I
validated
SNPs
SNPs
identified in
individual N
C. Phase II SNP validation and the final design of the array
542K
Phase II
validated
SNPs
600K
Phase I
validated
SNPs
952
HGDP-CEPH
samples
Compatibility
SNPs 87K
84.7K SNP6 and 650Y
2.1K chrY
259 chrMT
Discover
SNPs from
individual N
Map to
PanTro2
Combine
SNPs from 13
panels
SNP
selection
criteria for
validation
Validated
SNPs from 13
panels
Additional
Phase I
validation
criteria
Phase I
validation
criteriaGenotype
Genotype
Figure 1: From candidate SNPs ascertained from 13 panels to the Axiom®
Human Origins Array
A. Select 1.35M of 1.81M candidate SNPs for validation
Figure 1: An overview of the Axiom Human Origins Array development.
A. Sequencing data were mapped to the PanTro2 reference genome, and heterozygous sites were identified. SNPs in Panels 1-12 were discovered
using the strategy described by Keinan, et al.1 For Panel 13, SNPs were ascertained by considering one allele from a San individual and one allele
from Denisova. SNPs whose ancestral and derived alleles were from Denisova and the San individual, respectively, were kept.
B. Phase I SNP validation: Candidate SNPs discovered in Panel N (N:1-13) were tested by genotyping 18 samples (6 x chimpanzee, 6 x bonobo, and 6
x individual N). During Phase I validation, 1.35M of 1.81M unique candidate SNPs were tested. A total of 599K SNPs were validated in at least one
of the 13 panels.
C. Phase II SNP validation: SNPs that passed Phase I validation were further tested against 952 unrelated HGDP-CEPH samples to evaluate the
performance. 943 samples and 542K SNPs passed the validation criteria. An additional 87K SNPs were added to the Axiom Human Origins Array
to allow haplotype inference at mitochondrial DNA and the Y chromosome and to provide overlap with previous Affymetrix and Illumina
genotyping arrays. The final design contains 629K SNPs.
Details about SNP ascertainment, array design, and SNP validation are described in the technical design document.3
3
Table 2: SNPs available in the Axiom Human Origins Array
Ascertainment
panel Sample ID
Sequencing
depth
No. of Candidate
SNPs found
No. of SNPs placed
on screening arrays
No. of Phase I
validated SNPs
No. of Phase II
validated SNPs
1 – French HGDP00521 4.4 333,492 241,707 123,574 111,970
2 – Han HGDP00778 3.8 281,819 204,841 87,515 78,253
3 – Papuan1 HGDP00542 3.6 312,941 232,408 56,518 48,531
4 – San HGDP01029 5.9 548,189 401,052 185,066 163,313
5 – Yoruba HGDP00927 4.3 412,685 302,413 136,759 124,115
6 – Mbuti HGDP00456 1.2 39,178 28,532 14,435 12,162
7 – Karitiana HGDP00998 1.1 12,449 8,535 3,619 2,635
8 – Sardinian HGDP00665 1.3 40,826 29,358 15,260 12,922
9 – Melanesian HGDP00491 1.5 51,237 36,392 17,723 14,988
10 – Cambodian HGDP00711 1.7 53,542 38,399 20,129 16,987
11 – Mongolian HGDP01224 1.4 35,087 24,858 12,872 10,757
12 – Papuan2 HGDP00551 1.4 40,996 29,305 14,739 12,117
13 – Denisova-San
Denisova-
HGDP01029 – 418,841 308,210 166,422 151,435
Unique SNPs from 13 panels 1,812,990 1,354,003 599,175 542,399
Compatibility SNPs NA NA NA 87,044
Table 2: Numbers of SNPs at different stages for different panels. Genotypes of 629,443 SNPs across 943 unrelated HGDP-CEPH individuals
(Harvard HGDP-CEPH Genotypes) were available at http://www.cephb.fr/en/hgdp/
Table 1: Performance metrics for the Axiom® Human Origins Array
Performance metric results Overlap with other arrays
Sample pass rate 98.90%
Sample call rate 99.56%
Sample concordance 99.86%
Reproducibility 99.86%
Table 1: Performance metrics for the Axiom Human Origins Array. Numbers are derived from genotyping 952 HGDP-CEPH samples and four HapMap
samples (replicated 11 times each).
A. Example allele frequency spectra B. Han spectrum if Denisova is
derived, Neanderthal ancestral
C. Han spectrum if Neanderthal is
derived, Denisovan ancestral
Denisova=Derived, Neanderthal=Ancestral Neanderthal=Derived, Denisova=Ancestral
References
1. Keinan A., et al. Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nature Genetics 39(10):1251-5
(2007).
2. Cann H. M., et al. A human genome diversity cell line panel. Science 296(5566):261-2 (2002).
3. Lu Y., et al. Technical design document for a SNP array that is optimized for population genetics.
ftp://ftp.cephb.fr/hgdp_supp10/8_12_2011_Technical_Array_Design_Document.pdf
Figure 2: Allele frequency spectra
A. Allele frequency spectra for six populations, based on ascertaining in two chromosomes from an individual in that population. The highest rate of rare alleles
is in Yoruba, reflecting large population sizes in this group, whereas Karitiana have the lowest rate of rare alleles, indicating strong bottlenecks.
B. Han allele frequency spectrum at sites where Denisova carries the derived allele and Neanderthal is ancestral. We observe the upside-down
U expected if these were polymorphic in the ancestral population.
C. Han allele frequency spectrum where Neanderthal carries the derived allele and Denisova is ancestral. We see an excess of low-frequency alleles
compared with B, which can only be explained by recent gene flow from Neanderthals into modern humans.
Figure 2: Preliminary results
Affymetrix, Inc. Tel: +1-888-362-2447  Affymetrix UK Ltd. Tel: +44-(0)-1628-552550  Affymetrix Japan K.K. Tel: +81-(0)3-6430-4020
Panomics Solutions Tel: +1-877-726-6642 panomics.affymetrix.com  USB Products Tel: +1-800-321-9322 usb.affymetrix.com
www.affymetrix.com Please visit our website for international distributor contact information.
“For Research Use Only. Not for use in diagnostic procedures.”
P/N DNA01075 Rev. 1
©Affymetrix, Inc. All rights reserved. Affymetrix® , Axiom® , Command Console® , CytoScan® , DMET™ , GeneAtlas® , GeneChip® , GeneChip-compatible™ , GeneTitan® , Genotyping Console™ , myDesign™ , NetAffx® ,
OncoScan™ , Powered by Affymetrix™ , PrimeView® , Procarta® , and QuantiGene® are trademarks or registered trademarks of Affymetrix, Inc. All other trademarks are the property of their respective owners.
Products may be covered by one or more of the following patents: U.S. Patent Nos. 5,445,934; 5,744,305; 5,945,334; 6,140,044; 6,399,365; 6,420,169; 6,551,817; 6,733,977; 7,629,164;
7,790,389 and D430,024 and other U.S. or foreign patents. Products are manufactured and sold under license from OGT under 5,700,637 and 6,054,270.
4

More Related Content

What's hot

SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...
SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...
SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...
ICRISAT
 
Biomarkers: Next Generation Sequencing and Updates on NTRK and ctDNA
Biomarkers: Next Generation Sequencing and Updates on NTRK and ctDNABiomarkers: Next Generation Sequencing and Updates on NTRK and ctDNA
Biomarkers: Next Generation Sequencing and Updates on NTRK and ctDNA
Fight Colorectal Cancer
 
MS thesis presentation_FINAL
MS thesis presentation_FINALMS thesis presentation_FINAL
MS thesis presentation_FINAL
Tom Hajek
 
The trivial case of the missing heritability
The trivial case of the missing heritabilityThe trivial case of the missing heritability
The trivial case of the missing heritability
Max Moldovan
 
Eshg poster roman-naranjo
Eshg poster roman-naranjoEshg poster roman-naranjo
Eshg poster roman-naranjo
Pablo Román-Naranjo Varela
 
Tyler future of genomics thurs 0920
Tyler future of genomics thurs 0920Tyler future of genomics thurs 0920
Tyler future of genomics thurs 0920
Sucheta Tripathy
 
Mobile CRISPRi
Mobile CRISPRiMobile CRISPRi
Mobile CRISPRi
Nikunj tyagi
 
Research project
Research project Research project
Research project
Dingquan Yu
 
West Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and Lucanix
West Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and LucanixWest Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and Lucanix
West Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and Lucanix
H. Jack West
 
CDAC 2018 Boeva discovery
CDAC 2018 Boeva discoveryCDAC 2018 Boeva discovery
CDAC 2018 Boeva discovery
Marco Antoniotti
 
Exp3 lab report revised
Exp3 lab report revisedExp3 lab report revised
Exp3 lab report revised
Jason Gramling
 
PURE poster
PURE posterPURE poster
PURE poster
Aparna Singh
 
Qpcr
QpcrQpcr
High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...
High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...
High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...
QIAGEN
 
Bioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesisBioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesis
Despoina Kalfakakou
 
Techniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodTechniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-good
rcolatru
 
Proof of concept of WGS based surveillance: meningococcal disease
Proof of concept of WGS based surveillance: meningococcal diseaseProof of concept of WGS based surveillance: meningococcal disease
Proof of concept of WGS based surveillance: meningococcal disease
European Centre for Disease Prevention and Control (ECDC)
 
MIMG 199 P. acnes Poster Final - Lauren and Rachelle
MIMG 199 P. acnes Poster Final - Lauren and RachelleMIMG 199 P. acnes Poster Final - Lauren and Rachelle
MIMG 199 P. acnes Poster Final - Lauren and Rachelle
Rachelle Ann Gonzales
 
nikolakopoulou et al., 2006a
nikolakopoulou et al., 2006anikolakopoulou et al., 2006a
nikolakopoulou et al., 2006a
Mariangela Nikolakopoulou
 

What's hot (19)

SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...
SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...
SNP genotyping using Illumina BeadXpress for germplasm diversity studies in c...
 
Biomarkers: Next Generation Sequencing and Updates on NTRK and ctDNA
Biomarkers: Next Generation Sequencing and Updates on NTRK and ctDNABiomarkers: Next Generation Sequencing and Updates on NTRK and ctDNA
Biomarkers: Next Generation Sequencing and Updates on NTRK and ctDNA
 
MS thesis presentation_FINAL
MS thesis presentation_FINALMS thesis presentation_FINAL
MS thesis presentation_FINAL
 
The trivial case of the missing heritability
The trivial case of the missing heritabilityThe trivial case of the missing heritability
The trivial case of the missing heritability
 
Eshg poster roman-naranjo
Eshg poster roman-naranjoEshg poster roman-naranjo
Eshg poster roman-naranjo
 
Tyler future of genomics thurs 0920
Tyler future of genomics thurs 0920Tyler future of genomics thurs 0920
Tyler future of genomics thurs 0920
 
Mobile CRISPRi
Mobile CRISPRiMobile CRISPRi
Mobile CRISPRi
 
Research project
Research project Research project
Research project
 
West Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and Lucanix
West Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and LucanixWest Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and Lucanix
West Immunotherapy, Vaccines for Lung Cancer Mage-A3, Stimuvax, and Lucanix
 
CDAC 2018 Boeva discovery
CDAC 2018 Boeva discoveryCDAC 2018 Boeva discovery
CDAC 2018 Boeva discovery
 
Exp3 lab report revised
Exp3 lab report revisedExp3 lab report revised
Exp3 lab report revised
 
PURE poster
PURE posterPURE poster
PURE poster
 
Qpcr
QpcrQpcr
Qpcr
 
High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...
High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...
High Resolution Outbreak Tracing and Resistance Detection using Whole Genome ...
 
Bioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesisBioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesis
 
Techniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodTechniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-good
 
Proof of concept of WGS based surveillance: meningococcal disease
Proof of concept of WGS based surveillance: meningococcal diseaseProof of concept of WGS based surveillance: meningococcal disease
Proof of concept of WGS based surveillance: meningococcal disease
 
MIMG 199 P. acnes Poster Final - Lauren and Rachelle
MIMG 199 P. acnes Poster Final - Lauren and RachelleMIMG 199 P. acnes Poster Final - Lauren and Rachelle
MIMG 199 P. acnes Poster Final - Lauren and Rachelle
 
nikolakopoulou et al., 2006a
nikolakopoulou et al., 2006anikolakopoulou et al., 2006a
nikolakopoulou et al., 2006a
 

Similar to A SNP array for human population genetics studies

New generation Sequencing
New generation Sequencing New generation Sequencing
New generation Sequencing
Vijay Raj Yanamala
 
New Generation Sequencing Technologies: an overview
New Generation Sequencing Technologies: an overviewNew Generation Sequencing Technologies: an overview
New Generation Sequencing Technologies: an overview
Paolo Dametto
 
Theusch 2009. GWAS AP
Theusch 2009. GWAS APTheusch 2009. GWAS AP
Theusch 2009. GWAS AP
Yuri Cheung
 
Dna sequence of the mitochondrial hypervariable region ii (krings et al.)
Dna sequence of the mitochondrial hypervariable region ii (krings et al.)Dna sequence of the mitochondrial hypervariable region ii (krings et al.)
Dna sequence of the mitochondrial hypervariable region ii (krings et al.)
Kristian Pedersen
 
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
Affymetrix
 
The Origin of Ashkenazi Levites
The Origin of Ashkenazi Levites The Origin of Ashkenazi Levites
The Origin of Ashkenazi Levites
Family Tree DNA
 
Rivera valeria same
Rivera valeria sameRivera valeria same
Rivera valeria same
valrivera
 
Poster - determining the effects of tau on synaptic density in a mouse model ...
Poster - determining the effects of tau on synaptic density in a mouse model ...Poster - determining the effects of tau on synaptic density in a mouse model ...
Poster - determining the effects of tau on synaptic density in a mouse model ...
Shaun Croft, MScR
 
Arjun's Poster ACTUAL FINAL POSTER
Arjun's Poster ACTUAL FINAL POSTERArjun's Poster ACTUAL FINAL POSTER
Arjun's Poster ACTUAL FINAL POSTER
Arjun Mahadevan
 
SURCA 2016 poster
SURCA 2016 posterSURCA 2016 poster
SURCA 2016 poster
Mitchell Go
 
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Reid Robison
 
3 Afr. J. Biotech. Tariq%20et%20al
3  Afr. J. Biotech. Tariq%20et%20al3  Afr. J. Biotech. Tariq%20et%20al
3 Afr. J. Biotech. Tariq%20et%20al
Mohammad Tariq
 
Reference 1
Reference 1Reference 1
Reference 1
Sean Paul
 
Next Generation Sequencing - Prof. Frans Cremers
Next Generation Sequencing - Prof. Frans CremersNext Generation Sequencing - Prof. Frans Cremers
Next Generation Sequencing - Prof. Frans Cremers
Rahajeng Tunjungputri
 
ex-lab field report.docx
ex-lab field report.docxex-lab field report.docx
ex-lab field report.docx
Mesele Tilahun
 
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
t7260678
 
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
t7260678
 
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
Zach Rana
 
SIR WASIQ dna sequence ppt.pptx
SIR WASIQ dna sequence ppt.pptxSIR WASIQ dna sequence ppt.pptx
SIR WASIQ dna sequence ppt.pptx
SoniaAslam10
 
Predicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learningPredicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learning
Patricia Francis-Lyon
 

Similar to A SNP array for human population genetics studies (20)

New generation Sequencing
New generation Sequencing New generation Sequencing
New generation Sequencing
 
New Generation Sequencing Technologies: an overview
New Generation Sequencing Technologies: an overviewNew Generation Sequencing Technologies: an overview
New Generation Sequencing Technologies: an overview
 
Theusch 2009. GWAS AP
Theusch 2009. GWAS APTheusch 2009. GWAS AP
Theusch 2009. GWAS AP
 
Dna sequence of the mitochondrial hypervariable region ii (krings et al.)
Dna sequence of the mitochondrial hypervariable region ii (krings et al.)Dna sequence of the mitochondrial hypervariable region ii (krings et al.)
Dna sequence of the mitochondrial hypervariable region ii (krings et al.)
 
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
 
The Origin of Ashkenazi Levites
The Origin of Ashkenazi Levites The Origin of Ashkenazi Levites
The Origin of Ashkenazi Levites
 
Rivera valeria same
Rivera valeria sameRivera valeria same
Rivera valeria same
 
Poster - determining the effects of tau on synaptic density in a mouse model ...
Poster - determining the effects of tau on synaptic density in a mouse model ...Poster - determining the effects of tau on synaptic density in a mouse model ...
Poster - determining the effects of tau on synaptic density in a mouse model ...
 
Arjun's Poster ACTUAL FINAL POSTER
Arjun's Poster ACTUAL FINAL POSTERArjun's Poster ACTUAL FINAL POSTER
Arjun's Poster ACTUAL FINAL POSTER
 
SURCA 2016 poster
SURCA 2016 posterSURCA 2016 poster
SURCA 2016 poster
 
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
 
3 Afr. J. Biotech. Tariq%20et%20al
3  Afr. J. Biotech. Tariq%20et%20al3  Afr. J. Biotech. Tariq%20et%20al
3 Afr. J. Biotech. Tariq%20et%20al
 
Reference 1
Reference 1Reference 1
Reference 1
 
Next Generation Sequencing - Prof. Frans Cremers
Next Generation Sequencing - Prof. Frans CremersNext Generation Sequencing - Prof. Frans Cremers
Next Generation Sequencing - Prof. Frans Cremers
 
ex-lab field report.docx
ex-lab field report.docxex-lab field report.docx
ex-lab field report.docx
 
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
 
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
 
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
 
SIR WASIQ dna sequence ppt.pptx
SIR WASIQ dna sequence ppt.pptxSIR WASIQ dna sequence ppt.pptx
SIR WASIQ dna sequence ppt.pptx
 
Predicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learningPredicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learning
 

More from Affymetrix

Axiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array PlateAxiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array Plate
Affymetrix
 
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate SetAxiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Affymetrix
 
Axiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array PlateAxiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array Plate
Affymetrix
 
Axiom® Biobank Genotyping Arrays
Axiom® Biobank Genotyping ArraysAxiom® Biobank Genotyping Arrays
Axiom® Biobank Genotyping Arrays
Affymetrix
 
Axiom® Genome-Wide AFR 1 Array World Array 3
Axiom®  Genome-Wide AFR 1 Array World Array 3Axiom®  Genome-Wide AFR 1 Array World Array 3
Axiom® Genome-Wide AFR 1 Array World Array 3
Affymetrix
 
FFPE Applications Solutions brochure
FFPE Applications Solutions brochureFFPE Applications Solutions brochure
FFPE Applications Solutions brochure
Affymetrix
 
Solutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochureSolutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochure
Affymetrix
 
Download our publication archive list
Download our publication archive listDownload our publication archive list
Download our publication archive list
Affymetrix
 
Integrating arrays and RNA-Seq
Integrating arrays and RNA-Seq Integrating arrays and RNA-Seq
Integrating arrays and RNA-Seq
Affymetrix
 
Designing GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverageDesigning GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverage
Affymetrix
 
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Affymetrix
 
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
Affymetrix
 
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix
 
Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...
Affymetrix
 
Mitigating genotyping application note
Mitigating genotyping application noteMitigating genotyping application note
Mitigating genotyping application note
Affymetrix
 
Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...
Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...
Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...
Affymetrix
 
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom ArraysAllopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Affymetrix
 
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
Affymetrix
 
Best practices for genotyping analysis of plant and animal genomes with Affym...
Best practices for genotyping analysis of plant and animal genomes with Affym...Best practices for genotyping analysis of plant and animal genomes with Affym...
Best practices for genotyping analysis of plant and animal genomes with Affym...
Affymetrix
 
SNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping SolutionSNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping Solution
Affymetrix
 

More from Affymetrix (20)

Axiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array PlateAxiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array Plate
 
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate SetAxiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
 
Axiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array PlateAxiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array Plate
 
Axiom® Biobank Genotyping Arrays
Axiom® Biobank Genotyping ArraysAxiom® Biobank Genotyping Arrays
Axiom® Biobank Genotyping Arrays
 
Axiom® Genome-Wide AFR 1 Array World Array 3
Axiom®  Genome-Wide AFR 1 Array World Array 3Axiom®  Genome-Wide AFR 1 Array World Array 3
Axiom® Genome-Wide AFR 1 Array World Array 3
 
FFPE Applications Solutions brochure
FFPE Applications Solutions brochureFFPE Applications Solutions brochure
FFPE Applications Solutions brochure
 
Solutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochureSolutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochure
 
Download our publication archive list
Download our publication archive listDownload our publication archive list
Download our publication archive list
 
Integrating arrays and RNA-Seq
Integrating arrays and RNA-Seq Integrating arrays and RNA-Seq
Integrating arrays and RNA-Seq
 
Designing GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverageDesigning GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverage
 
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
 
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
 
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
 
Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...
 
Mitigating genotyping application note
Mitigating genotyping application noteMitigating genotyping application note
Mitigating genotyping application note
 
Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...
Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...
Statistical methods for off-target variant genotyping on Affymetrix' Axiom Ar...
 
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom ArraysAllopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
 
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
 
Best practices for genotyping analysis of plant and animal genomes with Affym...
Best practices for genotyping analysis of plant and animal genomes with Affym...Best practices for genotyping analysis of plant and animal genomes with Affym...
Best practices for genotyping analysis of plant and animal genomes with Affym...
 
SNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping SolutionSNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping Solution
 

Recently uploaded

Nucleophilic Addition of carbonyl compounds.pptx
Nucleophilic Addition of carbonyl  compounds.pptxNucleophilic Addition of carbonyl  compounds.pptx
Nucleophilic Addition of carbonyl compounds.pptx
SSR02
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
kejapriya1
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
IshaGoswami9
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
HongcNguyn6
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
RASHMI M G
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
moosaasad1975
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptxBREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
RASHMI M G
 
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
David Osipyan
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 

Recently uploaded (20)

Nucleophilic Addition of carbonyl compounds.pptx
Nucleophilic Addition of carbonyl  compounds.pptxNucleophilic Addition of carbonyl  compounds.pptx
Nucleophilic Addition of carbonyl compounds.pptx
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptxBREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
 
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 

A SNP array for human population genetics studies

  • 1. Applica t i o n Brief Abstract High-density arrays that simultaneously genotype hundreds of thousands of single nucleotide polymorphisms (SNPs) can theoretically be a powerful tool for population genetics studies, whether for learning about human history or natural selection. However, SNP arrays designed for medical genetics have been limited in their utility for population genetics because the polymorphisms on the arrays were discovered and selected for inclusion in a complicated manner that is difficult to model. This “ascertainment bias” has severely limited the types of population genetic analyses that can be carried out with SNP arrays. Here we report on the first SNP array for human population genetics studies, developed in collaboration with Harvard Medical School. A total of 1.81 million candidate SNPs, all from genome locations covered by sequencing reads from Neanderthals, Denisovans, and chimpanzees, were ascertained using a simple SNP discovery procedure first described by Keinan, et al., 2007.1 The most important ascertainment involved using whole-genome shotgun sequencing data to discover differences between the two chromosomes carried by individuals from 11 populations (San, Yoruba, Mbuti, French, Sardinian, Han, Cambodian, Mongolian, Karitiana, Papuan, and Bougainville). This resulted in 13 panels of SNPs ascertained in simple ways that are appropriate for population genetic analysis (Figure 1). An Axiom® genotyping screen was carried to validate 1.35 million SNPs on the Axiom platform (Table 2). A total of 542,399 SNPs were selected that produced genotypes that passed rigorous quality thresholds appropriate for inclusion in a commercial array design. To facilitate joint analysis with other data sets that have been collected on diverse populations, the final array contains an additional approximately 87,044 SNPs that overlap between the Affymetrix® Genome-Wide Human SNP Array 6.0 and the Illumina® 650Y Array. We genotyped 943 unrelated samples from 53 populations in the CEPH-Human Genome Diversity Panel (Cann, et al., 20022), and have made the data freely available in the CEPH-HGDP database (Table 1). In addition, we report on the genotyping of approximately 400 samples from 70 other diverse worldwide populations. Preliminary analyses demonstrate the promise of this SNP array for population genetics. As examples, we use the data to provide a new line of evidence for gene flow from Neanderthals into modern humans (Figure 2). 1 A SNP array for human population genetics studies Yontao Lu1 , Teri Genschoreck1 , Swapan Mallick2,3 , Amy Ollmann1 , Nick Patterson3 , Yiping Zhan1 , Teresa Webster1 , David Reich2,3 1. Affymetrix, Inc., 3420 Central Expressway, Santa Clara, CA 95051 2. Harvard Medical School Department of Genetics, Boston, MA 02115 3. Broad Institute of Harvard and MIT, Cambridge, MA 02142
  • 2. 2 1.81M candidate SNPs Sequencing data from individual N SNPs ascertained from individual N 1.35M candidate SNPs to validate B. Phase I SNP validation 6 replicates of individual N SNPs passed 6 replicates of a chimp 6 replicates of a bonobo 600K Phase I validated SNPs SNPs identified in individual N C. Phase II SNP validation and the final design of the array 542K Phase II validated SNPs 600K Phase I validated SNPs 952 HGDP-CEPH samples Compatibility SNPs 87K 84.7K SNP6 and 650Y 2.1K chrY 259 chrMT Discover SNPs from individual N Map to PanTro2 Combine SNPs from 13 panels SNP selection criteria for validation Validated SNPs from 13 panels Additional Phase I validation criteria Phase I validation criteriaGenotype Genotype Figure 1: From candidate SNPs ascertained from 13 panels to the Axiom® Human Origins Array A. Select 1.35M of 1.81M candidate SNPs for validation Figure 1: An overview of the Axiom Human Origins Array development. A. Sequencing data were mapped to the PanTro2 reference genome, and heterozygous sites were identified. SNPs in Panels 1-12 were discovered using the strategy described by Keinan, et al.1 For Panel 13, SNPs were ascertained by considering one allele from a San individual and one allele from Denisova. SNPs whose ancestral and derived alleles were from Denisova and the San individual, respectively, were kept. B. Phase I SNP validation: Candidate SNPs discovered in Panel N (N:1-13) were tested by genotyping 18 samples (6 x chimpanzee, 6 x bonobo, and 6 x individual N). During Phase I validation, 1.35M of 1.81M unique candidate SNPs were tested. A total of 599K SNPs were validated in at least one of the 13 panels. C. Phase II SNP validation: SNPs that passed Phase I validation were further tested against 952 unrelated HGDP-CEPH samples to evaluate the performance. 943 samples and 542K SNPs passed the validation criteria. An additional 87K SNPs were added to the Axiom Human Origins Array to allow haplotype inference at mitochondrial DNA and the Y chromosome and to provide overlap with previous Affymetrix and Illumina genotyping arrays. The final design contains 629K SNPs. Details about SNP ascertainment, array design, and SNP validation are described in the technical design document.3
  • 3. 3 Table 2: SNPs available in the Axiom Human Origins Array Ascertainment panel Sample ID Sequencing depth No. of Candidate SNPs found No. of SNPs placed on screening arrays No. of Phase I validated SNPs No. of Phase II validated SNPs 1 – French HGDP00521 4.4 333,492 241,707 123,574 111,970 2 – Han HGDP00778 3.8 281,819 204,841 87,515 78,253 3 – Papuan1 HGDP00542 3.6 312,941 232,408 56,518 48,531 4 – San HGDP01029 5.9 548,189 401,052 185,066 163,313 5 – Yoruba HGDP00927 4.3 412,685 302,413 136,759 124,115 6 – Mbuti HGDP00456 1.2 39,178 28,532 14,435 12,162 7 – Karitiana HGDP00998 1.1 12,449 8,535 3,619 2,635 8 – Sardinian HGDP00665 1.3 40,826 29,358 15,260 12,922 9 – Melanesian HGDP00491 1.5 51,237 36,392 17,723 14,988 10 – Cambodian HGDP00711 1.7 53,542 38,399 20,129 16,987 11 – Mongolian HGDP01224 1.4 35,087 24,858 12,872 10,757 12 – Papuan2 HGDP00551 1.4 40,996 29,305 14,739 12,117 13 – Denisova-San Denisova- HGDP01029 – 418,841 308,210 166,422 151,435 Unique SNPs from 13 panels 1,812,990 1,354,003 599,175 542,399 Compatibility SNPs NA NA NA 87,044 Table 2: Numbers of SNPs at different stages for different panels. Genotypes of 629,443 SNPs across 943 unrelated HGDP-CEPH individuals (Harvard HGDP-CEPH Genotypes) were available at http://www.cephb.fr/en/hgdp/ Table 1: Performance metrics for the Axiom® Human Origins Array Performance metric results Overlap with other arrays Sample pass rate 98.90% Sample call rate 99.56% Sample concordance 99.86% Reproducibility 99.86% Table 1: Performance metrics for the Axiom Human Origins Array. Numbers are derived from genotyping 952 HGDP-CEPH samples and four HapMap samples (replicated 11 times each).
  • 4. A. Example allele frequency spectra B. Han spectrum if Denisova is derived, Neanderthal ancestral C. Han spectrum if Neanderthal is derived, Denisovan ancestral Denisova=Derived, Neanderthal=Ancestral Neanderthal=Derived, Denisova=Ancestral References 1. Keinan A., et al. Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nature Genetics 39(10):1251-5 (2007). 2. Cann H. M., et al. A human genome diversity cell line panel. Science 296(5566):261-2 (2002). 3. Lu Y., et al. Technical design document for a SNP array that is optimized for population genetics. ftp://ftp.cephb.fr/hgdp_supp10/8_12_2011_Technical_Array_Design_Document.pdf Figure 2: Allele frequency spectra A. Allele frequency spectra for six populations, based on ascertaining in two chromosomes from an individual in that population. The highest rate of rare alleles is in Yoruba, reflecting large population sizes in this group, whereas Karitiana have the lowest rate of rare alleles, indicating strong bottlenecks. B. Han allele frequency spectrum at sites where Denisova carries the derived allele and Neanderthal is ancestral. We observe the upside-down U expected if these were polymorphic in the ancestral population. C. Han allele frequency spectrum where Neanderthal carries the derived allele and Denisova is ancestral. We see an excess of low-frequency alleles compared with B, which can only be explained by recent gene flow from Neanderthals into modern humans. Figure 2: Preliminary results Affymetrix, Inc. Tel: +1-888-362-2447  Affymetrix UK Ltd. Tel: +44-(0)-1628-552550  Affymetrix Japan K.K. Tel: +81-(0)3-6430-4020 Panomics Solutions Tel: +1-877-726-6642 panomics.affymetrix.com  USB Products Tel: +1-800-321-9322 usb.affymetrix.com www.affymetrix.com Please visit our website for international distributor contact information. “For Research Use Only. Not for use in diagnostic procedures.” P/N DNA01075 Rev. 1 ©Affymetrix, Inc. All rights reserved. Affymetrix® , Axiom® , Command Console® , CytoScan® , DMET™ , GeneAtlas® , GeneChip® , GeneChip-compatible™ , GeneTitan® , Genotyping Console™ , myDesign™ , NetAffx® , OncoScan™ , Powered by Affymetrix™ , PrimeView® , Procarta® , and QuantiGene® are trademarks or registered trademarks of Affymetrix, Inc. All other trademarks are the property of their respective owners. Products may be covered by one or more of the following patents: U.S. Patent Nos. 5,445,934; 5,744,305; 5,945,334; 6,140,044; 6,399,365; 6,420,169; 6,551,817; 6,733,977; 7,629,164; 7,790,389 and D430,024 and other U.S. or foreign patents. Products are manufactured and sold under license from OGT under 5,700,637 and 6,054,270. 4