Thermo Fisher Scientific • 5791 Van Allen Way • Carlsbad, CA 92008 • www.lifetechnologies.com 
Yuan-Chieh Ku1, Meredith L. Carpenter2, Martin Sikora2, Hannes Schroeder3, Clarence C. Lee1, Christopher Davies1, M. Thomas P. Gilbert3, 
Carlos D. Bustamante2, Gavin D. Meredith1 
1. Thermo Fisher Scientific, South San Francisco, CA, USA; 2. Stanford University, Stanford, CA, USA; 3. Centre for GeoGenetics, Copenhagen, 
Denmark. 
Decoding ancient Bulgarian DNA with semiconductor-based sequencing 
ABSTRACT 
With the development of next-generation 
sequencing (NGS) technology, the field of hominin 
paleogenetics has shifted from studying specific 
DNA markers to revealing whole-genome 
information. However, ancient DNA of interest is 
usually highly fragmented, so we developed an NGS 
library preparation protocol optimized to capture 
short DNA fragments (40 bp to 200 bp). The improved 
workflow includes column-based DNA purification 
and concentration and automated gel-based 
size selection. This workflow permitted production 
of “shotgun” genomic libraries from very limited 
input DNA (6 ng to 39 ng). Methods that permit the 
use of such low-input, degraded DNA enable the 
partitioning of exceedingly rare samples into 
multiple analytical workflows. 
Data from two orthogonal sequencing platforms for 
these ancient Bulgarian samples demonstrated 
very similar base-substitution profiles with C>T and 
G>A variants accounting for ~75-80% of all SNPs 
called in both datasets. With such orthogonal 
verification, we expect to be able to reduce the 
false positive rate and generate a “truth” list of 
SNPs that will enhance our understanding of 
ancient population genomics and migrations. In 
summary, we have demonstrated a library 
preparation and semiconductor-based NGS 
workflow that is applicable for processing 
contaminated and degraded samples and can be 
used for ancient DNA research. 
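The share of SNP calls that fall into the C>T and G>A classes can be computed with a few lines of code. The sketch below is only an illustration of that arithmetic: the function name `damage_fraction` and the toy call list are assumptions, not part of the poster's pipeline or data.

```python
from collections import Counter

# Substitution classes commonly associated with cytosine deamination
# in ancient DNA (an assumption of this sketch, stated for clarity).
DAMAGE_CLASSES = {("C", "T"), ("G", "A")}


def damage_fraction(snps):
    """Return the fraction of (ref, alt) SNP calls that are C>T or G>A."""
    counts = Counter((ref.upper(), alt.upper()) for ref, alt in snps)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    damaged = sum(n for pair, n in counts.items() if pair in DAMAGE_CLASSES)
    return damaged / total


# Hypothetical toy calls, not data from the poster.
toy_calls = [
    ("C", "T"), ("G", "A"), ("C", "T"), ("A", "G"),
    ("G", "A"), ("C", "T"), ("T", "C"), ("G", "A"),
]
print(damage_fraction(toy_calls))  # 6 of 8 calls are C>T or G>A -> 0.75
```

Running the same calculation on each platform's call set would give the two percentages that the abstract compares.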
Figure 1. Whole-genome in-solution 
capture 
INTRODUCTION 
Whole-genome in-solution capture (WISC) is an 
unbiased way to increase the proportion of 
endogenous DNA in ancient DNA samples. Human 
genomic “bait” libraries were created from a 
modern reference individual. These bait libraries 
were constructed with adapters containing a T7 RNA 
polymerase promoter, which allows in vitro 
transcription to generate RNA baits covering the 
entire human genome. These baits were then 
hybridized to ancient DNA libraries in solution and 
pulled down with magnetic streptavidin-coated 
beads. Captured endogenous human DNA was 
then eluted and amplified for sequencing. 
RESULTS 
Figure 2. Ion Torrent low input 
fragment library workflow 
Small amounts of highly degraded 
ancient DNA were end 
repaired and purified with Zymo DNA 
Clean & Concentrator™. These 
samples were then adapter ligated 
and purified with AMPure® XP beads 
to remove excess adapter dimers. 
Libraries were nick translated and 
amplified for 8 cycles. Final libraries 
were size selected for fragments 
ranging from 100 bp to 400 bp by Pippin 
Prep™, followed by AMPure® XP bead 
cleanup. 
Figure 5. Mapping statistics comparisons 
High quality filtered reads were mapped to the human 
genome. Detected SNPs were cross-referenced with the 
1000 Genomes reference panel. 
Figure 4. DNA damage profiles comparison 
DNA substitution patterns of (A) Illumina pre-capture, 
(B) Illumina post-capture, (C) Ion Proton™ pre-capture, 
(D) Ion Proton™ post-capture. 
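Damage profiles of this kind are often summarised as the cumulative frequency of a substitution class along the read. The following is a minimal sketch of that calculation, assuming per-position substitution counts are already available; the function name and the example counts are hypothetical and do not come from the poster's data.

```python
from itertools import accumulate


def cumulative_frequencies(counts_by_position):
    """Return a leading 0 followed by the running share of all counts.

    counts_by_position[i] is the number of substitutions (for example C>T)
    observed at read position i.
    """
    total = sum(counts_by_position)
    if total == 0:
        # No substitutions observed: the curve stays flat at zero.
        return [0.0] * (len(counts_by_position) + 1)
    return [0.0] + [running / total for running in accumulate(counts_by_position)]


# Hypothetical counts concentrated at the start of the read.
example_counts = [40, 20, 10, 5, 5]
print(cumulative_frequencies(example_counts))
# [0.0, 0.5, 0.75, 0.875, 0.9375, 1.0]
```

A curve that rises steeply over the first few positions indicates that the substitutions cluster near the read ends.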
Sample (P192-1)        Total read number filtered   Mapped %   Duplicates %   Position covered   SNPs in 1000G
                       (MQ30, length >30)
Proton_precapture      312,510,164                  6%         56%            528,158,981        6,981,671
Proton_postcapture     51,644,781                   18%        80%            169,396,091        2,298,092
Illumina_precapture    705,234                      4.3%       9%             2,248,978          30,081
Illumina_postcapture   829,256                      23.2%      66%            50,003,999         67,221
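One way to read the table above is to compare the mapped percentage before and after capture. The sketch below computes that ratio for each platform. Treating "Mapped %" as a proxy for endogenous content is an assumption of this example, and `fold_enrichment` is a hypothetical helper name.

```python
# Mapped-read percentages as reported in the table above (sample P192-1).
MAPPED_PCT = {
    "Proton": {"pre": 6.0, "post": 18.0},
    "Illumina": {"pre": 4.3, "post": 23.2},
}


def fold_enrichment(pre_pct, post_pct):
    """Return the post-capture to pre-capture ratio of mapped percentages."""
    if pre_pct <= 0:
        raise ValueError("pre-capture percentage must be positive")
    return post_pct / pre_pct


for platform, pct in MAPPED_PCT.items():
    print(f"{platform}: {fold_enrichment(pct['pre'], pct['post']):.1f}x")
# Proton: 3.0x
# Illumina: 5.4x
```

The duplicate rate also rises after capture on both platforms, so the ratio of mapped percentages alone overstates the gain in unique reads.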
CONCLUSIONS 
1. We developed a library workflow specific for low-input, 
highly degraded ancient DNA samples on an Ion 
Torrent Proton™ system. 
2. Ion Proton™ and Illumina results show very similar 
substitution profiles. A less biased “truth” list can be 
generated by orthogonal verification. 
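The poster does not specify how the "truth" list would be built. One simple reading of orthogonal verification is to keep only the sites called with the same alternate allele on both platforms. The sketch below illustrates that idea; the function name, the data layout, and the calls themselves are all hypothetical.

```python
def concordant_snps(calls_a, calls_b):
    """Keep sites present in both call sets with the same alternate allele.

    Each call set maps (chromosome, position) to the called alternate allele.
    """
    return {site: alt for site, alt in calls_a.items() if calls_b.get(site) == alt}


# Hypothetical calls, not data from the poster.
proton_calls = {("chr1", 1000): "T", ("chr1", 2000): "A", ("chr2", 500): "G"}
illumina_calls = {("chr1", 1000): "T", ("chr1", 2000): "G", ("chr3", 42): "C"}

truth = concordant_snps(proton_calls, illumina_calls)
print(truth)  # {('chr1', 1000): 'T'}
```

A strict intersection trades sensitivity for specificity: sites covered on only one platform are dropped even if the call is correct.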
Figure 6. Principal component analysis of 
Proton pre- and post-capture runs 
Detected SNPs that overlap with the HGDP 650K 
and population datasets from the Estonian Biocentre 
were used for principal component analysis. 
(A) Ion Proton™ pre-capture, (B) Ion Proton™ post-capture. 
For Research Use Only. Not for use in diagnostic procedures. 
© 2014 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the 
property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified. 
TaqMan is a registered trademark of Roche Molecular Systems, Inc., used under 
permission and license. DNA Clean & Concentrator is a trademark of Zymo Research. 
AMPure is a trademark of Beckman Coulter, Inc. 
Figure 2 workflow steps: Ancient DNA (6 to 39 ng) → End repair → Adapter ligation → Nick translation & PCR amplification → Pippin Prep size selection 
[Plots for Figures 3 and 4. Libraries shown: Index2_2x150.merge_and_trim.mappedonly.rmdup.LongerThan30.rg, P1_ALL.rmdup.LongerThan30.rg, Index2.merge_and_trim.hg19.mappedonly.rmdup.LongerThan30.rg. For each library, the panels show the single-end read length distribution (occurrences vs. read length), the single-end read length per strand (+ and − strands), and the cumulative frequencies of C>T and G>A substitutions by read position (+ and − strands).] 
Figure 3. Read length comparison 
Library size distribution of (A) Illumina pre-capture, 
(B) Illumina post-capture, (C) Ion Proton™ pre-capture, 
(D) Ion Proton™ post-capture. 
