The Next, Next Generation of Sequencing -
 From Semiconductor to Single Molecule

                  Justin H. Johnson
              Director of Bioinformatics
                       EdgeBio
                 Washington DC, USA
Agenda
• Who We Are
• NGS at 30K Feet
• The Challenges
  – Even Before We Get to the Platforms
  – When We Get to the Platforms
Who We Are
Life Tech Service Provider
Contract Research Division
• Five SOLiD4 sequencing platforms
• One Life Technologies 5500XL
• Two Ion Torrent PGMs
• Bioinformatics consulting on Illumina, 454, and PacBio
• Automation through Caliper Sciclone & Biomek FX
• Commercial partnerships with companies such as CLCBio,
  DNANexus and Genologics
• MD/PhD & Masters Level Scientists and Bioinformaticians
• IT Infrastructure of >100 CPUs and >100TB storage
Edge BioServ
Scientific Advisory Board
• Elaine Mardis, Ph.D.
  – Co-Director, Genome Sequencing Center, Washington University School of Medicine
• Steven Salzberg, Ph.D.
  – Director, Center for Bioinformatics and Computational Biology, University of Maryland
• Sam Levy, Ph.D.
  – Director of Genome Sciences, Scripps Translational Science Institute, Scripps Genomic Medicine
• Gabor Marth, Ph.D.
  – Professor of Bioinformatics, Boston College
• Michael Zody, M.S.
  – Chief Technologist, Broad Institute
• Elliott Margulies, Ph.D.
  – Investigator, Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health
• Ken Dewar, Ph.D.
  – Assistant Professor, McGill University and Genome Quebec
NGS @ 30K Feet
Machines and Vendors
Obligatory NGS Exponential Growth Slide




Nature Biotechnology, Volume 26, Number 10, October 2008
Ultra High Throughput + Lower Cost = Broader Applications

• Genome
  – De Novo
  – Resequencing / Mutation Discovery & Profiling
  – Exome Sequencing
  – Copy Number Variation
  – Ancient DNA
• RNA-Seq / Whole Transcriptome
  – mRNA Expression & Discovery
  – Alternative Splicing
  – Allele-Specific Expression
  – microRNA Expression & Discovery
• Epigenome
  – Transcriptionally Active Sites
  – Protein-DNA Interactions
  – Methylation Analysis
• Metagenome
  – Microbial Diversity
  – Heterogeneous Samples
Challenges

Technical Expertise
Experimental Design Considerations
       Sequencing Platform in Use
       Choice of Library Construction
       Depth of coverage
       Re$ources
       Number of Replicates
       Number of Samples and Control
       Etc…
Challenges

Flexibility w/ Standards
Flexibility with Standards and Scale
• Then (CE) – The Norm
  – 10 Machines, 30 – 360 Days, 1 Project
• Now (Illumina/SOLiD/454) – Scale
  – 1 machine, 14 Days, 30 Projects
• Now (Ion Torrent) - Flexibility
  – 1 machine, 1 Day, 1 Project.
• Standardization of analysis (Details Later)
Challenges

Sample Preparation
Sample Sourcing for RNA Projects
– Blood: Large quantities of sample available, but
  with limited utility in transcriptome analysis
– Tissue: Needle biopsy most common, but sample
  quantity very low
– Surgical section: Larger quantities available, but of
  limited utility; laser capture microdissection is needed
  to provide useful results, and then sample quantity is very low
– FFPE Slides: Very useful in clinical research, but
  sample amount and quality are low.
Unamplified vs Amplified
• Prostate Cancer Cell Line (VCaP) from CPDR
  – Well characterized
  – Differential Expression upon the addition of
    androgens.
  – Compared transcriptome from a single pool of
    RNA
     • Unamplified, ribosomal RNA depleted (RiboMinus™)
     • Amplified, no ribosomal depletion required
     • Two Pipelines for analysis
Amplification Gives Different Results
• Gene Expression in Unstimulated Cells



                  14,075
Spearman’s Correlation from 2 Pipelines
(+ / - : with / without androgen)

Pipeline A
                     Unamplified +   Unamplified -   Amplified +   Amplified -
Unamplified +        …               0.930           0.904         0.892
Unamplified -        …               …               0.896         0.900
Amplified +          …               …               …             0.928
Amplified -          …               …               …             …

Pipeline B
                     Unamplified +   Unamplified -   Amplified +   Amplified -
Unamplified +        …               0.853           0.757         0.701
Unamplified -        …               …               0.720         0.712
Amplified +          …               …               …             0.848
Amplified -          …               …               …             …
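For readers who want to reproduce this kind of comparison, here is a minimal sketch of Spearman's rank correlation between two expression profiles, written in plain Python. The expression values are hypothetical and are not the VCaP data; neither pipeline's actual implementation is shown on the slides.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene expression values for the same six genes.
unamplified = [120.0, 5.0, 33.0, 0.0, 980.0, 47.0]
amplified = [95.0, 9.0, 30.0, 1.0, 700.0, 20.0]
rho = spearman(unamplified, amplified)
print(f"Spearman rho = {rho:.3f}")
```

Because the statistic depends only on ranks, it is insensitive to the overall scale change that amplification introduces, so a drop in correlation points to genes changing order rather than to a uniform shift.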
Challenges

Sample Analysis
Exome Seq Ultimately About Variants

• Coverage
• Project Design
  – Cohorts
  – Cancer
• Algorithms a Solved Problem?
  – Single open source pipelines
  – Single commercial pipelines
  – Proprietary internal algorithms.
  – A mixture?
Ultimately Comes to Variation
• Coverage
• Project Design
  – Cohorts
  – Cancer
• Algorithms a Solved Problem?
  – Single open source pipelines
  – Single commercial pipelines
  – Proprietary internal algorithms.
  – A mixture?
Digging Deep with an Exome




Genetic variation in an individual human exome.
Ng PC, Levy S, Huang J, Stockwell TB, Walenz BP, Li K, Axelrod N, Busam DA, Strausberg RL, Venter JC.
PLoS Genet. 2008 Aug 15;4(8):e1000160.
Venter Genome - Algorithms
   • PLoS Genetics 2008, vol. 4, issue 8, e1000160
   • ~21K SNPs in exons (29 Mb targeted)
   • 36,206 expected SNPs for a 50 Mb kit
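The expected-SNP figure is consistent with scaling the published exon SNP count linearly by target size. The sketch below works through that arithmetic; the linear-scaling assumption is ours, inferred from the numbers, and the slides do not state how the figure was derived.

```python
# Figures taken from the slide.
snps_in_published_target = 21_000  # "~21K" SNPs in exons
published_target_mb = 29           # targeted region of the published exome
kit_target_mb = 50                 # targeted region of the capture kit

# Assumption: SNP count scales linearly with the number of targeted megabases.
expected_snps = snps_in_published_target * kit_target_mb / published_target_mb
print(int(expected_snps))
```

Truncated to a whole number, this gives the 36,206 quoted on the slide. Linear scaling is only a rough guide, since SNP density differs between coding exons and the flanking or untranslated sequence that larger kits tend to add.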
% Difference, Homozygous
Software   TP      TN    FP      FN      Sensitivity   Pos. pred. val.
B          1%      0%    -39%    -1%     1%            4%
A          31%     0%    88%     -41%    31%           -6%
C          -32%    0%    -49%    42%     -32%          2%

% Difference, Heterozygous
Software   TP      TN    FP      FN      Sensitivity   Pos. pred. val.
B          0%      0%    16%     0%      0%            -9%
A          -15%    0%    -44%    21%     -15%          16%
C          15%     0%    28%     -20%    15%           -7%
3 Tools and Associated SNP Counts
• Software A
  – 45,511
• Software B
  – 29,814
• Software C
  – 40,964
Software B v. Software A (Venn diagram, not to scale)
• B total: 29,814
• A total: 45,511
• B only: 8,564
• Shared: 21,250
• A only: 24,261
• Union: 54,075
• Intersection: 21,250
Software B v. Software C (Venn diagram)
• B total: 29,814
• C total: 40,964
• B only: 6,358
• Shared: 23,456
• C only: 17,508
• Union: 47,322
• Intersection: 23,456
Software A v. Software C (Venn diagram)
• A total: 45,511
• C total: 40,964
• A only: 14,738
• Shared: 30,773
• C only: 10,191
• Union: 55,702
• Intersection: 30,773
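All three tools (Venn diagram)
• B total: 29,814; A total: 45,511; C total: 40,964
• B only: 4,750
• A only: 13,130
• C only: 6,377
• B and A only: 1,608
• B and C only: 3,814
• A and C only: 11,131
• All three: 19,642
• Union: 60,452
• Intersection: 19,642
• Voting Scheme (2/3): 36,195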
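The union, intersection and 2-of-3 voting counts can be sketched with ordinary set operations. The variant keys below are hypothetical placeholders; the region counts at the end are the ones shown in the three-way diagram, used only to check the slide's arithmetic.

```python
from collections import Counter


def consensus_calls(call_sets, min_votes=2):
    """Return variants reported by at least min_votes of the callers."""
    votes = Counter()
    for calls in call_sets:
        votes.update(set(calls))  # each caller votes at most once per variant
    return {variant for variant, n in votes.items() if n >= min_votes}


# Hypothetical calls keyed by (chrom, pos, ref, alt).
a = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")}
b = {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")}
c = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 7, "T", "C")}

union = a | b | c
intersection = a & b & c
voted = consensus_calls([a, b, c], min_votes=2)

# Region counts from the three-way diagram; the key names the tools sharing that region.
regions = {"B": 4750, "AB": 1608, "A": 13130, "ABC": 19642,
           "BC": 3814, "AC": 11131, "C": 6377}
slide_union = sum(regions.values())
slide_voted = sum(n for tools, n in regions.items() if len(tools) >= 2)
print(slide_union, slide_voted)
```

Summing every region reproduces the union of 60,452, and summing the regions shared by two or more tools reproduces the 2/3 voting count of 36,195.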
Challenges

Platforms
The Weigh-In…
                  Yield/Day           Read Error Rates   Read Lengths
Illumina MiSeq    2.0 Gb              1.3% (V4)          150
Ion Torrent PGM   0.5 Gb (316 Chip)   1.5% (316 Chip)    120 - 240
PacBio RS         3.0 Gb              2-15%              430-2900




• Illumina and PacBio numbers from Vendor Sequencing
• Ion Torrent from EdgeBio Sequencing
Illumina MiSeq
Mid-Range Length, Accurate Reads, Large Throughput
• All Resequencing
• All De novo Applications
• Transcriptome
• Methylation
Ion Torrent PGM
Long, Mostly Accurate Reads in 2.5 Hours
• Microbial & Viral Resequencing
• Microbial & Viral De novo Applications
• Eukaryotic Amplicon Sequencing
• Metagenomics
  – WGS
  – 16S Surveys
PacBio RS
Ultra Long, Less Accurate Reads & Rapid Sequence
• Microbial & Viral De novo Applications
• Structural Variation / Haplotyping
Ion Torrent PGM

Name               Total # Reads   Mean Read Length   Longest Read   Total (Mbp)   Q20 Mb   A20 Mean Read Length
HG19-01            2,660,176       139                203            369.91        124.00   74
HG19-02            2,321,405       121                202            281.43        116.43   75
HG19-03            2,471,922       134                203            331.54        124.17   77
Microbe (37% GC)   2,869,789       122                202            350.23        160.48   82
Microbe (30% GC)   2,866,851       122                202            350.16        141.31   81
Ion Torrent PGM
                              DH10B Mapping   DH10B Denovo
Total # Reads                 1,384,863       1,384,863
# Aligned / Assembled Reads   1,334,138       1,335,604
% Aligned / Assembled Reads   96.34%          96.44%
# Contigs                     90              216
N50 Contig                    107,749         42,499
Largest Contig                326,368         146,899
Percent of Aligned Genome
  Covered (AQ40)              99.51%          99.53%
Consensus Accuracy            99.97%          1.73%



On Similar Illumina Data Set
• Normalizing for coverage and removing Paired Ends
• N50 of 94,926 and Largest Contig of 236,274
• Removing normalization improved numbers
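Since the assembly comparison rests on N50, a short sketch of the usual definition may help: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. The contig lengths below are hypothetical, not the DH10B assemblies.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical assembly: total 290 bases, so the halfway point is 145.
example_n50 = n50([100, 80, 50, 30, 20, 10])
print(example_n50)
```

N50 rewards a few long contigs and says nothing about correctness, which is why the slides pair it with consensus accuracy and genome coverage.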
Why the Difference?



Quality?
Quality?




Q-Q plots of the DH10B Ion Torrent 316 chip data expected vs empirical quality
before recalibration (left) and after recalibration (right).
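A plot of expected versus empirical quality can be built by grouping aligned bases by their reported quality score and converting the observed mismatch rate in each group back to the Phred scale. The sketch below assumes the input is a list of (reported quality, mismatch flag) pairs already extracted from an alignment; that input format and the add-one smoothing are illustrative choices, not the method used for the slides.

```python
import math
from collections import defaultdict


def phred(error_rate):
    """Convert an error probability to a Phred-scaled quality."""
    return -10 * math.log10(error_rate)


def empirical_quality(observations):
    """Map each reported quality to the quality implied by observed mismatches.

    observations: iterable of (reported_q, is_mismatch) pairs, one per aligned base.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for reported_q, is_mismatch in observations:
        totals[reported_q] += 1
        errors[reported_q] += int(is_mismatch)
    # Add-one smoothing (an assumption) keeps zero-mismatch bins finite.
    return {q: phred((errors[q] + 1) / (totals[q] + 2)) for q in sorted(totals)}


# Hypothetical data: 1,000 bases reported at Q20 and 1,000 at Q30,
# each bin containing 10 mismatches (a 1% observed error rate).
observations = [(20, i < 10) for i in range(1000)]
observations += [(30, i < 10) for i in range(1000)]
result = empirical_quality(observations)
for reported_q, observed_q in result.items():
    print(reported_q, round(observed_q, 1))
```

In this toy input the Q20 bin sits close to the diagonal, while the Q30 bin falls well below it, which is the pattern recalibration is meant to correct.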
Quality




Q-Q plots of DH10B MiSeq data expected vs empirical quality before recalibration
(left) and after recalibration (right).
Empirical Quality
Empirical Quality - Long Reads
Then Why?
• De Bruijn graph assemblers are adversely affected by the
  more frequent INDEL errors characteristic of Ion Torrent
• Reads with higher average quality are less
  abundant in Ion Torrent data
Does this matter in Resequencing?
• Depends on the tools used!
   – If you understand error profile, you can correct for it…
• Ran a simulated DH10B mutation experiment
  1.   Make a mutated E. coli reference (fakemut)
  2.   Align data to the mutated reference (CLC, TMAP, or other mappers)
  3.   Calculate per-base coverage on the BAM file
       (genomeCoverageBed)
  4.   Run samtools/mpileup/vcffilter (or CLC SNP/INDEL calling) to
       call variants, using various settings to compare variant calling
  5.   Calculate false positives, true positives, and false negatives
  6.   Calculate the number of variants missed due to low coverage
  7.   Calculate PPV and corrected sensitivity
  8.   Graph PPV and corrected sensitivity
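Steps 5 through 7 can be sketched as set arithmetic over variant positions. The sketch assumes that "corrected sensitivity" means sensitivity computed after excluding true variants that were missed at low-coverage positions; the slides do not define the term, and the `min_depth` threshold of 10 and the example positions are hypothetical.

```python
def evaluate_calls(truth, called, depth, min_depth=10):
    """Compare called variant positions with the simulated truth set.

    truth, called: sets of variant positions.
    depth: dict mapping position -> per-base coverage (as from step 3).
    min_depth: hypothetical threshold below which a miss is blamed on coverage.
    """
    true_pos = truth & called
    false_pos = called - truth
    false_neg = truth - called
    low_cov_missed = {pos for pos in false_neg if depth.get(pos, 0) < min_depth}

    nan = float("nan")
    callable_truth = len(truth) - len(low_cov_missed)
    return {
        "tp": len(true_pos),
        "fp": len(false_pos),
        "fn": len(false_neg),
        "missed_low_coverage": len(low_cov_missed),
        "ppv": len(true_pos) / len(called) if called else nan,
        "sensitivity": len(true_pos) / len(truth) if truth else nan,
        # Assumed definition: ignore true variants that coverage made uncallable.
        "corrected_sensitivity": len(true_pos) / callable_truth if callable_truth else nan,
    }


# Hypothetical example: five simulated mutations, four calls.
truth = {10, 20, 30, 40, 50}
called = {10, 20, 30, 60}
depth = {10: 40, 20: 35, 30: 50, 40: 3, 50: 25, 60: 30}
metrics = evaluate_calls(truth, called, depth)
print(metrics)
```

Here the miss at position 40 is attributed to coverage (depth 3) and dropped from the denominator, while the miss at position 50 (depth 25) still counts against the caller.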
Resequencing
    • Ion claims substitution issues with MiSeq [1]
    • Illumina claims INDEL issues with Ion [2]




1. http://www.iontorrent.com/lib/images/PDFs/co23743_pgm_app_note.pdf
2. http://www.illumina.com/Documents/products/appnotes/appnote_miseq_ecoli.pdf
Resequencing
Pipeline                                     Variants Identified   Specificity   Sensitivity   PPV
Ion/TMAP/SamTools (Mod)                      460                   100%          76.957%       97.676%
Ion/TMAP/SamTools (Default)                  459                   99.895%       91.939%       6.014% (~6500 False Negatives)
MiSeq/Eland/SamTools (Default – SNPs ONLY)   220                   99.99996%     95.91%        99.06%
MiSeq/CLC/SamTools (Default)                 459                   95.464%       99.998%       83.871% (~65 False Negatives)


MiSeq SubSampled on DH10B (TMAP/Samtools):
• 9 total variants identified
• 8 SNPs and 1 INDEL

Ion SubSampled on DH10B (TMAP/Samtools):
• 16 total variants identified
• 0 SNPs and 16 INDEL
Ion Data
[Chart: PPV and Sensitivity of Samtools Analyses. Y axis: 0% to 100%.
Series: Total PPV, SNPs PPV, INDELs PPV, Total Corrected Sensitivity,
SNPs Corrected Sensitivity, INDELs Corrected Sensitivity.
Settings compared: Default Samtools; Q4, h100, o20, e27, m1, H1;
Q14, h75, o20, e21, m4, H2; Q7, h50, o10, e17, m4, H1;
Q14, h50, o10, e17, m4, H1; Variant Calling; Q14, h50, o10, e17, m4, H2.]
MiSeq Data
[Chart: PPV and Sensitivity of Samtools Analyses of MiSeq Data. Y axis: 0% to 100%.
Series: Total PPV, SNPs PPV, INDELs PPV, Total Corrected Sensitivity,
SNPs Corrected Sensitivity, INDELs Corrected Sensitivity.
Settings compared: MiSeq CLC with Default Samtools; MiSeq CLC Map with
Variant Analysis; MiSeq TMAP Map with Variant Analysis.]
Resequencing Conclusion

Using appropriate aligners and variant callers, we show both
platforms have high accuracy, each with strengths and weaknesses…
What About PacBio?
• We have less experience with PacBio
• We (EdgeBio) think PacBio may have a niche,
  but given the large initial investment, we are waiting.
• Many conferences and posters – only results
  seen are for de novo sequencing and finishing
  (Broad).
• Will be here all week and would love to hear
  why you love it.
Take This Home
• There are many challenges before we even get
  to picking a platform
  – Technical Expertise
  – Standards in Prep and Analysis

  With Great NGS Power Comes Great Responsibility
Acknowledgements
• CPDR (Center for Prostate Disease Research) Collaboration
   – Shyh-Han Tan, Ph.D.

• EdgeBio Sequencing
   – Joy Adigun
   – Elyse Nagle
   – Jennifer Sheffield
   – Rossio Kersey
   – Ryan Mease
• EdgeBio IFX
   – John Seed
   – Anjana Varadarajan
   – David Jenkins
   – Phil Dagosto
   – Quang Tri Nguyen
Questions
   Twitter: @Bioinfo
jjohnson@edgebio.com

More Related Content

What's hot

DNA Sequencing from Single Cell
DNA Sequencing from Single CellDNA Sequencing from Single Cell
DNA Sequencing from Single CellQIAGEN
 
BioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing ProductsBioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing Productsbiochain
 
Next-generation sequencing course, part 1: technologies
Next-generation sequencing course, part 1: technologiesNext-generation sequencing course, part 1: technologies
Next-generation sequencing course, part 1: technologiesJan Aerts
 
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...eventi-ITBbari
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencingDayananda Salam
 
Data Management for Quantitative Biology - Data sources (Next generation tech...
Data Management for Quantitative Biology - Data sources (Next generation tech...Data Management for Quantitative Biology - Data sources (Next generation tech...
Data Management for Quantitative Biology - Data sources (Next generation tech...QBiC_Tue
 
Next-generation genomics: an integrative approach
Next-generation genomics: an integrative approachNext-generation genomics: an integrative approach
Next-generation genomics: an integrative approachHong ChangBum
 
Ngs microbiome
Ngs microbiomeNgs microbiome
Ngs microbiomejukais
 
A Comparison of NGS Platforms.
A Comparison of NGS Platforms.A Comparison of NGS Platforms.
A Comparison of NGS Platforms.mkim8
 
Expanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGSExpanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGSIntegrated DNA Technologies
 
Introduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) TechnologyIntroduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) TechnologyQIAGEN
 
Next Generation Sequencing (NGS)
Next Generation Sequencing (NGS)Next Generation Sequencing (NGS)
Next Generation Sequencing (NGS)LOGESWARAN KA
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowBrian Krueger
 
Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)Sebastian Schmeier
 
Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Li Shen
 
A decade into Next Generation Sequencing on marine non-model organisms: curre...
A decade into Next Generation Sequencing on marine non-model organisms: curre...A decade into Next Generation Sequencing on marine non-model organisms: curre...
A decade into Next Generation Sequencing on marine non-model organisms: curre...Alexander Jueterbock
 
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionJatinder Singh
 
Molecular QC: Interpreting your Bioinformatics Pipeline
Molecular QC: Interpreting your Bioinformatics PipelineMolecular QC: Interpreting your Bioinformatics Pipeline
Molecular QC: Interpreting your Bioinformatics PipelineCandy Smellie
 

What's hot (20)

DNA Sequencing from Single Cell
DNA Sequencing from Single CellDNA Sequencing from Single Cell
DNA Sequencing from Single Cell
 
BioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing ProductsBioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing Products
 
Ngs introduction
Ngs introductionNgs introduction
Ngs introduction
 
Next-generation sequencing course, part 1: technologies
Next-generation sequencing course, part 1: technologiesNext-generation sequencing course, part 1: technologies
Next-generation sequencing course, part 1: technologies
 
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
Data Management for Quantitative Biology - Data sources (Next generation tech...
Data Management for Quantitative Biology - Data sources (Next generation tech...Data Management for Quantitative Biology - Data sources (Next generation tech...
Data Management for Quantitative Biology - Data sources (Next generation tech...
 
Ngs intro_v6_public
 Ngs intro_v6_public Ngs intro_v6_public
Ngs intro_v6_public
 
Next-generation genomics: an integrative approach
Next-generation genomics: an integrative approachNext-generation genomics: an integrative approach
Next-generation genomics: an integrative approach
 
Ngs microbiome
Ngs microbiomeNgs microbiome
Ngs microbiome
 
A Comparison of NGS Platforms.
A Comparison of NGS Platforms.A Comparison of NGS Platforms.
A Comparison of NGS Platforms.
 
Expanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGSExpanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGS
 
Introduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) TechnologyIntroduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) Technology
 
Next Generation Sequencing (NGS)
Next Generation Sequencing (NGS)Next Generation Sequencing (NGS)
Next Generation Sequencing (NGS)
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can Know
 
Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)
 
Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2
 
A decade into Next Generation Sequencing on marine non-model organisms: curre...
A decade into Next Generation Sequencing on marine non-model organisms: curre...A decade into Next Generation Sequencing on marine non-model organisms: curre...
A decade into Next Generation Sequencing on marine non-model organisms: curre...
 
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential Expression
 
Molecular QC: Interpreting your Bioinformatics Pipeline
Molecular QC: Interpreting your Bioinformatics PipelineMolecular QC: Interpreting your Bioinformatics Pipeline
Molecular QC: Interpreting your Bioinformatics Pipeline
 

Viewers also liked

Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...Torsten Seemann
 
Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...
Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...
Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...QIAGEN
 
High-Throughput Sequencing
High-Throughput SequencingHigh-Throughput Sequencing
High-Throughput SequencingMark Pallen
 
Metagenomics sequencing
Metagenomics sequencingMetagenomics sequencing
Metagenomics sequencingcdgenomics525
 
Aug2015 deanna church analytical validation
Aug2015 deanna church analytical validationAug2015 deanna church analytical validation
Aug2015 deanna church analytical validationGenomeInABottle
 
Making Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and AnnotationsMaking Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and AnnotationsJoão André Carriço
 
The Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groupsThe Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groupsExternalEvents
 
Whole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiologyWhole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiologyPhilip Ashton
 
Genome Wide Methodologies and Future Perspectives
 Genome Wide Methodologies and Future Perspectives Genome Wide Methodologies and Future Perspectives
Genome Wide Methodologies and Future PerspectivesBrian Krueger
 
Whole Genome Sequencing (WGS): How significant is it for food safety?
Whole Genome Sequencing (WGS): How significant is it for food safety? Whole Genome Sequencing (WGS): How significant is it for food safety?
Whole Genome Sequencing (WGS): How significant is it for food safety? FAO
 
The Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome SequencingThe Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome SequencingEmiliano De Cristofaro
 
Toolbox for bacterial population analysis using NGS
Toolbox for bacterial population analysis using NGSToolbox for bacterial population analysis using NGS
Toolbox for bacterial population analysis using NGSMirko Rossi
 
What can we do with microbial WGS data? - t.seemann - mc gill summer 2016 - ...
What can we do with microbial WGS data?  - t.seemann - mc gill summer 2016 - ...What can we do with microbial WGS data?  - t.seemann - mc gill summer 2016 - ...
What can we do with microbial WGS data? - t.seemann - mc gill summer 2016 - ...Torsten Seemann
 
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015Torsten Seemann
 
Innovative NGS Library Construction Technology
Innovative NGS Library Construction TechnologyInnovative NGS Library Construction Technology
Innovative NGS Library Construction TechnologyQIAGEN
 
Top 30 Resources For Instructional Designers
Top 30 Resources For Instructional DesignersTop 30 Resources For Instructional Designers
Top 30 Resources For Instructional DesignersUpside Learning Solutions
 
7 weeks to 100 push ups..
7 weeks to 100 push ups.. 7 weeks to 100 push ups..
7 weeks to 100 push ups.. hellsingz
 

Viewers also liked (20)

Big data nebraska
Big data nebraskaBig data nebraska
Big data nebraska
 
Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...
 
Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...
Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...
Microbiome Isolation and DNA Enrichment Protocol: Pathogen Detection Webinar ...
 
High-Throughput Sequencing
High-Throughput SequencingHigh-Throughput Sequencing
High-Throughput Sequencing
 
Poster ESHG
Poster ESHGPoster ESHG
Poster ESHG
 
Metagenomics sequencing
Metagenomics sequencingMetagenomics sequencing
Metagenomics sequencing
 
Aug2015 deanna church analytical validation
Aug2015 deanna church analytical validationAug2015 deanna church analytical validation
Aug2015 deanna church analytical validation
 
Making Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and AnnotationsMaking Use of NGS Data: From Reads to Trees and Annotations
Making Use of NGS Data: From Reads to Trees and Annotations
 
Proposal for 2016 survey of WGS capacity in EU/EEA Member States
Proposal for 2016 survey of WGS capacity in EU/EEA Member StatesProposal for 2016 survey of WGS capacity in EU/EEA Member States
Proposal for 2016 survey of WGS capacity in EU/EEA Member States
 
The Global Micorbial Identifier (GMI) initiative - and its working groups





The Next, Next Generation of Sequencing - From Semiconductor to Single Molecule

  • 1. The Next, Next Generation of Sequencing - From Semiconductor to Single Molecule Justin H. Johnson Director of Bioinformatics EdgeBio Washington DC, USA
  • 2. Agenda • Who We Are • NGS at 30K • The Challenges – Even Before We Get to the Platforms – When We Get to the Platforms
  • 5. Contract Research Division • Five SOLiD4 sequencing platforms • One Life Technologies 5500XL • Two Ion Torrent PGMs • Bioinformatics consulting on Illumina, 454, and PacBio • Automation through Caliper Sciclone & Biomek FX • Commercial partnerships with companies such as CLCBio, DNANexus and Genologics • MD/PhD & Masters Level Scientists and Bioinformaticians • IT Infrastructure of >100 CPUs and >100TB storage
  • 6. Edge BioServ Scientific Advisory Board • Elaine Mardis, Ph.D., Co-Director, Genome Sequencing Center, Washington University School of Medicine • Steven Salzberg, Ph.D., Director, Center for Bioinformatics and Computational Biology, University of Maryland • Sam Levy, Ph.D., Director of Genome Sciences, Scripps Translational Science Institute, Scripps Genomic Medicine • Gabor Marth, Ph.D., Professor of Bioinformatics, Boston College • Michael Zody, M.S., Chief Technologist, Broad Institute • Elliott Margulies, Ph.D., Investigator, Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health • Ken Dewar, Ph.D., Assistant Professor, McGill University and Genome Quebec
  • 7. NGS @ 30K Feet
  • 9. Obligatory NGS Exponential Growth Slide Nature Biotechnology Volume 26 Number 10 October 2008
  • 10. Ultra High Throughput + Lower Cost = Broader Applications • RNA-Seq/Whole Transcriptome – mRNA Expression & Discovery – Alternative Splicing – Allele-Specific Expression – microRNA Expression & Discovery • Epigenome – Transcriptionally Active Sites – Protein-DNA Interactions – Methylation Analysis • Genome – De Novo – Resequencing/Mutation Discovery & Profiling – Exome Sequencing – Copy Number Variation • Metagenome – Microbial Diversity – Heterogeneous Samples – Ancient DNA
  • 13. Experimental Design Considerations • Sequencing Platform in Use • Choice of Library Construction • Depth of coverage • Re$ources • Number of Replicates • Number of Samples and Control • Etc…
  • 15. Flexibility with Standards and Scale • Then (CE) – The Norm – 10 Machines, 30 – 360 Days, 1 Project • Now (Illumina/SOLiD/454) – Scale – 1 machine, 14 Days, 30 Projects • Now (Ion Torrent) - Flexibility – 1 machine, 1 Day, 1 Project. • Standardization of analysis (Details Later)
  • 17. Sample Sourcing for RNA Projects – Blood: Large quantities of sample available, but with limited utility in transcriptome analysis – Tissue: Needle biopsy most common, but sample quantity very low – Surgical section: Larger quantities available, but limited utility; need laser capture microdissection to provide useful results, sample quantity very low – FFPE Slides: Very useful in clinical research but amount of sample and quality low.
  • 18. Unamplified vs Amplified • Prostate Cancer Cell Line (VCaP) from CPDR – Well characterized – Differential expression upon the addition of androgens – Compared transcriptome from a single pool of RNA • Unamplified, ribosomally depleted (Ribominus™) • Amplified, no ribosomal depletion required • Two pipelines for analysis
  • 19. Amplification Gives Different Results • Gene Expression in Unstimulated Cells 14,075
  • 20. Spearman’s Correlation from 2 Pipelines (+ / - = with / without androgen)
      Pipeline A        Unamplified +   Unamplified -   Amplified +   Amplified -
      Unamplified +     …               0.930           0.904         0.892
      Unamplified -     …               …               0.896         0.900
      Amplified +       …               …               …             0.928
      Amplified -       …               …               …             …

      Pipeline B        Unamplified +   Unamplified -   Amplified +   Amplified -
      Unamplified +     …               0.853           0.757         0.701
      Unamplified -     …               …               0.720         0.712
      Amplified +       …               …               …             0.848
      Amplified -       …               …               …             …
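The correlations on slide 20 are rank correlations between expression profiles. As a rough illustration only, here is a minimal pure-Python sketch of Spearman's coefficient, computed as the Pearson correlation of tie-averaged ranks. The function names and the toy input are assumptions made for this example; the deck does not say which software produced the reported values.

```python
from math import sqrt
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene expression values for two libraries.
unamplified = [120.0, 15.0, 0.0, 980.0, 42.0]
amplified = [200.0, 9.0, 1.0, 700.0, 60.0]
print(round(spearman(unamplified, amplified), 3))
```

Because only ranks enter the calculation, the coefficient is insensitive to the different dynamic ranges that amplified and unamplified libraries tend to have.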
  • 22. Exome Seq Ultimately About Variants • Coverage • Project Design – Cohorts – Cancer • Algorithms a Solved Problem? – Single open source pipelines – Single commercial pipelines – Proprietary internal algorithms. – A mixture?
  • 23. Ultimately Comes to Variation • Coverage • Project Design – Cohorts – Cancer • Algorithms Solved Problem? – Single open source pipelines – Single commercial pipelines – Proprietary internal algorithms. – A mixture?
  • 24. Digging Deep with an Exome Genetic variation in an individual human exome. Ng PC, Levy S, Huang J, Stockwell TB, Walenz BP, Li K, Axelrod N, Busam DA, Strausberg RL, Venter JC. PLoS Genet. 2008 Aug 15;4(8):e1000160.
  • 25. Venter Genome - Algorithms • PLoS Genetics 2008 vol 4 issue 8 e1000160 • ~21K SNP in exons (29MB Targeted) • 36,206 expected SNPs for 50MB Kit
      % Difference, Homozygous
      Software   TP      TN    FP      FN      Sensitivity   Pos.pred.val
      B          1%      0%    -39%    -1%     1%            4%
      A          31%     0%    88%     -41%    31%           -6%
      C          -32%    0%    -49%    42%     -32%          2%

      % Difference, Heterozygous
      Software   TP      TN    FP      FN      Sensitivity   Pos.pred.val
      B          0%      0%    16%     0%      0%            -9%
      A          -15%    0%    -44%    21%     -15%          16%
      C          15%     0%    28%     -20%    15%           -7%
  • 26. 3 Tools and Associated SNP Counts • Software A – 45,511 • Software B – 29,814 • Software C – 40,964
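  • 27. Software B v. Software A (Venn diagram, not to scale) • B: 29,814 • A: 45,511 • B only: 8,564 • Shared: 21,250 • A only: 24,261 • Union: 54,075 • Intersection: 21,250
  • 28. Software B v. Software C (Venn diagram) • B: 29,814 • C: 40,964 • B only: 6,358 • Shared: 23,456 • C only: 17,508 • Union: 47,322 • Intersection: 23,456
  • 29. Software A v. Software C (Venn diagram) • A: 45,511 • C: 40,964 • A only: 14,738 • Shared: 30,773 • C only: 10,191 • Union: 55,702 • Intersection: 30,773
  • 30. Software A, B and C (three-way Venn diagram) • B: 29,814 • A: 45,511 • C: 40,964 • B only: 4,750 • A and B only: 1,608 • A only: 13,130 • All three: 19,642 • B and C only: 3,814 • A and C only: 11,131 • C only: 6,377 • Union: 60,452 • Intersection: 19,642 • Voting Scheme (2/3): 36,195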
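The 2-of-3 voting scheme on slide 30 keeps a SNP when at least two of the three callers report it. A minimal sketch of that idea follows, assuming each call set is a Python set of `(chrom, pos, ref, alt)` tuples; the `vote` helper and the toy calls are hypothetical, and the last lines only re-check the slide's arithmetic from its Venn region counts.

```python
from collections import Counter
from typing import Hashable, Iterable


def vote(call_sets: Iterable[set[Hashable]], min_votes: int = 2) -> set[Hashable]:
    """Keep every variant reported by at least `min_votes` of the call sets."""
    counts: Counter = Counter()
    for calls in call_sets:
        counts.update(set(calls))  # each caller contributes one vote per variant
    return {variant for variant, n in counts.items() if n >= min_votes}


# Toy call sets keyed by (chrom, pos, ref, alt); values are made up.
software_a = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")}
software_b = {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")}
software_c = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 7, "T", "C")}

consensus = vote([software_a, software_b, software_c], min_votes=2)
print(sorted(consensus))

# Arithmetic check against the region counts reported on slide 30:
# pairwise-only regions plus the three-way intersection.
two_of_three = 1_608 + 3_814 + 11_131 + 19_642
print(two_of_three)  # 36195, matching the "Voting Scheme (2/3)" figure
```

Setting `min_votes=3` reduces this to the strict intersection, and `min_votes=1` to the union, so the same helper covers all three summary numbers on the slide.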
  • 32. The weight in…
      Platform            Yield/Day             Read Error Rates    Read Lengths
      Illumina MiSeq      2.0 Gb                1.3% (V4)           150
      Ion Torrent PGM     0.5 Gb (316 Chip)     1.5% (316 Chip)     120 - 240
      PacBio RS           3.0 Gb                2-15%               430-2900
    • Illumina and PacBio numbers from Vendor Sequencing • Ion Torrent from EdgeBio Sequencing
  • 33. Illumina MiSeq Mid-Range Length, Accurate Reads, Large Throughput • All Resequencing • All De novo Applications • Transcriptome • Methylation
  • 34. Ion Torrent PGM Long, Mostly Accurate Reads in 2.5 Hours • Microbial & Viral Resequencing • Microbial & Viral De novo Applications • Eukaryotic Amplicon Sequencing • Metagenomics – WGS – 16S Surveys
  • 35. Pac Bio RS Ultra Long, Less Accurate Reads & Rapid Sequence • Microbial & Viral De novo Applications • Structural Variation / Haplotyping
  • 36. Ion Torrent PGM
      Name               Total # Reads   Mean Read Length   Longest Read   Total # (Mbp)   Q20 Mb    A20 Mean Read Length
      HG19-01            2,660,176       139                203            369.91          124.00    74
      HG19-02            2,321,405       121                202            281.43          116.43    75
      HG19-03            2,471,922       134                203            331.54          124.17    77
      Microbe (37% GC)   2,869,789       122                202            350.23          160.48    82
      Microbe (30% GC)   2,866,851       122                202            350.16          141.31    81
  • 37. Ion Torrent PGM Percent of # Aligned / % Aligned / Aligned Total # # N50 Largest Consensus Name Assembled Assembled Genome Reads Contigs Contig Contig Accuracy Reads Reads Covered (AQ40) DH10B Mapping 1,384,863 1,334,138 96.34% 90 107,749 326,368 99.51% 99.97% DH10B Denovo 1,384,863 1,335,604 96.44% 216 42,499 146,899 99.53% 1.73% On Similar Illumina Data Set • Normalizing for coverage and removing Paired Ends • N50 of 94926 and Largest Contig of 236274 • Removing normalization improved numbers
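Slide 37 compares assemblies by N50 contig size. For readers unfamiliar with the metric, here is a minimal sketch of the usual definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembled bases. The `n50` helper and the toy contig lengths are illustrative assumptions, not the tool used for the slide.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # First contig at which the cumulative length covers half the assembly.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy assembly: total 54 bases, half is 27, reached at 10 + 9 + 8.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

Since N50 depends on the total assembled length, comparisons such as the one on the slide are only meaningful when coverage and read type are normalized first, which is what the slide's bullet points describe.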
  • 39. Quality? Q-Q plots of the DH10B Ion Torrent 316 chip data expected vs empirical quality before recalibration (left) and after recalibration (right).
  • 40. Quality Q-Q plots of DH10B MiSeq data expected vs empirical quality before recalibration (left) and after recalibration (right).
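The Q-Q plots on slides 39 and 40 compare the quality a base caller reports with the quality observed after alignment. A minimal sketch of how an empirical quality can be derived per reported-quality bin is shown below. The input layout (pairs of reported quality and a mismatch flag) and the add-one smoothing are assumptions made for this example; the deck does not state which recalibration tool was used.

```python
import math
from collections import defaultdict
from typing import Iterable


def phred(error_probability: float) -> float:
    """Phred scale: Q = -10 * log10(P_error)."""
    return -10.0 * math.log10(error_probability)


def empirical_quality(observations: Iterable[tuple[int, bool]]) -> dict[int, float]:
    """Map each reported quality to the quality implied by observed mismatches.

    `observations` yields (reported_quality, is_mismatch) for aligned bases
    at positions not known to be true variants.
    """
    totals: dict[int, int] = defaultdict(int)
    errors: dict[int, int] = defaultdict(int)
    for reported_q, is_mismatch in observations:
        totals[reported_q] += 1
        if is_mismatch:
            errors[reported_q] += 1
    result = {}
    for reported_q, n in totals.items():
        # Add-one smoothing (an assumption here) keeps zero-error bins finite.
        rate = (errors[reported_q] + 1) / (n + 2)
        result[reported_q] = phred(rate)
    return result


# Toy data: 998 bases reported at Q30, 9 of which mismatch the reference.
toy = [(30, True)] * 9 + [(30, False)] * 989
print(empirical_quality(toy))  # reported Q30 behaves like roughly Q20
```

Plotting reported quality against these empirical values gives the kind of Q-Q comparison shown on the slides; points below the diagonal indicate over-reported quality.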
  • 42. Empirical Quality - Long Reads
  • 43. Then Why? • De Bruijn Graphs adversely affected by more frequent INDEL characteristics of Ion Torrent • Higher Average Quality reads are less abundant in Ion Torrent
  • 44. Does this matter in Resequencing? • Depends on the tools used! – If you understand the error profile, you can correct for it… • Ran simulated DH10B mutation experiment:
      1. Make mutated E. coli reference (fakemut)
      2. Align data to mutated reference (CLC, TMAP, or other mappers)
      3. Calculate per-base coverage on the BAM file (genomeCoverageBed)
      4. Run samtools/mpileup/vcffilter (or CLC SNP/INDEL calling) to call variants, using various settings to compare variant calling
      5. Calculate false positives, true positives, and false negatives
      6. Calculate number of variants missed due to low coverage
      7. Calculate PPV and corrected sensitivity
      8. Graph PPV and corrected sensitivity
  • 45. Resequencing • Ion claims substitution issues with MiSeq [1] • Illumina claims INDEL issues with Ion [2] • [1] http://www.iontorrent.com/lib/images/PDFs/co23743_pgm_app_note.pdf • [2] http://www.illumina.com/Documents/products/appnotes/appnote_miseq_ecoli.pdf
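Steps 5 to 7 of the simulated mutation experiment can be sketched as simple set arithmetic. This is an illustration under stated assumptions: variants are represented as hashable keys, and "corrected sensitivity" is taken here to mean sensitivity after excluding false negatives that fall in low-coverage positions. The slides do not give the exact formula, so that definition and the `evaluate_calls` helper are hypothetical.

```python
from typing import Hashable


def evaluate_calls(
    truth: set[Hashable],
    called: set[Hashable],
    low_coverage: set[Hashable],
) -> dict[str, float]:
    """Score a call set against the simulated (known) mutations."""
    true_pos = truth & called
    false_pos = called - truth
    false_neg = truth - called
    # False negatives explained by insufficient coverage (step 6).
    missed_low_cov = false_neg & low_coverage
    callable_truth = len(truth) - len(missed_low_cov)
    return {
        "tp": len(true_pos),
        "fp": len(false_pos),
        "fn": len(false_neg),
        "missed_low_coverage": len(missed_low_cov),
        "ppv": len(true_pos) / len(called) if called else float("nan"),
        "sensitivity": len(true_pos) / len(truth) if truth else float("nan"),
        # Assumed definition: ignore truth sites that could not be covered.
        "corrected_sensitivity": (
            len(true_pos) / callable_truth if callable_truth else float("nan")
        ),
    }


# Toy example: 10 simulated mutations at positions 1..10.
truth = set(range(1, 11))
called = {1, 2, 3, 4, 5, 6, 7, 11}   # 7 true calls, 1 false call
low_coverage = {8, 9}                # positions below a coverage threshold
print(evaluate_calls(truth, called, low_coverage))
```

Separating the low-coverage misses from the rest makes it possible to tell whether a platform's false negatives come from the caller settings or simply from uneven coverage.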
  • 46. Resequencing
      Pipeline                                      Variants Identified   Specificity   Sensitivity   PPV
      Ion/TMAP/SamTools (Mod)                       460                   100%          76.957%       97.676%
      Ion/TMAP/SamTools (Default)                   459                   99.895%       91.939%       6.014% (~6500 False Negatives)
      MiSeq/Eland/SamTools (Default – SNPs ONLY)    220                   99.99996%     95.91%        99.06%
      MiSeq/CLC/SamTools (Default)                  459                   95.464%       99.998%       83.871% (~65 False Negatives)
    • MiSeq SubSampled on DH10B (TMAP/Samtools): 9 total variants identified, 8 SNPs and 1 INDEL • Ion SubSampled on DH10B (TMAP/Samtools): 16 total variants identified, 0 SNPs and 16 INDEL
  • 47. Ion Data PPV and Sensitivity of Samtools Analyses [Chart: y-axis 0% to 100%; series: Total PPV, SNPs PPV, INDELs PPV, Total Corrected Sensitivity, SNPs Corrected Sensitivity, INDELs Corrected Sensitivity; x-axis settings: Default Samtools Variant Calling • Q4, h100, o20, e27, m1, H1 • Q14, h75, o20, e21, m4, H2 • Q7, h50, o10, e17, m4, H1 • Q14, h50, o10, e17, m4, H1 • Q14, h50, o10, e17, m4, H2]
  • 48. MiSeq Data PPV and Sensitivity of Samtools Analyses of MiSeq Data [Chart: y-axis 0% to 100%; series: Total PPV, SNPs PPV, INDELs PPV, Total Corrected Sensitivity, SNPs Corrected Sensitivity, INDELs Corrected Sensitivity; x-axis: MiSeq CLC with Default Samtools • MiSeq CLC Map with Variant Analysis • MiSeq TMAP Map with Variant Analysis]
  • 49. Resequencing Conclusion Using appropriate aligners and variant callers we show both platforms have high accuracy, each with strengths and weaknesses…
  • 50. What About PacBio? • We have less experience with PacBio • We (EdgeBio) think PacBio may have a niche, but given the large initial investment, we are waiting. • Many conferences and posters – the only results seen are for de novo sequencing and finishing (Broad). • Will be here all week and would love to hear why you love it.
  • 51. Take This Home • There are many challenges before we even get to picking a platform – Technical Expertise – Standards in Prep and Analysis With Great NGS Power Comes Great Responsibility
  • 52. Acknowledgements • CPDR (Center for Prostate Disease Research) Collaboration – Shyh-Han Tan, Ph.D. EdgeBio Sequencing EdgeBio IFX Joy Adigun John Seed Elyse Nagle Anjana Varadarajan Jennifer Sheffield David Jenkins Rossio Kersey Phil Dagosto Ryan Mease Quang Tri Nguyen
  • 53. Questions Twitter: @Bioinfo jjohnson@edgebio.com