The impact of next generation sequencing on human genetics
    Prof. dr. Frans P.M. Cremers
    Department of Human Genetics, Nijmegen, the Netherlands
    S1 student presentation, Cebior, Semarang, 25 July 2010
Milestones in molecular genetics
    1953: Watson, J. & Crick, F.: The double helix
    1977: Sanger, F. et al.: DNA sequencing
    1983: Mullis, K.B.: Polymerase chain reaction
    2005: Margulies, M. et al. and many others: Next generation sequencing
Sanger sequencing: technique
    [Figure: Gene X with exons 1 to 5; PCR amplification of an exon from the DNA template using primers; sequencing fragments selected on size on an ABI3730]
Sanger sequencing: costs
    48 electrophoresis capillaries
    500 nucleotides per capillary
    ~25.000 nucleotides per run
    Costs: € 5 per capillary = € 250 / 25.000 nt
    € 0.01/nt = Rp. 100/nt
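
The per-nucleotide figure on this slide follows from simple arithmetic. A minimal sketch, using the slide's capillary count, read length, and price; the exchange rate of Rp. 10.000 per euro is an assumption inferred from the slide's own conversion (€ 0.01 = Rp. 100), and all names are illustrative.

```python
# Figures taken from the slide; variable names are illustrative.
CAPILLARIES = 48          # electrophoresis capillaries per run
NT_PER_CAPILLARY = 500    # nucleotides read per capillary
EUR_PER_CAPILLARY = 5     # cost per capillary in euros
RP_PER_EUR = 10_000       # assumption: implied by EUR 0.01 = Rp. 100


def sanger_run_cost():
    """Return (nucleotides per run, euros per run, euros per nucleotide)."""
    nt_per_run = CAPILLARIES * NT_PER_CAPILLARY      # 24,000; the slide rounds to ~25,000
    eur_per_run = CAPILLARIES * EUR_PER_CAPILLARY    # 240; the slide rounds to 250
    return nt_per_run, eur_per_run, eur_per_run / nt_per_run


nt_per_run, eur_per_run, eur_per_nt = sanger_run_cost()
rp_per_nt = eur_per_nt * RP_PER_EUR
print(f"{nt_per_run} nt per run, EUR {eur_per_run} per run")
print(f"EUR {eur_per_nt:.2f} per nt, about Rp. {rp_per_nt:.0f} per nt")
```

The unrounded values (24.000 nt, € 240) give the same € 0.01 per nucleotide as the rounded ones on the slide.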
Sanger sequencing: applications
    Human Genetics:
        DNA diagnostics: sequencing known disease genes (e.g. cystic fibrosis, retinoblastoma)
        Searching for new genes: analysis of candidate genes in genetic linkage interval
Genes: on average 10 exons that encode the protein
    ATG: translation start codon (methionine)
    TAA, TGA, TAG: translation stop codons
    [Figure: exons of a gene and the encoded protein]
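
The start and stop codons on this slide are enough to delimit a coding sequence. A minimal sketch, assuming an already spliced (exon-only) DNA sequence given on the coding strand; the function name and the example sequence are made up for illustration.

```python
START_CODON = "ATG"                   # translation start (methionine)
STOP_CODONS = {"TAA", "TGA", "TAG"}   # translation stop


def first_open_reading_frame(seq):
    """Return the codons from the first ATG through the first in-frame stop codon.

    Returns an empty list if there is no ATG, or if no in-frame stop codon
    follows the first ATG. Later ATGs are not tried in this sketch.
    """
    seq = seq.upper()
    start = seq.find(START_CODON)
    if start == -1:
        return []
    codons = []
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            return codons
    return []


example = "CCATGCTTCGGCAATAGGG"   # hypothetical sequence
orf = first_open_reading_frame(example)
print(orf)
```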
Sanger sequencing: limitations when testing diseases with large genetic heterogeneity
    Disease                   # Genes  Sanger sequencing costs
    Hereditary breast cancer  2        € 500      Rp. 5.000.000
    Ataxia                    ~10      € 2.500    Rp. 25.000.000
    Hereditary blindness      ~100     € 25.000   Rp. 250.000.000
    Mental retardation        ~1000    € 250.000  Rp. 2.500.000.000
Next generation sequencing (NGS)
    DNA enrichment by array sequence capture:
    1. DNA fragmentation
    2. Hybridization to synthesized probes
    3. Stringent washing
    4. Elution & amplification
    5. Sequencing
NGS: Massively parallel sequencing (Roche 454)
    Library preparation
    Emulsion PCR
    Pyrosequencing
NGS: 1000-fold increase in output
    NGS: 1 million parallel reads x 500 bp per read = 500.000.000 nt
    20 x coverage needed
    Effective: 25.000.000 nt
    Sanger sequencing (ABI 3730): 25.000 nt
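
The 1000-fold figure can be reproduced from the numbers on this slide. A minimal sketch; the function and variable names are illustrative, and the Sanger output is the rounded per-run value used in the slides.

```python
def effective_output(reads, read_length_nt, coverage):
    """Unique nucleotides covered per run once each position is read `coverage` times."""
    raw_nt = reads * read_length_nt
    return raw_nt // coverage


ngs_effective_nt = effective_output(reads=1_000_000, read_length_nt=500, coverage=20)
sanger_nt = 25_000   # ABI 3730, rounded per-run output from the slides
fold_increase = ngs_effective_nt // sanger_nt

print(ngs_effective_nt, fold_increase)
```

The coverage divisor is what separates the raw 500 million nucleotides from the 25 million that count as reliably sequenced.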
NGS: 100-fold cheaper
    NGS: € 2.500 / 25.000.000 nt = € 0.0001 / nt (Rp. 1 / nt)
    Sanger sequencing: € 0.01 / nt (Rp. 100 / nt)
Molecular Diagnostics: Sanger sequencing vs NGS
    Disease                   # Genes  Sanger sequencing costs       NGS costs
    Hereditary breast cancer  2        € 500      Rp. 5.000.000      € 5      Rp. 50.000
    Ataxia                    ~10      € 2.500    Rp. 25.000.000     € 25     Rp. 250.000
    Hereditary blindness      ~100     € 25.000   Rp. 250.000.000    € 250    Rp. 2.500.000
    Mental retardation        ~1000    € 250.000  Rp. 2.500.000.000  € 2.500  Rp. 25.000.000
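
The cost columns scale linearly with the number of genes. A minimal sketch under three assumptions read off the slides: € 250 per gene for Sanger sequencing (from € 500 for 2 genes), a 100-fold lower price for NGS, and Rp. 10.000 per euro. Names are illustrative and the approximate gene counts (~10, ~100, ~1000) are treated as exact.

```python
SANGER_EUR_PER_GENE = 250   # assumption: implied by EUR 500 for 2 genes
NGS_FOLD_CHEAPER = 100      # from the "NGS: 100-fold cheaper" slide
RP_PER_EUR = 10_000         # assumption: implied by EUR 500 = Rp. 5.000.000

DISEASES = [
    ("Hereditary breast cancer", 2),
    ("Ataxia", 10),
    ("Hereditary blindness", 100),
    ("Mental retardation", 1000),
]


def diagnostic_costs(n_genes):
    """Return (Sanger cost, NGS cost) in euros for a panel of n_genes genes."""
    sanger_eur = n_genes * SANGER_EUR_PER_GENE
    ngs_eur = sanger_eur // NGS_FOLD_CHEAPER
    return sanger_eur, ngs_eur


for disease, n_genes in DISEASES:
    sanger_eur, ngs_eur = diagnostic_costs(n_genes)
    print(f"{disease:<26}{n_genes:>5}  Sanger EUR {sanger_eur:>7}  "
          f"NGS EUR {ngs_eur:>5}  (Rp. {ngs_eur * RP_PER_EUR})")
```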
NGS, application 1: identifying defects in known disease genes
    Disease                   # Genes  NGS costs
    Hereditary breast cancer  2        € 5      Rp. 50.000
    Ataxia                    ~10      € 25     Rp. 250.000
    Hereditary blindness      ~100     € 250    Rp. 2.500.000
    Mental retardation        ~1000    € 2.500  Rp. 25.000.000
NGS, application 2: identifying genetic defect in genomic region
    Identification of a new gene for familial exudative vitreoretinopathy
    Nikopoulos K. et al. Am J Hum Genet. 86:240-247, 2010.
Familial exudative vitreoretinopathy
    Fundus: avascular zone, retinal detachment, “stretched/dragged” vasculature
    Visual acuity: ranging from normal to blindness
Linkage at chromosome 7
    [Figure: LOD score plots along chromosome 7]
Candidate gene analysis
    Interval from 109.7 Mb to 126.4 Mb: 340 genes
Candidate gene analysis
    PhyloP score: conservation of a nucleotide at a given sequence position among 44 vertebrate species.
    Position   Ref.    Variant  Total #   # of variant  % variant  Ref.        Variant     Gene     PhyloP
               allele  allele   of reads  reads         reads      amino acid  amino acid           score
    120216091  C       G        20        10            50         A           P           TSPAN12  5.32
    98870495   G       A        26        16            62         R           C           PTCD1    3.06
    100209410  G       A        15        8             53         R           H           ZAN      1.81
    99835402   C       T        13        6             46         P           L           PILRA    1.75
    113306419  C       T        15        6             40         S           N           PPP1R3A  1.05
    100473466  A       G        38        13            34         T           A           MUC17    0.60
    128099699  C       G        7         5             71         I           M           FAM71F2  0.42
    115411632  C       T        14        5             36         D           N           TFEC     -0.45
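
The table can be read as a simple prioritisation: compute the fraction of reads carrying the variant and rank the candidates by conservation. A minimal sketch using the values from the slide; ranking by PhyloP score alone is an illustrative simplification, not necessarily the full procedure used in the study, and the names are made up.

```python
# (position, ref allele, variant allele, total reads, variant reads, gene, PhyloP score)
# Values copied from the slide.
VARIANTS = [
    (120216091, "C", "G", 20, 10, "TSPAN12", 5.32),
    (98870495, "G", "A", 26, 16, "PTCD1", 3.06),
    (100209410, "G", "A", 15, 8, "ZAN", 1.81),
    (99835402, "C", "T", 13, 6, "PILRA", 1.75),
    (113306419, "C", "T", 15, 6, "PPP1R3A", 1.05),
    (100473466, "A", "G", 38, 13, "MUC17", 0.60),
    (128099699, "C", "G", 7, 5, "FAM71F2", 0.42),
    (115411632, "C", "T", 14, 5, "TFEC", -0.45),
]


def percent_variant(total_reads, variant_reads):
    """Percentage of reads that carry the variant allele, rounded to an integer."""
    return round(100 * variant_reads / total_reads)


# Rank candidates from most to least conserved position.
ranked = sorted(VARIANTS, key=lambda v: v[6], reverse=True)
for pos, ref, var, total, n_var, gene, phylop in ranked:
    print(f"{gene:<8} {pos} {ref}>{var} {percent_variant(total, n_var):>3}% PhyloP {phylop:5.2f}")

top_gene = ranked[0][5]
```

A heterozygous variant is expected in roughly half of the reads, which is why the percentage column sits next to the conservation score.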
NGS of 330 genes in 40 Mb region
Identification of a new gene for familial exudative vitreoretinopathy
    TSPAN12: c.709G>C, p.Ala237Pro
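
The nucleotide and protein descriptions of this mutation agree with each other, which a short calculation can confirm. A minimal sketch, assuming coding DNA positions are counted from the A of the ATG start codon; the function name is illustrative, and because the third base of codon 237 is not given on the slide, all four alanine codons are checked.

```python
# Standard genetic code: all codons for alanine and proline.
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
PRO_CODONS = {"CCT", "CCC", "CCA", "CCG"}


def codon_position(c_pos):
    """Map a 1-based coding DNA position to (codon number, base within codon 1..3)."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


codon_number, base_in_codon = codon_position(709)
print(codon_number, base_in_codon)

# A G>C change at the first base turns every alanine codon into a proline codon.
all_become_pro = all("C" + codon[1:] in PRO_CODONS for codon in ALA_CODONS)
print(all_become_pro)
```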
NGS, application 3: sequencing of whole genomes
    Analysis of natural variation of human genomes (“1000 Genomes Project”): http://www.1000genomes.org/page.php
    Sequencing of 802 eukaryotic species: http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi
    Sequencing of extinct species: Neanderthal, http://www.broadinstitute.org/
NGS, application 4: identifying genetic defects in whole genome
    June 2010; vol. 42, pp. 483-486
Schinzel-Giedion syndrome
    Severe mental retardation
    Distinctive facial features
    Multiple congenital abnormalities
    Neoplasias
    Sporadic occurrence (de novo mutations?)
Schinzel-Giedion syndrome
    Sequence analysis of all exons of 18.000 genes of 4 unrelated patients
    Exons constitute 1% of human genome
De novo SETBP1 mutations in 12 patients with Schinzel-Giedion syndrome
    [Figure: panels labelled Normal, with mutations marked *]
    Mutations: Asp868Asn, Gly870Ser, Ile871Thr
Milestones in molecular genetics
    1953: Watson, J. & Crick, F.: The double helix
    1977: Sanger, F. et al.: DNA sequencing
    1983: Mullis, K.B.: Polymerase chain reaction
    2005: Margulies, M. et al. and many others: Next generation sequencing
The impact of next generation sequencing on clinical genetics
    Predictions:
        In 2013, more than 90% of all human disease genes will have been identified.
        In 2013, sequence analysis of all human genes will cost € 500 (Rp. 5.000.000) per person.
The impact of next generation sequencing on clinical genetics
    Challenges:
        To understand the effect of DNA variants
        To understand the interaction between genetic defects that can explain intra- and interfamilial variability of the expression of human disease
        To make NGS technology available to developing countries
Acknowledgments
    Kostas Nikopoulos, Rob Collin, Ellen Blokland, Marijke Zonneveld, Anneke den Hollander, Kornelia Neveling, Nienke Wieskamp, Michael Kwint, Peer Arts, Christian Gillisen, Alex Hoischen, Michael Buckley, Hans Scheffer, Joris Veltman

Editor's Notes

  • #3 Many monogenic diseases have more than one possible underlying gene (complex monogenic disorders), and the number of genes varies a lot. For some diseases no routine diagnostics are available, since there are too many genes to be tested. There is a medical need to sequence more, but there are technical limitations in sequencing capacity and in front-end methods, i.e. enrichment. There is a definite need for novel (non-PCR-based) front-end methods. Capillary (Sanger) sequencing: 96/384-well, i.e. ~50-400 kb output/run.