Marcel Dinger - Translating exome and whole genome sequencing to the clinic


Published on

Since sequencing the draft human genome in 2001, the number of diseases with known genetic basis has increased >50‐fold to over 3000. Despite this remarkable success, >2000 Mendelian disorders remain unsolved, and up to 70% of patients presenting at the clinic with genetic disorders remain undiagnosed. Clinical‐grade genome sequencing holds the dual promise of improving diagnostic rates, and empowering genetic research through the discovery of novel disease-associated variants. The long‐term research value of performing whole exome and genome sequencing in a diagnostic setting on thousands of individuals will offset the initially higher cost and complexity, than a targeted gene‐panel approach.
In late 2012, we established the Kinghorn Centre for Clinical Genomics (KCCG) with the aim of implementing genomic medicine in Sydney. At the heart of the KCCG are 2 Illumina HiSeq 2500 sequencers that are used for rapid turnover exome sequencing, and more recently, one the world’s first HiSeq X Ten sequencing suites, with capability of sequencing more than 300 whole human genomes per week. Since we intend to provide NATA‐certified, clinical‐grade sequencing, much of our work over the past 12 months has been focused on the development of standardised procedures for test procurement in the clinic through to wet‐lab processes, bioinformatics and clinical reporting. The bioinformatics workflow includes phenotype capture, read alignment, mutation calling, variant annotation and filtering by inheritance pattern, rarity, predicted functional impact and known disease association.
To date, we have sequenced exomes from >100 patients, from a range of conditions, largely reflecting the undiagnosed caseload at the Sydney Children’s Hospital. We will present some early success stories from sequencing these exomes and reflect on the possibilities presented by low‐cost whole genome sequencing in the diagnosis of inherited disease.

First presented at the 2014 Winter School in Mathematical and Computational Biology

Published in: Science
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Marcel Dinger - Translating exome and whole genome sequencing to the clinic

  1. 1. Translating exome and whole genome sequencing to the clinic Winter School in Mathematical and Computational Biology Institute for Molecular Bioscience, University of Queensland  9 July 2014 Winter School in Mathematical and Computational Biology Institute for Molecular Bioscience, University of Queensland  9 July 2014 Marcel Dinger Head of Clinical Genomics & Genome Informatics Garvan Institute of Medical Research Sydney Marcel Dinger Head of Clinical Genomics & Genome Informatics Garvan Institute of Medical Research Sydney
  2. 2. • Kinghorn Centre for Clinical Genomics • Clinical applications of genomic medicine • Implementation challenges • The future Overview
  3. 3. Kinghorn Centre for Clinical GenomicsKCCG was established at Garvan Institute of Medical Research in October 2012. The service is delivered in collaboration with the neighboring St Vincent’s Hospital and their pathology service (SydPath). Multidisciplinary team of 22 (and growing!) comprising laboratory scientists, bioinformaticians, software developers, geneticists and PhD students.
  4. 4. Kinghorn Centre for Clinical Genomics Two Illumina HiSeq 2500s capable of sequencing ~2 terabases per week (~350 exomes at 150X mean coverage). Dedicated high-performance computing cluster (1400 CPU cores, 10 TB of memory, 1 PB of storage) NATA accreditation (ISO15189 - medical testing) for exome sequencing and cancer enrichment panel scheduled for late 2014. In January 2014, KCCG became one of the world’s first sites to order the Illumina HiSeq X Ten (more on that later!)
  5. 5. Why the genome? The human genome provides the instruction for our development and function. Understanding the genetic basis for disease is a critical component not only in diagnosing and selecting treatment, but is also crucial for the design of new therapies. Mutations in the sequence that we inherit from our parents or that accumulate during life are the basis of the majority of human diseases, including cancer. Genomic sequencing has the potential to impact tremendously on health care treatment and prevention.
  6. 6. Clinical Genomics: what are the opportunities? Clinical genomics will (initially) have a major impact in three key areas: 1. Accurate diagnosis of inherited diseases, including rare diseases and intellectual impairment 2. Molecular stratification of cancer to direct treatment pathways 3. To optimise drug choices and drug dosages based on an individual’s genotype (pharmacogenomics)
  7. 7. Diagnosis of inherited disease Genetic testing has traditionally been done on a “per gene” basis. Humans have ~21,000 protein-coding genes - mutations in any of ~4,500 of these genes has been associated with an inherited disease. Rare diseases can be especially difficult to diagnose and clinicians are left to make informed guesses as to which gene test to order. With testing taking weeks to perform (often overseas) and costing upto $2,500 per test - such diagnoses can be extremely time-consuming and expensive. Whole genome (and exome) sequencing
  8. 8. Mendelian disorders Rare (but not collectively) monogenic diseases - many thousands solved, >2,000 left. Increased access to genomics has led to rapid identification of Mendelian disease genes. 0 23 45 68 90 3/2009 1/2010 3/2010 1/2011 3/2011 1/2012 3/2012 1/2013 Mendelian Disease Genes Published per Quarter 0. 22.5 45. 67.5 90. 3/2009 1/2010 3/2010 1/2011 3/2011 1/2012 3/2012 1/2013 Impact factor of journals of associated publications Mendelian gene discovery: Gene discovery is accelerating to 1 new Mendelian disease gene/day At currents rates, the primary allele for most recognized Mendelian disorders will be identified in 5 years
  9. 9. Example of simple diagnosis by WES Thin Basement Membrane (Kidney) Disease normal TBMD 4 typical culprits: COL4A3 COL4A4 COL4A5 AVPR2 COL4A3: p.Gly637Arg, c.1909G>A G/A G/GG/A G/A G/G Mark Cowley, Tim Furlong
  10. 10. Intellectual Disability (23) Epileptic encephalopathy (11) Skeletal (5) Immune (4) Syndromic (3) Eye (3) Haematological (3) Neurological (seizures) (1) Metabolic (1) Paediatric inherited disease cohort Tony Roscioli and Lisa Ewans, Sydney Children’s Hospital Network 53 families from Sydney Children’s Hospital Diverse phenotypes representative of a typical case- load for a clinical geneticist in a paediatric hospital Majority of cases had tested negative for routine genetic tests. Total of 122 exomes sequenced - mixture of trios, parent/child and individual. Let’s look at a few illustrative examples…
  11. 11. Phenotype • Consanguineous family • Siblings – brother and sister • ID and severe seizure disorder Analyse exome data with HomozygosityMapper Sibling 1 Sibling 2 Combined regions on Chr12 flagged Example I: A case for homozygosity Exome sequencing of brother and sister Tony Roscioli and Lisa Ewans
  12. 12. Example I: A case for homozygosity Identify genes in homozygous region on Chromosome 12 Genes in homozygous region on Chr 12 Determine genes with damaging mutations using CADD (Combined Annotation Dependent Depletion) Conclude variant AGAP2 is the likely causative mutation Literature evaluation of candidates • AGAP2 (Centaurin family gene) participates in the prevention of neuronal apoptosis by enhancing PI3 kinase activity. • Highly expressed in brain CADD is a highly sensitive tool to distinguish between pathogenic and benign variants Tony Roscioli and Lisa Ewans
  13. 13. Phenotype • 2 year 11 month old boy • Global developmental delay • Stopped standing, walking • Saudi family not known to be consanguineous • Hypotonia, weakness • Feeding difficulties • Weight loss • Brisk deep tendon reflexes • Impaired upgaze: Niemann-Pick considered; not confirmed on skin biopsy • 6 months later: his brother presented with the same features Example 2: A diagnostic odyssey Michell Farrar, Tony Roscioli and Lisa Ewans
  14. 14. Investigations • MRI revealed cerebellar atrophy • No cardiac or eye features noted • POLG, SURF1, common and rare mitochondrial mutations not identified • Urine: massive increase in dopamine metabolites (degenerating leaky neurones) • Neurophysiology: Motor axonal neuropathy, active denervation • Muscle biopsy revealed: Denervation of small fibre groups Large re-inervated fibres Tony Roscioli and Lisa Ewans
  15. 15. Homozygous regions extracted with HomozygosityMapper Sibling 1 Sibling 2 Combined regions: Peak on Chrom 22 Exome sequencing of brother and sister
  16. 16. GEMINI analysis Only one homozygous inframe codon deletion within a homozygous region Make sure you look at PLA2G6! Expert advice Paediatric neurologist, Dr Michelle Farrar Mutations confirmed in IGV
  17. 17. Diagnostic Odyssey ended for this family - Prenatal / Pre-implantation diagnosis now possible and family is confident to have a healthy child. Extra Evidence: Same mutation described in the literature
  18. 18. Family Phenotype Inheritance Consanguinity Candidate gene? Likelihood of real result 1a and b ID, severe seizures AR Yes Centaurin family gene High/Novel 2 plus trio ID polymicrogyria De novo AD? No GRIN2B Medium‐High 3 Severe developmental delay,  trigonocephaly De novo AD? AR? None known (KANSL1) Low‐Medium 4 Hermansky‐Pudlak syndrome,  oculocutaneous albinism AR Yes ?NCOA3 ?PLS1 ??VOPP1 Low‐Medium/Novel 5 Cone‐rod dystrophy X‐linked recessive No KCND1 Unknown 6 Retinitis pigmentosa AD No Unknown 7 Retinitis pigmentosa AD No Possible SNRNP200 Unknown  ‐ patient limited  exome after consent 8 plus trio Dysmorphic, ID, CdL like De novo AD? AR?  No SMC1A High 9 plus trio Severe DD, microcephaly,  chylothoraces De novo AD? AR?  X‐linked?  No paralog of ARHGEF6, known  MR gene Medium /Novel 10a and b Moderate ID, cerebellar  hypoplasia X‐linked No unclear unclear 11a and b Mild‐mod MR, cognitive decline,  glove‐stocking weakness X‐linked No unclear unclear 12a and b Mild‐mod ID X‐linked No unclear unclear 13a and b Severe ID, absent speech,  hypotonia, microcephaly X‐linked? AR? No unclear unclear 14a and b Neonatal Arthrogryposis AR comp het No RYR1 High 15a and b Neuronal Axonal Dystrophy AR Yes PLA2G6 High Diagnostic yield Cohort of 53 families: 20/53 probable (reportable) diagnosis (40%) 16/53 possible novel variants (30%) Varying degrees of success…
  19. 19. Is it economical? SMN1 molecular testing $690 Myotonic dystrophy DNA test on mother $506 Neurological appt for assessment mother $110 Myasthenia Gravis DNA testing $2,800 2 Micoarrays –both babies $1,200 3 Pathologists opinion both babies incl UK $720 Laminin A molecular testing $1,000 Postmortem $3,000 Muscle biopsy $144 Total $10,170 Cost summary: Case I MRI brain $441 Muscle biopsy $144 Skin biopsy $50 Surgical Session 4 h $2,651 Anaesthetist session 4h $1,884 Day Stay $1,946 ICU 24h non ventilated $326 EM concord $380 Dry ice to Melbourne $80 Mito analysis $1,400 SNP arrays $1,200 Nerve conduction $200 Total $10,703 Cost summary: Case 2 Average per family for two exomes: ~$,2000 Assuming would only get to a diagnosis 40% of the time: Still a saving of $6,000 per family Assume that half of the costs for medical care are still required Still an average saving of $3,000 per family 1,000 exomes per year would save $3 million per year
  20. 20. Whole genome sequencing will further increase diagnostic yields Diagnostic yields for ID with WGS estimated >60% Nature, June 2014
  21. 21. What would your genome tell you about yourself? Genome sequencing can provide vast information on health risks and disease susceptibilities. Of immediate benefit is carrier status of: (i) recessive disease for family planning (ii) cancer susceptibility genes (e.g. BRCA1/2) (iii) genes with known drug interactions (pharmacogenomics) Direct-to-consumer genetic testing (e.g. 23andMe) is growing rapidly. However, test information is relatively limited and offers minimal clinical information.
  22. 22. “4.” “Wellness” or “fitness” genomics Genome sequencing can provide vast information on health risks and disease susceptibilities. Of immediate benefit is carrier status of: (i) recessive disease for family planning (ii) cancer susceptibility genes (e.g. BRCA1/2) (iii) genes with known drug interactions (pharmacogenomics) Direct-to-consumer genetic testing (e.g. 23andMe) is growing rapidly. However, test information is relatively limited and offers minimal clinical information. Clinical interpretation of whole genome or exome sequences is more valuable, but remains very time-consuming. Many challenges remain before sequencing of newborns or well adults becomes clinically valuable.
  23. 23. What makes clinical genomics interesting to a researcher? Recruit patient cohort Sequence candidate genes Research Genetic test 8-10 years Patient Diagnosis Little interaction between research and implementation. aditional model of research translation:
  24. 24. The Opportunity and the Challenge: Bilateral Translational Research Genotype-Phenotype DatabasePatient Genome Sequencing Diagnosis Research Implementation of low-cost clinical whole genome sequencing will test whether this model can become reality.
  25. 25. Challenge 1. Clinical laboratory ≠ Research laboratory Return of results to patients requires that sequencing and analyses are performed to a clinical standard. Most countries require a form of accreditation (e.g. CLIA/CAP in USA, NATA in Australia and New Zealand). Accreditation requires demonstration of clinical utility, precision and accuracy of the test. Many variables including operator, instrument, reagent batch need to be routinely measured. Essentially all sources of variation and bias need to be accounted for and monitored. Other factors, such as persistent storage of data, independent validation, reporting and qualified expert interpretation also need to be considered. The combination of these overheads place considerable additional cost in the delivery of genomic data to the clinic. For clinical delivery, we need to massively streamline the process from end to end…
  26. 26. The journey from consult to report is long and complex… manage this journey in a clinical environment in the fast moving field of genomics?
  27. 27. Software Development for the Clinic Historically upgrades and change management are stressful and scary in a clinical setting Our process is not complete but strives for continuous improvement while retaining accuracy, documentation and accountability We build everything around the idea of constant change
  28. 28. Continuous Integration
  29. 29. Tracking and attribution of all commits and failures, with JIRA integration.
  30. 30. Genotype-Phenotype Database Challenge 3. Clinically robust genotype-phenotype database Development of federated database recording genotype-phenotype relationships - allow “Patients Like Mine” searches. Requires international collaboration between sequencing centers and careful recording of clinical phenotypes. The Human Gene Mutation Database (HGMD) is perhaps the gold-standard database for association of genotype with literature-annotated phenotypes (~7,000 entries). Controlled vocabulary for phenotyping is essential. Absence of characteristics can be just as important as presence of characteristics for identification of disease-causing variants. However, much of the literature is pre-genomic era: many annotations are incorrect. A mutation in HGMD cannot be assumed clinically relevant - extensive professional scrutiny is still required to make a clinical diagnosis.
  31. 31. Challenge 4. Delineation between clinic and research Many tools and databases are not suitable for clinical use…. PROVEAN clinical delivery will require a clearer delineation between clinic and research Homozygosity Mapper
  32. 32. The arrival of the “$1,000” genome
  33. 33. Illumina HiSeq X Ten Fleet of 10 instruments. First 6 installed - remaining 4 to come by October. 16 whole human genomes (>30X ~2 Tbases) every 3 days. Full capacity is 350 genomes per week or 18,000 per year. Real-world deliverable cost of 1,600 AUD per genome (interpretation costs will vary!) Possibility for population-scale sequencing and implementation into routine healthcare. Currently provided as an international service for research purposes - clinical accreditation targeted for 2015.
  34. 34. Slides from Jay Flatley @ Goldman Sachs Healthcare Conference 18/1/2014 What’s different about HiSeq X?
  35. 35. HiSeqX TenSix @ KCCG Machine Yield_G PCT_Q30 ST‐E00141 928.4 80.95 ST‐E00141 991.4 79.2 ST‐E00118 827.6 81.65 ST‐E00118 1035.4 87.8 ST‐E00118 986.6 84.6 ST‐E00118 1014.6 87.65 ST‐E00110 936 86.25 ST‐E00110 794.2 82.85 ST‐E00106 977.2 88.2 ST‐E00106 954 87.8 1 failed sample 40x30x
  36. 36. PERFORMANCE AT KCCG runs post June 2014-firmware upgrade 2 bad quality samples/preps 30/9/2014
  37. 37. PERFORMANCE AT KCCG runs post June 2014-firmware upgrade 2 bad quality samples/preps 30/9/2014
  38. 38. So what are we going to sequence?
  39. 39. Summary Many of the practical barriers for implementation of genomic medicine have been solved. Major limitations today are regulatory and societal (e.g. insurance). WES/WGS for diagnosis of inherited disorders is now practical and valuable with diagnostic yields approaching 50%. Genomic medicine blurs the lines between clinical and research. Many challenges remain in translating genomics in the clinic - particularly in the repurposing of research-grade software and tools for clinical applications. Clinical genomics represents a unique opportunity for bilateral translational medicine - mutually benefiting both the clinical research realms. Clinical-grade bioinformatics is tricky - but it presents many valuable lessons for researchers. High up-front investment, but many incidental errors can be avoided. High quality data relating genotype to phenotype is scarce. This vastly limits diagnostic accuracy without a phenotype (i.e. in well individuals) - lots of false positives.
  40. 40. Tony Roscioli Lisa Ewans Michael Buckley Scott Mead Michelle Farrar Glenda Mullan George Elakis Corrina Walsh Tony Roscioli Lisa Ewans Michael Buckley Scott Mead Michelle Farrar Glenda Mullan George Elakis Corrina Walsh Acknowledgements Sydney Children’s Hospital and SEALS Pathology
  41. 41. Warren Kaplan Mark Cowley Mark McCabe Kerith-Rae Dias Paula Morris Jiang Tao Aga Borcz Dahlia Saroufim Claire Horvat Liviu Constantinescu Peter Budd Kevin Ying Derrick Lin Shanny Dyer Russell Howard Bronwyn Terrill Amber Johns Warren Kaplan Mark Cowley Mark McCabe Kerith-Rae Dias Paula Morris Jiang Tao Aga Borcz Dahlia Saroufim Claire Horvat Liviu Constantinescu Peter Budd Kevin Ying Derrick Lin Shanny Dyer Russell Howard Bronwyn Terrill Amber Johns AcknowledgementsAcknowledgements Kinghorn Centre for Clinical Genomics