Next Generation Sequencing and its Applications in Medical Research - Francesc Lopez

The so-called “next-generation” sequencing (NGS) technologies allow us to sequence massive amounts of DNA in parallel and in a short time, overcoming the limitations of the original Sanger sequencing methods used to sequence the first human genome. NGS technologies have had an enormous impact on biomedical research within a short time frame. This talk gives an overview of these applications, with specific examples from Mendelian genomics and cancer research. #h2ony

Published in: Data & Analytics

  1. Next-Generation Sequencing and its Applications in Medical Research
     • Francesc Lopez, Yale Center for Genome Analysis, Dept. of Genetics (francesc.lopez@yale.edu)
  2. Brief History of DNA Sequencing
     • 1953: Discovery of DNA structure by Watson and Crick
     • 1973: First sequence of 24 bases published
     • 1977: Sanger sequencing method published
     • 1982: GenBank started
     • 1987: First automated sequencer: Applied Biosystems Prism 373 (up to 600 bases)
     • 1996: First capillary sequencer: ABI 310
     • 2000-2003: Human genome sequenced
     • 2005 onward: First NGS sequencers: 454 Life Sciences, Solexa/Illumina, Helicos, Ion Torrent
  3. Sanger vs. NGS
     • Sequencing the human genome with Sanger technology took more than a decade and cost an estimated $70 million.
     • Human genome: 3.3 × 10^9 bases, ~20,000 genes
     • In 3 days (one run), an Illumina HiSeq 4000 can produce 1,680 × 10^9 bases for ~$32,000.
  4. Yale Center for Genome Analysis (YCGA)
     • Production facility: 7,000 sq ft dedicated facility
     • 25 full-time staff, including 4 PhD-level bioinformaticians
     • Dedicated computation infrastructure: 3.5 petabytes of data storage, 4,500-core HPC
  5. Sequencing Platforms at YCGA
     • 7 Illumina HiSeqs (5 HiSeq 2500, 2 HiSeq 4000)
     • One PacBio RS
     • Illumina MiSeq
     • Ion PGM™ Sequencer
  6. Progress made at YCGA in the past years
     • Trend of sequencing data output at YCGA: sequencers are operated at ~70% of maximum capacity
     • [Chart: library prep sample types processed at YCGA (ChIP, whole genome, mRNA, micro RNA, Seqcap, whole exome); slice labels 1%, 5%, 30%, 1%, 63%]
  7. Whole-Genome vs. Whole-Exome Sequencing
     • Protein-coding genes (the exome) constitute 1.5% of the human genome but harbor 85% of disease-causing mutations.
     • Significantly cheaper than sequencing the entire genome
     • >50,000 exomes sequenced at YCGA
     • Choi et al., PNAS 2009
  8. FastQ format: single read
       @SEQ_ID
       GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
       +
       !''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
     • × 5 × 10^9 reads in one run of a HiSeq 4000
  9. A T G C
  10. Variant annotation
      • General parameters: genome build hg38; variant caller GATK; annotation gene reference refGene
      • Variant frequency DBs: ExAC (61,000 exomes), dbSNP, 1000 Genomes, NHLBI exomes (6,500), Yale exomes (2,500)
      • Conservation: PhyloP; 44 vertebrate species; 2 invertebrate species (fly and worm)
      • Functional prediction: PolyPhen-2, SIFT
      • Gene annotation: OMIM, GO, KEGG, Jackson Lab knockout
  11. Sequencing a genome is simple; finding the cause of a disease is not
      • "First clinical use of whole genome sequencing shows just how challenging it can be." Genomes on prescription, Nature 2011
  12. DNA Sequencing and Precision Medicine
      • Precision medicine: use of genomics to tailor medical care to individuals based on their genetic makeup.
      • Diagnosis: Is it benign?
      • Classification: Which class of cancer?
      • Prognosis: What are my chances?
      • Therapeutic choice: Which treatment?
      • Discovery (how and why): elucidation of the mechanism of cause; identification of cancer biomarkers; therapeutic targets
  13. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Choi M, et al. (2009) PNAS 106 (45): 19096-101
      • A 5-month-old child presented with failure to thrive and dehydration.
      • Treatments for kidney disease failed.
      • Captured 180,000 exons of 18,673 protein-coding genes, comprising 34.0 Mb of genomic sequence
      • Identified a mutation in the SLC26A3 gene, which causes congenital chloride diarrhea; treatments for that condition have effectively managed the disease
      • Demonstration of the clinical utility of whole-exome sequencing and its implications for disease gene discovery and clinical diagnosis
  14. In the first 362 trios (affected proband), ~2,000 putative de novo pre-filtered variants were detected.
  15. Gene burden analysis: unrelated patients with the same clinical output
  16. Comparing variants from cases and controls per gene allows detection of disease-causing genes
  17. Broad, Baylor/Hopkins, U of Washington, and Yale
      • More than 6,000 rare Mendelian disorders affecting more than 25 million individuals in the US
      • Discover the genes and variants responsible for as many Mendelian phenotypes as possible
      • Develop and disseminate improved methods for disease gene discovery and analysis
      • Educate colleagues and the public regarding Mendelian disease
      • Whole-exome/whole-genome analysis is carried out at no cost and on a collaborative basis
  18. Acknowledgments: Yale Center for Genome Analysis; Prof. Lifton Lab; YCGA staff
  19. In 2013, Angelina Jolie tested positive for a BRCA1 mutation
  20. Nine unrelated kindreds with an apparent recessive mode of inheritance.
  21. Filtering Recessive Variants (Lemaire et al., Nature Genetics 2013)
                                     Kindred 1                Kindred 2
      Filter                         Subject 1   Subject 2    Subject 1   Subject 2
      High quality                   12,326      12,094       12,959      12,753
      Protein altering                3,151       3,072        3,283       3,227
      Rare in control databases*          4           1            2           5
      Same gene (DGKE)                      1                        1
      * Yale exome database, NHLBI ESP exome, 1000 Genomes
  22. Some machine learning applications in genetics and genomics
      • Gene prediction (2002): predict which regions of the genome code for proteins.
      • RNA secondary structure prediction (2006): predict the base-pairing interactions within a strand of RNA.
      • Transcription factor target prediction (2007): predict the sequence of bases most likely to bind a specific transcription factor.
      • Base calling (2009): predict the base photographed by an Illumina sequencing device during a sequencing-by-synthesis reaction.
      • Enhancer prediction (2012): predict regions of the genome that act as enhancers for expression, using information about the epigenetic marks present on the chromosomes.
      • Splicing code (2015): predict how a mutation within a gene will affect the splicing of that gene's transcript.
      • Pathogenicity prediction (2015): predict the functional impact of a mutation in a sample of DNA.
      • Pharmacogenomics (2011): predict whether mutations in a person's DNA will affect how a drug works in their body.
      • Predicting the functions of long noncoding RNAs (2015)
      • Predicting effects of noncoding variants using predicted DNaseI hypersensitivity, histone modifications, and transcription factor binding (2015)
      • Predicting RNA editing (2016)
  23. List of select publications resulting from next-generation sequencing usage at YCGA
      • Whole-exome sequencing identifies recessive WDR62 mutations in severe brain malformations. Bilguvar. Nature, v467, 2010
      • A Novel miRNA Processing Pathway Independent of Dicer Requires Argonaute2 Activity. Cifuentes. Science, v328, 2010
      • Mitotic recombination in ichthyosis causes reversion of dominant mutations in KRT10. Choate K. Science, v330, 2010
      • Transcriptomic analysis of avian digits reveals conserved and derived digit identities in birds. Wang S. Nature, v477, 2011
      • Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Lynch and Wagner. Nat Genet., v43, 2011
      • K+ channel mutations in adrenal aldosterone-producing adenomas and hereditary hypertension. Choi M. Science, v331, 2011
      • Recessive LAMC3 mutations cause malformations of occipital cortical development. Barak and Gunel. Nat Genet., v43, 2011
      • Spatio-temporal transcriptome of the human brain. Kang and Sestan. Nature, v478, 2011
      • Langerhans cells facilitate epithelial DNA damage and squamous cell carcinoma. Modi and Girardi. Science, v335, 2012
      • Mutations in kelch-like 3 and cullin 3 cause hypertension and electrolyte abnormalities. Boyden et al. Nature, v482, 2012
      • De novo point mutations are strongly associated with Autism Spectrum Disorders. Sanders and State. Nature, v485, 2012
      • Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma. Krauthammer. Nat Genet., v44, 2012
      • Genomic Analysis of Non-NF2 Meningiomas Reveals Mutations in TRAF7, KLF4, AKT1, and SMO. Clark V et al. Science, v339, 2013
      • De novo mutations in histone-modifying genes in congenital heart disease. Zaidi and Lifton. Nature, v498, 2013
      • Recessive mutations in DGKE cause atypical hemolytic-uremic syndrome. Lemaire and Lifton. Nat Genet., v45, 2013
      • Somatic and germline CACNA1D calcium channel mutations in aldosterone-producing adenomas. Scholl and Lifton. Nat Genet., v45, 2013
      • The evolution of lineage-specific regulatory activities in the human embryonic limb. Cotney and Noonan. Cell, v154, 2013
      • Mutations in DSTYK and dominant urinary tract malformations. Sanna-Cherchi and Gharavi. N Engl J Med., 2013
      • Nanog, Pou5f1 and SoxB1 activate zygotic gene expression during the maternal-to-zygotic transition. Lee et al. Nature, 2013
      • Co-expression networks implicate mid-fetal deep cortical projection neurons in the pathogenesis of autism. State. Cell, 2013
      • CLP1 Founder Mutation Links tRNA Splicing and Maturation to Cerebellar Development. Schaffer and Gleeson. Cell, v157, 2014
      • Exome sequencing links corticospinal motor neuron disease to neurodegenerative disorders. Novarino and Gleeson. Science, v343, 2014
      • Recurrent mutations in NF1 and RASopathy genes in sun-exposed melanomas. Krauthammer and Halaban. Nat Genet., v47, 2015
      • Genetic Causes for Congenital Heart Disease with Neurodevelopmental and Other Deficits. Homsy J et al. Science, 2015
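
Slide 8 shows a single FASTQ record: a header line starting with `@`, the base calls, a `+` separator, and one quality character per base. As a minimal sketch of how such a record can be read, the Python below parses the four-line layout and converts each quality character to a Phred score. It assumes the Phred+33 encoding (the slide does not name the encoding, although its quality string starts with `!`, which is score 0 under Phred+33) and assumes each read occupies exactly four lines. The names `FastqRecord`, `parse_fastq`, and `error_probability` are illustrative, not part of any YCGA pipeline.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ quality encoding


class FastqRecord(NamedTuple):
    seq_id: str
    sequence: str
    qualities: List[int]  # one Phred score per base


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream that uses four lines per read."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        header = header.strip()
        sequence = handle.readline().strip()
        separator = handle.readline().strip()
        quality = handle.readline().strip()
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        scores = [ord(char) - PHRED_OFFSET for char in quality]
        yield FastqRecord(header[1:], sequence, scores)


def error_probability(phred: int) -> float:
    """Convert a Phred score Q to the base-call error probability 10^(-Q/10)."""
    return 10 ** (-phred / 10)


# The record from slide 8, reproduced verbatim.
EXAMPLE = (
    "@SEQ_ID\n"
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT\n"
    "+\n"
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65\n"
)

record = next(parse_fastq(io.StringIO(EXAMPLE)))
mean_quality = sum(record.qualities) / len(record.qualities)
```

A generator is used so that a file with billions of reads, as in the HiSeq 4000 run mentioned on the slide, is never held in memory at once.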

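Slide 21 narrows each subject's variants through three filters (high quality, protein altering, rare in control databases) and then looks for a gene shared across kindreds. The steps can be sketched in Python as below. This is an illustration under stated assumptions, not the method of Lemaire et al.: the `Variant` fields, the two thresholds, the restriction to homozygous calls (compound heterozygotes are ignored), and all toy data except the gene name DGKE are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Set

MIN_QUALITY = 30.0        # hypothetical quality cutoff
MAX_CONTROL_FREQ = 0.001  # hypothetical "rare in controls" cutoff


class Variant(NamedTuple):
    gene: str
    quality: float          # variant-caller quality score (hypothetical scale)
    protein_altering: bool  # e.g. missense, nonsense, frameshift, splice site
    control_freq: float     # allele frequency in the control databases
    homozygous: bool        # simplification: only homozygous calls are kept


def candidate_genes(variants: Iterable[Variant]) -> Set[str]:
    """Genes with at least one variant that passes every filter."""
    return {
        v.gene
        for v in variants
        if v.quality >= MIN_QUALITY
        and v.protein_altering
        and v.control_freq <= MAX_CONTROL_FREQ
        and v.homozygous
    }


def shared_genes(kindreds: Dict[str, List[List[Variant]]]) -> Set[str]:
    """Genes that survive filtering in every subject of every kindred."""
    per_kindred: List[Set[str]] = []
    for subjects in kindreds.values():
        gene_sets = [candidate_genes(subject) for subject in subjects]
        if not gene_sets:
            return set()  # a kindred with no subjects cannot share a gene
        per_kindred.append(set.intersection(*gene_sets))
    return set.intersection(*per_kindred) if per_kindred else set()


# Toy data: GENE_A to GENE_D are placeholders, not real findings.
TOY_KINDREDS = {
    "kindred_1": [
        [Variant("DGKE", 50, True, 0.0, True),
         Variant("GENE_A", 50, True, 0.0, True),
         Variant("GENE_B", 10, True, 0.0, True)],    # fails the quality filter
        [Variant("DGKE", 45, True, 0.0, True),
         Variant("GENE_C", 60, True, 0.2, True)],    # too common in controls
    ],
    "kindred_2": [
        [Variant("DGKE", 55, True, 0.0005, True),
         Variant("GENE_A", 40, True, 0.0, True)],
        [Variant("DGKE", 70, True, 0.0, True),
         Variant("GENE_D", 70, False, 0.0, True)],   # not protein altering
    ],
}

result = shared_genes(TOY_KINDREDS)
```

Intersecting within each kindred first and across kindreds second mirrors the order on the slide, where each kindred is reduced to one gene before the two are compared.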