VAAST: Deciphering Genetic Disease with Next-Generation Sequencing
Published in: Science, Technology
  • I’m going to begin the discussion of VAAST with a simple description of how the pipeline runs
  • Numerator = null model (no difference); denominator = alternate model (difference).
  • The maximum likelihood of the null model over the maximum likelihood of the alternate model, weighted by the frequency of the AAS in the healthy dataset over the frequency of that AAS in a disease dataset. n = frequency of that AAS in the background; p = estimated probability of...; B=, T=, Y=, X=; a = frequency of this AAS in OMIM.
  • Miller Syndrome
  • A-to-G transition; serine to proline
  • Thank the family
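The likelihood ratio described in the notes above (null model in the numerator, alternate model in the denominator) can be illustrated with a minimal sketch. This is not VAAST's actual implementation: it assumes a plain binomial model of allele counts at a single site, leaves out the AAS weighting term, and reports the conventional statistic -2 ln(ratio). The function names `binom_loglik` and `lrt_statistic` are made up for illustration.

```python
import math


def binom_loglik(k, n, p):
    """Log-likelihood of seeing k variant alleles among n chromosomes
    at allele frequency p (the constant binomial coefficient is dropped,
    since it cancels in the ratio)."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll


def lrt_statistic(k_target, n_target, k_background, n_background):
    """-2 * ln(L_null / L_alt) for one variant site.

    Null model: target and background share one allele frequency.
    Alternate model: each group has its own allele frequency.
    Both likelihoods are evaluated at their maximum-likelihood estimates.
    """
    # Null model: pooled frequency estimate.
    p_null = (k_target + k_background) / (n_target + n_background)
    ll_null = (binom_loglik(k_target, n_target, p_null)
               + binom_loglik(k_background, n_background, p_null))

    # Alternate model: separate frequency estimates.
    p_t = k_target / n_target
    p_b = k_background / n_background
    ll_alt = (binom_loglik(k_target, n_target, p_t)
              + binom_loglik(k_background, n_background, p_b))

    return 2.0 * (ll_alt - ll_null)


if __name__ == "__main__":
    # Hypothetical counts: 2 affected genomes (4 chromosomes) all carrying
    # the allele, against 250 background genomes (500 chromosomes) with none.
    print(lrt_statistic(4, 4, 0, 500))
```

A larger statistic means the alternate model (a frequency difference between target and background) explains the counts better than the null model.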

    1. VAAST: Deciphering Genetic Disease with Next-Generation Sequencing. Barry Moore, M.S., Research Scientist, Department of Human Genetics, Department of Biomedical Informatics
    2. Outline: The VAAST Analysis Pipeline; Ogden Syndrome: Application of VAAST to a Genetic Disease of Unknown Cause; The Future of VAAST Development
    3. $10,000,000: Venter Genome; $1,000,000: Watson; $5,000: You?
    4. Next Generation Sequencing (figure): Disease vs. Healthy; geneA, geneB, geneX, geneY, geneZ
    5. Variant Annotation Tool; Variant Selection Tool; Variant Annotation, Analysis and Search Tool
    6. VAAST Pipeline (diagram): 3.5 million variants (GVF), a reference genome (FASTA), and reference genes (GFF3) feed VAT (Variant Annotation Tool), which produces annotated variants (GVF); VST (Variant Selection Tool) merges the annotated variant sets into a CDR file.
    7. VAAST Pipeline (same diagram as slide 6, with the annotation vocabulary added)
       • Variant Type: sequence_alteration, deletion, insertion, duplication, inversion, substitution, SNV, MNP, complex substitution, translocation
       • Variant Effect: sequence_variant, gene_variant, five_prime_UTR_variant, three_prime_UTR_variant, exon_variant, splice_region_variant, splice_donor_variant, splice_acceptor_variant, intron_variant, coding_sequence_variant, stop_retained, stop_lost, stop_gained, synonymous_codon, non_synonymous_codon, amino_acid_substitution, frameshift_variant, inframe_variant
    8. VAAST Pipeline (repeat of slide 7)
    9. VAAST Pipeline (diagram, continued): background genomes (CDR) and target genomes (CDR) feed VAAST, which outputs prioritized candidate genes (VAAST report).
    10. Key Features of VAAST: Probabilistic; Feature Based; Both Allele and AAS Frequencies; Considers Inheritance Model; Fast; Standardized Ontology Based Format; Modular and Flexible in Design
    11. VAAST Uses Variant Frequencies in a Probabilistic Fashion. Likelihood ratio test: the maximum likelihood of the null model (no difference) divided by the maximum likelihood of the alternate model (there is a difference).
    12. VAAST Uses Variant Frequencies in a Probabilistic Fashion
    13. VAAST Uses Variant Frequencies in a Probabilistic Fashion
       • VAAST gives us the likelihood of the composite genotype at GENE X in the target given the background.
       • Do allele frequencies differ between background and target genomes within a given gene or feature?
       • The composite likelihood calculation assumes independence across sites. To control for LD, statistical significance is estimated by permutation test.
       • Multiple test correction for the number of features (~20,000) is two orders of magnitude better than for the number of variants (~3,500,000).
    14. Noise Decreases Dramatically with Increasing Number of Genomes (figure): 1 genome target, 1 genome background
    15. (figure) 1 genome target, 10 genome background
    16. (figure) 1 genome target, 250 genome background
    17. (figure) 1 genome target, 250 genome background, trio data
    18. Alleles Responsible for Miller Syndrome in Utah Kindred (pedigree figure with two panels, CHR 16: DHODH and CHR 5: DNAH5, each showing Mom, Dad, Son, and Daughter). Son and Daughter each carry both alleles in each gene: G:R and G:A in DHODH, R:Q and R:* in DNAH5. References: Ng et al., Nature Genetics 42, 30–35 (2010), doi:10.1038/ng.499; Roach et al., Science 328, 636 (2010).
    19. Schematic of VAAST Analysis of Utah Miller Kindred Using a Single Quartet (figure; DHODH and DNAH5 highlighted)
    20. Average Rank for 100 Dominant and Recessive Diseases (bar chart). Y axis: average rank genome-wide. Groups: dominant, recessive. Legend (size of case cohort): 2, 4, and 6 allele copies. Data labels shown: 156, 132, 21, 9, 8, 3. 443 genomes in background.
    21. Impact of Missing Data (bar chart). Y axis: average rank genome-wide. Groups: dominant, recessive. Legend: 2 of 6, 4 of 6, and 6 of 6 allele copies. Data labels shown: 639, 373, 61, 21, 9, 3. 443 genomes in background.
    22. Outline: The VAAST Analysis Pipeline; Ogden Syndrome: Application of VAAST to a Genetic Disease of Unknown Cause; The Future of VAAST Development
    23. A Rare X-linked Mendelian Disorder
       • A Utah family coming to the University Hospital for 20+ years
       • About half of the male offspring die around 1 year of age
       • Aged appearance
       • Craniofacial anomalies
       • Hypotonia
       • Global developmental delays
       • Cardiac arrhythmias
    24. Four Affected Boys over Two Generations (pedigree figure; generations I, II, III)
    25. Exome Sequencing
       • Agilent SureSelect In-Solution X Chromosome Capture
       • Covaris S series sonication (150-200 bp)
       • 76 bp single-end reads on one lane each of the Illumina GAIIx
       Variant Calling
       • Sequence alignment with bwa
       • Remove duplicate reads with PICARD
       • Realign indel regions with GATK
       • Variant calling with Samtools, GATK
    26. Identifying Candidate Genes: VAAST Identifies NAA10 as Candidate Gene
       • About 20 min. run time
       • 3 candidate genes (NAA10 ranked 2), proband only
       • 1 candidate gene (NAA10) with pedigree
    27. Additional Analyses
       • Microarray-based CNV analysis: no likely causal variants found
       • Sanger sequencing confirmation: variant segregates perfectly with disease in 13 family members
       • Haplotype sharing (STR genotyping): ~11 Mb shared between two affected boys
       • A second family discovered with the same mutation; IBD relatedness analysis indicates independent mutational events
    28. N(alpha)-acetyltransferase
       • N-alpha-acetylation is one of the most common protein modifications that occur during protein synthesis.
       • NatA (catalytic subunit NAA10, also called hARD1)
       • Eight exons, Crick strand, highly conserved
       • A:G transition causes p.Ser37Pro
    29. Functional Analyses
       • Quantitative in vitro N-terminal acetylation assay (RP-HPLC)
       • Four peptide substrates previously shown to be acetylated by NatA (NAA10)
       • Assays indicate a loss-of-function allele.
    30. Functional Analyses (figure)
    31. VAAST in Summary
       • Probabilistic Disease Gene Finder
       • Feature Based, not Variant Based
       • Both Allele and AAS Frequencies
       • Considers Inheritance Model
       • As few as two target genomes can be sufficient to identify the causative gene.
       • Background Genomes are “Reusable”
       • Not Limited to Human Analyses
    32. VAAST: Future Directions: Indel support; Splice-site; No-call support; Pedigree support; Phylogenetic conservation
    33. Acknowledgements
       • VAAST Development: Chad Huff, Hao Hu, Lynn Jorde, Barry Moore, Martin Reese, Marc Singleton, Jinchuan Xing, Mark Yandell
       • Yandell Lab: Michael Campbell, Daniel Ence, Guozhen Fan, Steven Flygare, Hao Hu, Zev Kronenberg, Barry Moore, Marc Singleton, Robert Ross, Mark Yandell
       • Ogden Syndrome: Thomas Arnesen, John Carey, Rune Evjenth, Steven Chin, Johan R. Lillehaug, Heidi Deborah Fain, Gholson Lyon, Leslie G. Biesecker, John Opitz, Jennifer J. Johnston, Theodore J. Pysher, Alan Rope, Cathy A. Stevens, Reid Robison, Sarah T. South, Brian Dalley, Tao Jiang, Jeffrey Swensen, Chad Huff, Evan Johnson, Hakon Hakonarson, Barry Moore, Lynn B. Jorde, Christa Schank, Mark Yandell, Kai Wang, Jinchuan Xing
    34. Acknowledgements
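Slide 13 makes two statistical points: significance is estimated by permutation because the composite likelihood treats sites as independent, and correcting for ~20,000 features is far cheaper than correcting for ~3,500,000 variants. The sketch below illustrates both under stated assumptions. It uses a simple binomial per-site score summed over a gene rather than VAAST's actual scoring, it assumes a Bonferroni-style correction (the slide does not name the method), and every function and variable name is hypothetical. Whole genomes are shuffled between the two groups, so correlations between sites within a genome are preserved in every permutation.

```python
import math
import random

# Bonferroni-style thresholds (assumed correction; counts are from the slide).
ALPHA = 0.05
N_FEATURES = 20_000
N_VARIANTS = 3_500_000
feature_threshold = ALPHA / N_FEATURES
variant_threshold = ALPHA / N_VARIANTS


def _loglik(k, n, p):
    """Binomial log-likelihood without the constant coefficient."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll


def site_score(k_t, n_t, k_b, n_b):
    """-2 ln(L_null / L_alt) for one site under a binomial model."""
    p0 = (k_t + k_b) / (n_t + n_b)
    ll_null = _loglik(k_t, n_t, p0) + _loglik(k_b, n_b, p0)
    ll_alt = _loglik(k_t, n_t, k_t / n_t) + _loglik(k_b, n_b, k_b / n_b)
    return 2.0 * (ll_alt - ll_null)


def gene_score(target, background):
    """Sum per-site scores over a gene (sites treated as independent).

    Each genome is a list of variant-allele counts (0, 1, or 2) per site."""
    total = 0.0
    for s in range(len(target[0])):
        k_t = sum(g[s] for g in target)
        k_b = sum(g[s] for g in background)
        total += site_score(k_t, 2 * len(target), k_b, 2 * len(background))
    return total


def permutation_p_value(target, background, n_perm=1000, seed=0):
    """Empirical p-value from shuffling genomes between the two groups."""
    rng = random.Random(seed)
    observed = gene_score(target, background)
    pooled = list(target) + list(background)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_target = pooled[:len(target)]
        perm_background = pooled[len(target):]
        if gene_score(perm_target, perm_background) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: 2 affected genomes homozygous at the first site,
    # 20 background genomes with no variant alleles.
    target = [[2, 0], [2, 0]]
    background = [[0, 0] for _ in range(20)]
    print(permutation_p_value(target, background, n_perm=500, seed=1))
    print(feature_threshold, variant_threshold)
```

The ratio of the two thresholds is 3,500,000 / 20,000 = 175, which matches the slide's "two orders of magnitude".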