Mouse Genomes Project + RNA-Editing

  1. Mouse genomic variation and its effect on phenotypes and gene regulation. Thomas Keane, Vertebrate Resequencing Informatics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK. NUI Maynooth, 20th April 2012
  2. Mouse genomic variation and its effect on phenotypes and gene regulation
     • Mouse Genomes Project
     • RNA-Editing
  3. Sequencing Technologies over the past 30 years. MR Stratton et al. Nature 458, 719-724 (2009)
  4. Sanger total sequence, 2007-2009 (chart; axis in Gbp)
  5. Sanger total sequence to date (chart; axis in Gbp; annotation: HiSeq 2000)
  6. The Laboratory Mouse
  7. Mouse Genome Project (2002)
  8. International Knockout Mouse Consortium
  9. Large Outbred Crosses
     • Take a founder set of inbred strains and cross them randomly
         • Heterogeneous stock
         • Collaborative Cross
     • Large numbers of resulting mice
         • Comprehensively phenotyped
         • Recurrent phenotypes assessed
         • Identify QTL regions
             • Knowing the origin of haplotype blocks
     • Collaborative Cross Consortium (2009) Genetics
     • Full sequence variation of founder mice required to find potential causative mutations
  10. Mouse Genomes Project
     • Sequencing 18 laboratory mouse strains
         • Largest effort to date to sequence genomes of laboratory mouse strains
     • Primary goals
         • Deep sequencing of each strain (>25x)
         • Comprehensive catalog of sequence variation
     • What sort of variation?
         • SNPs: single base changes (A->G etc.)
         • Indels: insertions or deletions of a few bases
         • Structural variation: larger structural differences
     • Illumina sequencing platform
         • Raw data was generated in 2009
         • Approx. 1.2Tbp
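As a rough sanity check on the depth target above, the approximately 1.2Tbp of raw data can be divided across the 18 strains. The sketch below assumes a mouse genome size of about 2.7Gbp, a value that is not given on the slide, and it ignores duplicates and unmapped reads, so it only estimates mean raw depth.

```python
# Assumed value (not from the slides): approximate mouse genome size in bp.
ASSUMED_GENOME_SIZE_BP = 2.7e9


def mean_raw_depth(total_bases, n_strains, genome_size_bp=ASSUMED_GENOME_SIZE_BP):
    """Average raw depth per strain, assuming bases are spread evenly over strains."""
    return total_bases / n_strains / genome_size_bp


# 1.2 Tbp and 18 strains are the figures quoted on the slide.
depth = mean_raw_depth(total_bases=1.2e12, n_strains=18)
print(f"{depth:.1f}x")  # 24.7x
```

Under these assumptions the mean comes out at roughly 25x, which is in line with the stated goal given that both inputs are approximate.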
  11. What does the data look like?
     • Whole-genome shotgun (WGS)
     • Sequence both ends of millions of fragments in parallel
         • 300-500bp in size
         • Read 100bp of sequence from each end
     • Figure label: Reference
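The paired-end geometry on this slide implies an unsequenced stretch in the middle of each fragment. A minimal sketch of that arithmetic, using only the fragment and read lengths quoted above (the function name is illustrative):

```python
def inner_distance(fragment_length, read_length):
    """Bases between the two reads of a pair; negative means the reads overlap."""
    return fragment_length - 2 * read_length


# Fragment sizes and read length as quoted on the slide.
for fragment in (300, 500):
    print(fragment, inner_distance(fragment, read_length=100))
```

For 300-500bp fragments and 100bp reads, the unsequenced gap is therefore 100-300bp. Knowing this expected spacing is what lets read pairs that map unusually far apart or close together hint at structural variation.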
  12. Variation Catalog. Keane et al (2011) Nature
  13. Variation Catalog
  14. Results of the project
     • Genomic variation and its effect on phenotypes
     • Jonathan Flint & Richard Mott; Keane et al (2011) Nature
  15. Results of the project
     • Genomic variation and its effect on phenotypes
     • Structural variation catalog
     • Binnaz Yalcin, Kim Wong, Thomas Keane, Jonathan Flint; Yalcin et al. (2010) Nature
  16. Results of the project
     • Genomic variation and its effect on phenotypes
     • Structural variation catalog
     • Structural variation methods: SVMerge
     • Wong, Keane, Stalker, Adams (2010) Gen Biol
  17. Results of the project
     • Genomic variation and its effect on phenotypes
     • Structural variation catalog
     • Structural variation methods
     • Novel structural variation types
     • Binnaz Yalcin & Kim Wong; Yalcin et al. (2012) Gen Biol
  18. Results of the project
     • Genomic variation and its effect on phenotypes
     • Structural variation catalog
     • Structural variation methods
     • Novel structural variation types
     • Transposable elements
     • Nellaker, Keane, Wong et al., under review
  19. Results of the project
     • Genomic variation and its effect on phenotypes
     • Structural variation catalog
     • Structural variation methods
     • Novel structural variation types
     • Transposable elements
     • RNA-Editing...
  20. Mouse genomic variation and its effect on phenotypes and gene regulation
     • Mouse Genomes Project
     • RNA-Editing
  21. RNA-Editing
     • Site-selective post-transcriptional alteration of double-stranded RNA
     • Adenosine deaminase acting on RNA (ADAR) family of enzymes
         • Convert adenosine residues to inosines
         • Observed as A-to-G SNPs in cDNA
     • ADARs
         • Bind to double-stranded regions of RNA
         • Modify multiple neighbouring adenosines
     • Apobec-1 mediated C-to-U RNA editing
     • Novel source of protein isoform diversity
         • HTR2C gene: five edit sites lead to 28 mRNAs
     • Wulff and Nishikura (2009) WIREs RNA
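To see why a handful of edit sites can generate so many transcripts, the sketch below enumerates every edited/unedited pattern over five sites. It assumes each site is edited independently and is either fully edited (read as G) or unedited (A), which is a simplification; it gives the combinatorial ceiling of 2^5 = 32 patterns and says nothing about which of them are actually observed, so it does not reproduce the count of 28 quoted on the slide.

```python
from itertools import product


def edit_patterns(n_sites):
    """All unedited (A) / edited (G) patterns across n independent A-to-I sites."""
    return ["".join(p) for p in product("AG", repeat=n_sites)]


patterns = edit_patterns(5)
print(len(patterns))  # 32
print(patterns[0], patterns[-1])  # AAAAA GGGGG
```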
  22. HTR2C gene. Wahlstedt et al (2009) Gen Res
  23. RNA-Seq
     • Isolate RNA and reverse transcribe to cDNA
         • Fragment cDNA and directly sequence
         • No reference bias and huge dynamic range
     • Uses
         • Gene expression analysis
         • Transcript discovery and annotation of new genomes
         • Alternative splicing
     • RNA-editing
         • Align the RNA-seq reads to the reference genome
         • If the bases disagree with the genomic sequence data at the corresponding position...
     • McIntyre et al (2011) BMC Gen
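The comparison described in the last two bullets can be sketched for a single position as follows. This is an illustration only, not the project's caller: the function name and the `min_fraction` threshold are hypothetical, the input is simply the genomic base plus the RNA-seq bases piled up at that position, and the 10x depth cutoff mirrors the minimum depth named on the pipeline slide.

```python
from collections import Counter


def candidate_edit(genomic_base, rna_bases, min_depth=10, min_fraction=0.1):
    """Return (mismatch_type, editing_level) when RNA-seq disagrees with the genome.

    genomic_base: base called from genomic sequencing of the same strain.
    rna_bases:    string of RNA-seq bases observed at the same position.
    min_fraction: hypothetical cutoff on the fraction of mismatching reads.
    """
    depth = len(rna_bases)
    if depth < min_depth:
        return None  # too few reads to judge
    mismatches = Counter(b for b in rna_bases if b != genomic_base)
    if not mismatches:
        return None  # RNA agrees with the genome
    alt, n_alt = mismatches.most_common(1)[0]
    level = n_alt / depth
    if level < min_fraction:
        return None  # more likely sequencing error than editing
    return f"{genomic_base}-to-{alt}", level


print(candidate_edit("A", "AAAAAAGGGG"))  # ('A-to-G', 0.4)
```

Using the strain's own genomic base instead of the reference base is the point of pairing RNA-seq with genome sequencing: a strain-specific SNP would otherwise look like an edit.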
  24. RNA-Editing? (figure tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA)
  25. Human RNA-Editing. Li et al. (2011) Science
  26. Human RNA-Editing. Li et al. (2011) Science
  27. RNA-Seq is not the same as genomic sequencing
     • Alignment of RNA-Seq reads is not trivial
         • Most genomic short read aligners are not splice aware
     • Figure tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA
  28. RNA-Seq is not the same as genomic sequencing
     • What about processed pseudo-genes?
     • Pink et al (2011) RNA
     • Figure labels: cDNA fragment; Pseudogene; Functioning gene; Exon 1, Exon 2
  29. What about in mouse?
     • Mouse Genomes Project
         • RNA-Seq of 15 mouse strains
         • Whole-brain tissue
         • 2-4 biological replicates per strain
         • ~5Gbp per replicate
     • Previous catalogs
         • Neeman et al.
         • Zaranek et al. - several tens of gigabases of human and mouse cDNA sequence
         • Rosenberg et al. - RNA-seq for C57BL/6J strain
     • Hindered by lack of corresponding genomic sequencing
     • We generated
         • Deep whole genome sequencing
         • Corresponding RNA-Seq from whole-brain tissue across 15 strains
             • 2-4 biological replicates
  30. Our Pipeline (flowchart)
     • Inputs: SNVs from gDNA; SNVs from cDNA after splice-aware realignment
     • 304,817 candidate sites; 98,061 unambiguous sites
     • Filtering steps shown: Minimum Depth 10x, Replicate Consistency, End Distance Bias, Strand Bias, Variant Distance Bias
     • Site counts shown between steps: 62,889; 59,775; 42,238; 36,213; 31,923; 7,133
     • 5,579 filtered sites, estimated FDR 2.9%; no assumptions about the nature of editing made
     • Cluster extension: one-type mismatch clusters added, assuming editing by ADARs, which usually occurs in clusters
     • 7,389 final sites
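A chain of filters like the one in this flowchart can be sketched as a list of named predicates applied in order, with the surviving site count recorded after each step. Everything here except the 10x minimum depth is an assumption made for illustration: the field names, the replicate and strand thresholds, and the toy sites are hypothetical and are not the project's actual criteria or data.

```python
def apply_filters(sites, filters):
    """Apply named filters in order and report how many sites survive each one."""
    survivors = list(sites)
    report = [("candidates", len(survivors))]
    for name, keep in filters:
        survivors = [s for s in survivors if keep(s)]
        report.append((name, len(survivors)))
    return survivors, report


filters = [
    # Threshold taken from the slide.
    ("minimum depth 10x", lambda s: s["depth"] >= 10),
    # Hypothetical rule: edit must be seen in at least two biological replicates.
    ("replicate consistency", lambda s: s["replicates_with_edit"] >= 2),
    # Hypothetical rule: edited reads must occur on both strands.
    ("strand bias", lambda s: min(s["fwd_alt"], s["rev_alt"]) > 0),
]

# Toy input, invented for the example.
sites = [
    {"pos": 101, "depth": 25, "replicates_with_edit": 3, "fwd_alt": 4, "rev_alt": 5},
    {"pos": 202, "depth": 6, "replicates_with_edit": 3, "fwd_alt": 2, "rev_alt": 1},
    {"pos": 303, "depth": 40, "replicates_with_edit": 1, "fwd_alt": 6, "rev_alt": 7},
    {"pos": 404, "depth": 18, "replicates_with_edit": 2, "fwd_alt": 9, "rev_alt": 0},
]

kept, report = apply_filters(sites, filters)
for name, count in report:
    print(f"{name}: {count} sites")
```

Recording the count after every step is what produces a table of shrinking site numbers like the one on the slide, and it makes it easy to see which filter removes the most candidates.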
  31. Effect of Filtering Strategy
  32. Validation
     • Sequenom validation
         • Random set of 611 calls from both the filtered set of 5,579 RNA editing sites
         • 19 non A-to-G editing sites (raw calls) -> all confirmed false positives
         • Discrepancy rate of 10.5%
             • Enriched at positions where editing level is <20%
     • T-to-C editing
         • Novel form of RNA-editing?
             • Uncertainties in strand assignment of transcripts
             • Result of calls made in antisense transcripts, mis-annotations
     • Assuming all non A-to-G edits are false
         • False-discovery rate of our call set is 2.9%
  33. Striking Conservation
  34. Editing Levels
  35. Genomic Context
  36. Protein Coding Edits
     • 23 previously known non-synonymous coding edits
     • Extended this by a further 30 sites
         • 24 were confirmed by Sanger sequencing of cDNA
     • Cacna1d gene
         • Encodes the Cav1.3 voltage-gated calcium channel
         • Known to undergo extensive alternative splicing
         • Two novel non-synonymous edits
         • Capillary sequencing validation
         • Observed 3 different transcripts
  37. Cacna1d
  38. Rare C-to-U Edit: Mfn1
  39. Rare C-to-U Edit: Mfn1
  40. Cds2 - UTR (alignment figure: a/g base calls per strain, with a Rat row)
     • RNA-editing appears to revert genomic sequence back to ancestral state
     • Mice homozygous for disruptions in this gene display a lethal phenotype
     • Several known across-species examples
         • RNA-editing maintaining conservation at the protein level despite genomic sequence divergence
  41. Human Follow-up Studies
     • Li et al (2011) Science
     • Ramaswami et al. (2012) Nat Meth
     • Bahn et al (2011) Gen Res
     • Peng et al (2011) Nat Bio
  42. To do
     • First phase of the project was cataloging variation
     • Full de novo assemblies of the strains
         • Generating higher quality sequencing data for the 18 strains
         • Long fragment end sequencing: 3, 6, 10, 40kb fragments
     • De novo assembly
         • Discover novel haplotypes
         • Novel gene structures in the divergent strains
     • Mouse pan-genome
         • Reference bias
         • New mouse reference genome graph
             • Including novel non-reference haplotypes shared amongst subsets of the strains
  43. Acknowledgements and Questions
     • Mouse Genomes Project
         • Sanger Institute: David Adams, Petr Danecek, Kim Wong, Guy Slater, Sendu Bala et al.
         • Wellcome Trust Centre for Human Genetics: Jonathan Flint, Binnaz Yalcin, Richard Mott, Leo Goodstadt et al.
         • EBI: Ewan Birney, David Adams
         • University of Oxford: Chris Ponting, Chris Nellaker, Andres Heger, Grant Belgard
     • RNA-Editing
         • Petr Danecek, David Adams, Chris Nellaker, Jonathan Flint
     • Email: thomas.keane@sanger.ac.uk
  44. WTSI PhD Programme