ENCODE project: brief summary of main findings

A brief summary of the ENCODE project and its main findings. Covers the most important publications for cancer researchers and how to make use of the ENCODE data.

Published in: Education

ENCODE project: brief summary of main findings

  1. ENCODE: Encyclopedia of DNA Elements. Maté Ongenaert
     Outline
     - What and who is ENCODE
     - Key ENCODE topics and most important papers for our research
     - ENCODE data – make use of the encyclopedia…
  2. What and who is ENCODE: main aims, funding and the institutions/labs behind the 200 M $
     Who?
     - International consortium
     - Funded by NHGRI (National Human Genome Research Institute): 200 million dollars
     - Main collaborators (for human data): Broad Institute (ChIP-seq), HudsonAlpha Institute for Biotechnology (methylation), Sanger Institute (RNA-seq), Duke University (DNase), Yale University (Pol II), EBI (data analysis)
     Main aims
     - “Build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active”
  3. What and who is ENCODE: What’s so hot… It has been running for years?
     - Started in 2007: pilot project, 1% of the genome
     - 2007-2012: since then, introduction of new technologies (higher throughput, genome-wide)
     - Many more samples and different tissues (different ‘tiers’ – see later)
     - Better data analysis and integration
  4. What and who is ENCODE: What’s so hot… It has been running for years? Worldwide press attention
  5. What and who is ENCODE: What’s so hot… It has been running for years? Worldwide press attention… and criticisms
     - “Popular” media focus on the “junk DNA aspect”
     - The authors also claim in their press release that > 80% of the genome is ‘biologically active’ (<> may be involved in regulation in one way or another <> junk DNA)
     - ENCODE reveals for the first time a lot of factors of the very complex switching board controlling expression / …
  6. What and who is ENCODE: What’s so hot… It has been running for years? 30 (!) research papers published in three journals at the same time
  7. ENCODE: Encyclopedia of DNA Elements
     Outline
     - What and who is ENCODE
     - Key ENCODE topics and most important papers for our research
     - ENCODE data – make use of the encyclopedia…
  8. Key ENCODE topics: main ENCODE topics and selection of most important papers
     Key topics
     - Transcription factor binding motifs
     - Chromatin patterns at transcription factor binding sites
     - Characterization of intergenic regions and gene definitions
     - RNA and chromatin modification patterns around promoters
     - Epigenetic regulation of RNA processing
     - Non-coding RNA characterisation
     - DNA methylation
     - Enhancer discovery and characterization
     - 3D connections across the genome
     - Characterisation of network topology
     - Machine learning approaches to genomics
     - Impact of functional information on understanding variation
     - Impact of evolutionary selection on functional regions
  9. Key ENCODE topics: Main paper
     - 95% of the genome lies within 8 kilobases (kb) of a DNA–protein interaction
     - Classifying the genome into seven chromatin states indicates an initial set of 399,124 regions with enhancer-like features and 70,292 regions with promoter-like features
     - It is possible to correlate quantitatively RNA sequence production and processing with both chromatin marks and transcription factor binding at promoters, indicating that promoter functionality can explain most of the variation in RNA expression
     - Single nucleotide polymorphisms (SNPs) associated with disease by GWAS are enriched within non-coding functional elements, with a majority residing in or near ENCODE-defined regions that are outside of protein-coding genes. In many cases, the disease phenotypes can be associated with a specific cell type or transcription factor
  10. Key ENCODE topics: Main paper
     Techniques used
     - RNA-seq
     - ChIP-seq
     - DNase-seq
     - DNA methylation arrays and bisulfite seq
     - FAIRE-seq
     Samples
     - Tier 1: three cell lines (K562 – GM12878 – H1 hESC)
     - Tier 2: cell line panel (HeLa-S3 – HepG2 – HUVECs)
     - Tier 3: all other cell types
     - Total: 1640 datasets / 147 different cell types
  11. Key ENCODE topics: Main paper
  12. Key ENCODE topics: Main paper (same key findings as slide 9)
  13. Key ENCODE topics: Expression – chromatin state; Expression – transcription factors
  14. Key ENCODE topics: Expression – transcription factors
  15. Key ENCODE topics: Chromatin state patterns at transcription-factor binding sites
  16. Key ENCODE topics: Co-association between transcription factors (K562)
  17. Key ENCODE topics: Insight in genomic variation – allele specific variation
  18. Key ENCODE topics: Main paper (same key findings as slide 9)
  19. Key ENCODE topics: Overlap SNPs with regulatory elements
  20. Key ENCODE topics: Overlap SNPs with regulatory elements and ‘open’ chromatin
  21. Key ENCODE topics: Other important papers to us
  22. Key ENCODE topics: Accessible chromatin landscape
     - DNase I treatment
     - Combined analysis with TFs and H3K4me3
     - Identification of “accessible” chromatin regions
  23. Key ENCODE topics: Accessible chromatin landscape – location of accessible regions
  24. Key ENCODE topics: Accessible chromatin landscape – association with ChIP-seq and TFs
  25. Key ENCODE topics: Accessible chromatin landscape – novel transcripts
  26. Key ENCODE topics: Other important papers to us
  27. Key ENCODE topics: Landscape of transcription
     - RNA-seq
     - Get a grip on what is transcribed, including novel transcripts and RNAs
  28. Key ENCODE topics: Landscape of transcription – nucleolar fraction vs. whole cell
  29. Key ENCODE topics: Landscape of transcription
  30. Key ENCODE topics: Other important papers to us
  31. Key ENCODE topics: Long-range interaction of promoters
     - 5C mapping (chromatin interaction mapping technology)
     - Long-range interactions of promoter regions
  32. Key ENCODE topics: Long-range interaction of promoters
  33. Key ENCODE topics: Other important papers to us
  34. Key ENCODE topics: Other important papers to us
  35. Key ENCODE topics: Transcriptional regulation
     - ChIP-seq <> expression detection
     - Predict transcriptional regulation
  36. Key ENCODE topics: Transcriptional regulation – predict transcription
  37. Key ENCODE topics: Transcriptional regulation – expression prediction
  38. Key ENCODE topics: Transcriptional regulation – TFs predict location of histone modifications
  39. Key ENCODE topics: Transcriptional regulation – model
  40. Key ENCODE topics: Other important papers to us
  41. Key ENCODE topics: Cell-type specific gene expression from open chromatin regions
  42. Key ENCODE topics: Other important papers to us
  43. Key ENCODE topics: Cell-type specific TF binding
  44. Key ENCODE topics: Other important papers to us
  45. Key ENCODE topics: SNPs in regulatory regions
  46. Key ENCODE topics: Other important papers to us
  47. Key ENCODE topics: TF binding – interactions
  48. Key ENCODE topics: TF binding – cell-type specificity
  49. Key ENCODE topics: Other important papers to us
  50. Key ENCODE topics: Classification of genomic regions
  51. Key ENCODE topics: Classification of genomic regions
  52. Key ENCODE topics: Classification of genomic regions
  53. ENCODE: Encyclopedia of DNA Elements
     Outline
     - What and who is ENCODE
     - Key ENCODE topics and most important papers for our research
     - ENCODE data – make use of the encyclopedia…
  54. ENCODE data: Data availability
     All data is available, from raw data to final processed data
     - For end-level users: tracks in the UCSC browser with desired level of detail (visualize tracks and explore genomic context)
     - For end-level users and bio-IT: UCSC “Table browser” and other UCSC tools (export genomic information, including processed data)
     - For high end-level users and bio-IT: raw data and semi-processed data in GEO and others
  55. ENCODE data: Data availability – tracks in the UCSC browser with desired level of detail
  56. ENCODE data: Data availability – tracks in the UCSC table browser
  57. ENCODE data: Data availability – raw data
  58. ENCODE data: Data availability – raw data
  59. Blokde Van… ETER
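
The slides on overlapping SNPs with regulatory elements (19 and 20) and on exporting processed data from the UCSC Table browser (54) suggest a simple workflow: export regions as a BED file, then check which SNPs fall inside them. The following is only a minimal sketch of that idea, not the method used in the ENCODE papers. The peak names, SNP names and coordinates are invented for illustration, and the helper names (`parse_bed`, `regions_at`, `annotate_snps`) are hypothetical. It assumes standard BED coordinates (0-based start, end not included) and 1-based SNP positions.

```python
from collections import defaultdict

# Hypothetical example data: NOT real ENCODE peaks or real SNPs.
BED_TEXT = """\
track name=example_peaks
chr1\t100\t200\tpeak_a
chr1\t150\t300\tpeak_b
chr2\t500\t600\tpeak_c
"""

# (name, chromosome, 1-based position); invented for illustration.
SNPS = [
    ("snp_1", "chr1", 101),
    ("snp_2", "chr1", 201),
    ("snp_3", "chr1", 301),
    ("snp_4", "chr2", 550),
    ("snp_5", "chr3", 10),
]


def parse_bed(text):
    """Parse BED-formatted text into {chrom: [(start, end, name), ...]}."""
    regions = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else "."
        regions[chrom].append((start, end, name))
    return dict(regions)


def regions_at(regions, chrom, pos_1based):
    """Return names of all regions on chrom that contain the given position."""
    pos0 = pos_1based - 1  # convert to the 0-based coordinates used by BED
    return [
        name
        for start, end, name in regions.get(chrom, [])
        if start <= pos0 < end  # BED intervals are half-open: [start, end)
    ]


def annotate_snps(regions, snps):
    """Map each SNP name to the list of regions it falls in."""
    return {name: regions_at(regions, chrom, pos) for name, chrom, pos in snps}


peaks = parse_bed(BED_TEXT)
hits = annotate_snps(peaks, SNPS)
for snp_name, peak_names in hits.items():
    print(snp_name, peak_names if peak_names else "no overlap")
```

The linear scan per SNP keeps the sketch short. For genome-wide peak sets, a sorted index or a dedicated interval tool would be the more practical choice.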
