J Lichtenberg - Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments

Presentation by J Lichtenberg at BOSC2012 - Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments

Presentation Transcript

  • Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments
    http://code.google.com/p/nextgen-signatures
    GNU General Public License, version 3.0 (GPLv3)
    Jens Lichtenberg
    Hematopoiesis Section, Genetics and Molecular Biology Branch
    National Human Genome Research Institute, National Institutes of Health
  • Motivation
    ● Large variety of omics approaches that produce sequencing data
    ● Common threads in the evaluation process
    ● Few approaches exist that attempt the large-scale analysis of omics data
    ● Direct correlation of multiple omics data into actual biological insights
    [Figure labels: Methylation Seq, RNA Seq, ChIP Seq, Protein Seq, Histone Seq; Comprehensive Analysis; Systems Biology Insights]
  • Requirements
    ● General
      ○ Quantification of sequencing data requires dynamic pipeline allowing for frequent adjustments
      ○ Close interaction between bench and analysis personnel
    ● Specific
      ○ Quantitative analysis
      ○ Functional analysis
      ○ Regulatory analysis
      ○ Visualizations
  • General Analysis Approach
  • Hematopoietic Stem Cell Differentiation in Mouse
    Microarray Data curated in BloodExpress
    RNA Seq Data
    Methylation Seq Data
    ChIP Seq Data (EKLF)
    Histone Seq Data
  • Methylation Seq
    Peak Calling, Expression Correlation, Motif Discovery, Occupancy Validation
    Transcription Factors  Occupied Sites  Number Overlapping  Exp. Overlapping  Z-Score  P-Value
    ERG                    36166           966                 1983              -20.80   2.16e-96
    FLI1                   19601           348                 1075              -21.32   3.70e-101
    GATA2                  9234            278                 507               -9.87    2.81e-23
    GFI1B                  8853            235                 486               -11.04   1.23e-28
    ...
    RUNX1                  5269            97                  290               -11.11   5.61e-29
    SCL                    7096            146                 389               -12.26   7.42e-35
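The slide does not state how the Z-scores or P-values were obtained. As a hedged observation, the listed P-values are numerically consistent with the lower tail of a standard normal distribution evaluated at the listed Z-score. The sketch below checks that reading; the function name `lower_tail_p` is an illustrative assumption, not part of the nextgen-signatures code.

```python
import math


def lower_tail_p(z):
    """Lower-tail probability P(Z <= z) of the standard normal distribution."""
    # erfc stays accurate deep in the tail, where 0.5 * (1 + erf(...))
    # would round to zero in double precision.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# (factor, z-score, p-value) exactly as listed on the slide
slide_rows = [
    ("ERG", -20.80, 2.16e-96),
    ("FLI1", -21.32, 3.70e-101),
    ("GATA2", -9.87, 2.81e-23),
    ("GFI1B", -11.04, 1.23e-28),
    ("RUNX1", -11.11, 5.61e-29),
    ("SCL", -12.26, 7.42e-35),
]

for factor, z, p_slide in slide_rows:
    p_calc = lower_tail_p(z)
    print(f"{factor:6s} z={z:7.2f}  computed p={p_calc:.2e}  slide p={p_slide:.2e}")
```

Because the Z-scores are rounded to two decimals, small differences between computed and listed values are expected. The negative Z-scores correspond to fewer overlapping sites than expected in every listed row.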
  • ChIP Seq
    Peak Calling, Methylation Correlation, Functional Analysis, Motif Discovery
                 ERY (Meth.)  MEP (Meth.)
    Total        1187         587
    Dist. Prom.  210          102
    Prox. Prom.  29           21
    Downstream   345          207
    RefSeq       983          513
    ● EKLF control in MEP can be found in the first intron (Siatecka and Bieker, Blood, 2011)
    ● During erythropoiesis EKLF is restricted to hematopoietic organs (Siatecka and Bieker, Blood, 2011)
    ● Down-regulation of EKLF expression in MEP cells leads to megakaryopoiesis (Siatecka and Bieker, Blood, 2011)
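Correlating ChIP Seq peaks with methylated regions presumably comes down to counting genomic interval overlaps. The following is a minimal sketch under that assumption, not the presenter's implementation. The function names, the half-open `(chrom, start, end)` convention, and the toy coordinates are all hypothetical.

```python
from bisect import bisect_left


def merge_regions(regions):
    """Merge overlapping or touching (start, end) pairs into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def count_overlapping(peaks, regions):
    """Count peaks that overlap at least one region on the same chromosome.

    Both inputs are lists of half-open intervals (chrom, start, end).
    """
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))

    # Per chromosome: sorted start coordinates and matching end coordinates.
    index = {}
    for chrom, items in by_chrom.items():
        merged = merge_regions(items)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])

    hits = 0
    for chrom, start, end in peaks:
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        # Regions at positions < i begin before the peak ends. Merged regions
        # are disjoint, so only the last of them can reach past the peak start.
        i = bisect_left(starts, end)
        if i > 0 and ends[i - 1] > start:
            hits += 1
    return hits


# Toy coordinates, invented for illustration only
toy_peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 60)]
toy_methylated = [("chr1", 150, 160), ("chr1", 400, 500), ("chr2", 0, 55), ("chr2", 10, 20)]
print(count_overlapping(toy_peaks, toy_methylated))
```

Merging the regions first keeps each lookup to a single binary search. Applying the same count per annotation class (promoter, downstream, RefSeq) would need gene coordinates and window sizes that the slide does not specify.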
  • Histone Seq
    Peak Calling, EKLF/Methylation Correlation, Functional Analysis, Motif Discovery
    MEME (OOPS), TomTom Lookup:
    ● THI2, ZincFinger
    ● NKx2-5, Homeobox
    ● NKx2-6, Homeobox
    MEME (ZOOPS), TomTom Lookup:
    ● THI2, ZincFinger
    ● NKx2-3, Homeobox
    ● NKx2-5, Homeobox
    ● NKx2-6, Homeobox
    ● NKx3-1, Homeobox
  • RNA Seq
    Peak Calling, Functional Analysis, Motif Discovery
    [Venn diagram of ERY, MEP, and MEG; region counts as extracted: 241, 47, 1308, 3338, 216, 966, 2408]
    Pathway Name (column groups: ERY, MEP, MEG; MEG, MEP; ERY, MEG; ERY, MEP)
    ERK/MAPK Sig.: 1.83E-09, 4.47E-16, 5.01E-10
    IGF-1 Sig.: 1.04E-15, 1.25E-10
    MolMech. Cancer: 3.72E-10, 1.59E-22, 1.13E-10, 3.72E-10
    ...
    PI3K/AKT Sig.: 3.22E-20, 2.84E-24, 6.24E-18, 1.33E-15
    mRNA
    Differentiation  Increase  Decrease
    MEP -> MEG       1238      7323
    MEP -> ERY       1198      9307
  • Comprehensive Approach: Current Status
    ● Perl Framework
      ○ Commonly used applications and repositories
    ● Next-Generation Sequencing
      ○ Read Mapping
        ■ UCSC Genomic Data
      ○ Peak Calling/Partitioning
        ■ UCSC Genomic Data
      ○ Transcript Quantification
        ■ UCSC/Ensembl Genomic Data
    ● Functional Genomics
      ○ Expression Correlation
        ■ BloodExpress Database
      ○ Pathway Analysis
        ■ KEGG/IPA
      ○ Ontology Analysis
        ■ GO/IPA sets
    ● Regulatory Genomics
      ○ Enumerative motif discovery
        ■ Transfac/Jaspar Database
      ○ Occupancy validation
        ■ Literature specific data
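Enumerative motif discovery generally means exhaustively counting every word of a fixed length and ranking the words by enrichment. The sketch below illustrates that idea in Python rather than the framework's Perl. The function names, the pseudocount scoring, and the toy sequences are assumptions made for illustration and are not taken from the nextgen-signatures code.

```python
from collections import Counter

DNA = set("ACGT")


def enumerate_words(sequences, k):
    """For every word of length k, count how many sequences contain it."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set, so each word is counted at most once per sequence.
        words = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        # Skip words containing ambiguity codes such as N.
        counts.update(w for w in words if set(w) <= DNA)
    return counts


def rank_words(foreground, background, k, pseudocount=1.0):
    """Rank words by the ratio of foreground to background sequence coverage."""
    fg = enumerate_words(foreground, k)
    bg = enumerate_words(background, k)
    scores = {}
    for word, count in fg.items():
        fg_frac = (count + pseudocount) / (len(foreground) + pseudocount)
        bg_frac = (bg.get(word, 0) + pseudocount) / (len(background) + pseudocount)
        scores[word] = fg_frac / bg_frac
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy sequences, invented for illustration only
toy_foreground = ["TTGATAAGCC", "ACGATAAGTT", "GGGATAAGAA"]
toy_background = ["ACGTACGTAC", "TTTTCCCCGG", "GGCCAATTGG"]

for word, score in rank_words(toy_foreground, toy_background, k=6)[:3]:
    print(word, round(score, 2))
```

Top-ranked words could then be compared against a motif collection such as Transfac or Jaspar, which is the lookup role those databases have on the slide.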
  • Future Issues
    Data
    ● Complete case study for Protein Seq
    Implementation
    ● Complete implementation of all analysis facets
    ● Transition Perl framework to C++ architecture
    ● Parallelize software architecture for higher performance/throughput
    Support
    ● Update web-interface and documentation to allow unassisted data analysis
  • Conclusions and Availability
    ● A comprehensive approach is possible
    ● Meaningful results can be extracted using the approach
    ● Regulatory genomics can be used as a suitable post-processing analysis
    ● Comprehensive hematopoiesis study is feasible
    ● http://code.google.com/p/nextgen-signatures (GNU General Public License, version 3.0)
  • Acknowledgements
    NHGRI - GMBB - Hematopoiesis Section
    David Bodine and Amber Hogart
    NHGRI Intramural Training Program