Discovery of motif-based
regulatory signatures in NextGen
Sequencing Experiments
http://code.google.com/p/nextgen-signatures
GNU General Public License, version 3.0 (GPLv3)



Jens Lichtenberg
Hematopoiesis Section, Genetics and Molecular Biology Branch
National Human Genome Research Institute, National Institutes of Health
Motivation

● Large variety of omics approaches that produce
    sequencing data
● Common threads in the evaluation process
● Few approaches exist that attempt the large-scale
    analysis of omics data
● Direct correlation of multiple omics data into
    actual biological insights

[Diagram: Methylation Seq, RNA Seq, ChIP Seq, Protein Seq, and Histone Seq
 arranged around a central "Comprehensive Analysis" node, with
 "Systems Biology Insights" below it]
Requirements

● General
    ○ Quantification of sequencing data requires dynamic
       pipeline allowing for frequent adjustments
    ○ Close interaction between bench and analysis
       personnel
● Specific
    ○ Quantitative analysis
    ○ Functional analysis
    ○ Regulatory analysis
    ○ Visualizations
General Analysis Approach
Hematopoietic Stem Cell
Differentiation in Mouse
 Microarray Data curated in BloodExpress
 RNA Seq Data
 Methylation Seq Data
 ChIP Seq Data (EKLF)
 Histone Seq Data
Methylation Seq
[Panels: Peak Calling; Expression Correlation; Motif Discovery; Occupancy Validation]

Occupancy Validation

Transcription   Occupied   Number        Exp.          Z-Score   P-Value
Factors         Sites      Overlapping   Overlapping
ERG             36166      966           1983          -20.80    2.16e-96
FLI1            19601      348           1075          -21.32    3.70e-101
GATA2           9234       278           507           -9.87     2.81e-23
GFI1B           8853       235           486           -11.04    1.23e-28
...
RUNX1           5269       97            290           -11.11    5.61e-29
SCL             7096       146           389           -12.26    7.42e-35
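The Z-scores and p-values in the occupancy table can be related with a short check. The sketch below assumes that each p-value is the one-sided lower-tail probability of its Z-score under a standard normal distribution; this is an inference from the listed numbers, not something the slides state. The function name `lower_tail_p` is illustrative.

```python
import math

def lower_tail_p(z):
    """Return P(Z <= z) for a standard normal variable Z.

    Assumption: the slide's p-values are one-sided normal tail probabilities.
    """
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# (observed overlaps, expected overlaps, Z-score) copied from the occupancy table
rows = {
    "ERG": (966, 1983, -20.80),
    "FLI1": (348, 1075, -21.32),
    "GATA2": (278, 507, -9.87),
}

for tf, (observed, expected, z) in rows.items():
    ratio = observed / expected  # below 1 means fewer overlaps than expected
    print(f"{tf}: observed/expected = {ratio:.2f}, p = {lower_tail_p(z):.2e}")
```

Negative Z-scores together with observed/expected ratios below 1 indicate fewer overlaps than expected, that is, depletion rather than enrichment.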
ChIP Seq
[Panels: Peak Calling; Methylation Correlation; Functional Analysis; Motif Discovery]

Methylation Correlation

               ERY (Meth.)   MEP (Meth.)
Total          1187          587
Dist. Prom.    210           102
Prox. Prom.    29            21
Downstream     345           207
RefSeq         983           513

●   EKLF control in MEP can be found in the first intron (Siatecka and Bieker, Blood, 2011)
●   During erythropoiesis, EKLF is restricted to hematopoietic organs (Siatecka and Bieker,
    Blood, 2011)
●   Down-regulation of EKLF expression in MEP cells leads to megakaryopoiesis (Siatecka
    and Bieker, Blood, 2011)
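Correlating ChIP Seq peaks with methylated regions comes down to counting interval overlaps. The following is a minimal sketch of that step, assuming both data sets are lists of half-open `(chrom, start, end)` tuples. The function name and the toy coordinates are hypothetical and do not come from the study.

```python
from collections import defaultdict

def count_overlapping(peaks, regions):
    """Count peaks that overlap at least one region.

    Intervals are half-open (chrom, start, end) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end for r_start, r_end in by_chrom[chrom]):
            hits += 1
    return hits

# Toy coordinates, invented for illustration only.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
methylated = [("chr1", 150, 300), ("chr2", 80, 120)]
print(count_overlapping(peaks, methylated))
```

The linear scan per peak keeps the sketch short; for genome-scale peak sets, sorted intervals or an interval tree would likely be the better choice.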
Histone Seq
[Panels: Peak Calling; EKLF/Methylation Correlation; Functional Analysis; Motif Discovery]

Motif Discovery

MEME (OOPS), TomTom Lookup:
    ●   THI2, ZincFinger
    ●   NKx2-5, Homeobox
    ●   NKx2-6, Homeobox

MEME (ZOOPS), TomTom Lookup:
    ●   THI2, ZincFinger
    ●   NKx2-3, Homeobox
    ●   NKx2-5, Homeobox
    ●   NKx2-6, Homeobox
    ●   NKx3-1, Homeobox
RNA Seq
[Panels: Peak Calling; Functional Analysis; mRNA Differentiation; Motif Discovery]

Peak Calling
[Venn diagram of MEG (top), ERY (bottom left), and MEP (bottom right);
 region counts as extracted, top to bottom: 241; 47, 1308; 3338; 216, 966, 2408]

Functional Analysis

Pathway Name       ERY, MEP, MEG   MEG, MEP   ERY, MEG   ERY, MEP
ERK/MAPK Sig.      1.83E-09        4.47E-16   5.01E-10
IGF-1 Sig.         1.04E-15                   1.25E-10
MolMech. Cancer    3.72E-10        1.59E-22   1.13E-10   3.72E-10
...
PI3K/AKT Sig.      3.22E-20        2.84E-24   6.24E-18   1.33E-15

mRNA Differentiation

              Increase   Decrease
MEP -> MEG    1238       7323
MEP -> ERY    1198       9307
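The region counts of a three-way Venn diagram such as the ERY/MEP/MEG comparison can be derived from three sets of identifiers with plain set operations. This is a minimal sketch under that assumption; the function name `venn_regions` and the gene identifiers are hypothetical, and no assignment of the slide's counts to specific regions is implied.

```python
def venn_regions(ery, mep, meg):
    """Return the sizes of the seven regions of a three-set Venn diagram."""
    return {
        "ERY only": len(ery - mep - meg),
        "MEP only": len(mep - ery - meg),
        "MEG only": len(meg - ery - mep),
        "ERY & MEP only": len((ery & mep) - meg),
        "ERY & MEG only": len((ery & meg) - mep),
        "MEP & MEG only": len((mep & meg) - ery),
        "ERY & MEP & MEG": len(ery & mep & meg),
    }

# Hypothetical gene identifiers, for illustration only.
ery = {"g1", "g2", "g3", "g4"}
mep = {"g2", "g3", "g5"}
meg = {"g3", "g4", "g6"}
counts = venn_regions(ery, mep, meg)
for region, size in counts.items():
    print(f"{region}: {size}")
```

The seven region sizes always sum to the size of the union of the three sets, which makes a convenient sanity check.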
Comprehensive Approach

Current Status
● Perl Framework
  ○ Commonly used applications and repositories
    ●   Next-Generation Sequencing
         ○ Read Mapping
              ■ UCSC Genomic Data
         ○ Peak Calling/Partitioning
              ■ UCSC Genomic Data
         ○ Transcript Quantification
              ■ UCSC/Ensembl Genomic Data

    ●   Functional Genomics
         ○ Expression Correlation
              ■ BloodExpress Database
         ○ Pathway Analysis
              ■ KEGG/IPA
         ○ Ontology Analysis
              ■ GO/IPA

    ●   Regulatory Genomics
         ○ Enumerative motif discovery
              ■ Transfac/Jaspar Database
         ○ Occupancy validation
              ■ Literature specific data sets
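Enumerative motif discovery, in its simplest form, lists every word of a fixed length and counts how many input sequences contain it. The sketch below illustrates only that general idea; it is not the framework's Perl implementation, it does no background-model scoring, and the function name `kmer_support` and the toy sequences are hypothetical.

```python
from collections import Counter

def kmer_support(sequences, k):
    """Count, for every k-mer, how many sequences contain it at least once."""
    support = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Collect each distinct k-mer once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(seen)
    return support

# Toy sequences, invented for illustration only.
seqs = ["TTGATAAG", "CCGATAAC", "AGATAGGG"]
top = kmer_support(seqs, 4).most_common(1)
print(top)
```

Words found this way would then typically be compared against a motif database such as Transfac or Jaspar, as the slide lists.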
Future Issues

Data
● Complete case study for Protein Seq
Implementation
● Complete implementation of all analysis facets
● Transition Perl framework to C++ architecture
● Parallelize software architecture for higher
   performance/throughput
Support
● Update web-interface and documentation to allow
   unassisted data analysis
Conclusions and Availability
● A comprehensive approach is possible

● Meaningful results can be extracted using the approach

● Regulatory genomics can be used as a suitable post-processing
   analysis

● Comprehensive hematopoiesis study is feasible

● http://code.google.com/p/nextgen-signatures (GNU
  General Public License, version 3.0)
Acknowledgements

  NHGRI - GMBB - Hematopoiesis Section
     David Bodine and Amber Hogart




      NHGRI Intramural Training Program
