Integrative causality analysis of genetic, epigenetic, and
transcriptomic data in a large cohort
Rosemary McCloskey and Sara Mostafavi
rmcclosk.math@gmail.com
http://slideshare.net/rmcclosk/omics-integration
March 27, 2015
R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 1 / 12
Motivation
genetic, epigenetic, and transcriptomic data provide snapshots of cellular processes
usually one data type is studied at a time, in relation to a phenotype or disease
how do these data fit together?
[Figure: diagram of genotype (GATTACA), gene expression, methylation, and histone acetylation, joined by a question mark]
The data
large cohort designed to study cognitive decline and Alzheimer’s disease
genotype, gene expression, DNA methylation, and histone acetylation (ChIP-seq) data
392 individuals with all four data types were used for this analysis
[Figure: four-way Venn diagram of individuals with expression, methylation, acetylation, and genotype data; 392 individuals have all four]
Quantitative trait loci (QTLs)
a QTL is a genetic locus correlated with a phenotype
we are interested in QTLs for gene expression (eQTLs), histone acetylation (aceQTLs), and methylation (meQTLs)
QTLs provide a tool to study interaction between other molecular phenotypes
[Figure: three panels plotting expression, acetylation, and methylation against genotype (0, 1, 2)]
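As an illustration of "a genetic locus correlated with a phenotype", the sketch below computes Spearman's ρ between genotype dosages (0, 1, 2) and a molecular phenotype. It is a minimal standard-library sketch, not the authors' pipeline; the function names `ranks` and `spearman_rho` are hypothetical, and it assumes neither input is constant.

```python
def ranks(values):
    """Average ranks (1-based): tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data.

    Assumes x and y have equal length and that neither is constant.
    """
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Made-up example: six individuals, dosage at one SNP and one gene's expression.
genotype = [0, 1, 2, 0, 1, 2]
expression = [1.0, 2.0, 3.0, 1.5, 2.5, 3.5]
rho = spearman_rho(genotype, expression)
```

Because genotype takes only three values, ties are the norm, which is why the ranks are averaged within each genotype class.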
Identifying QTLs
↓ SNPs in 200 kb window; Spearman’s ρ
↓ Holm-Bonferroni correction; best SNP per feature
↓ FDR correction
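The correction steps in this pipeline can be sketched as follows, starting from raw p-values that are assumed to have been computed already (for example from Spearman's test on SNPs in each feature's window). Two assumptions are made here: the FDR step uses the Benjamini-Hochberg procedure as a simple stand-in, since the slides do not name the method (the Software slide lists the qvalue package), and the function names and SNP identifiers are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        value = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_qtls(pvalues_by_feature, fdr=0.05):
    """Map feature -> {snp: raw p-value} to {feature: lead SNP} at the given FDR."""
    best = {}
    for feature, snp_pvalues in pvalues_by_feature.items():
        snps = list(snp_pvalues)
        # Correct for the number of SNPs tested against this feature.
        adjusted = holm_adjust([snp_pvalues[s] for s in snps])
        p, snp = min(zip(adjusted, snps))  # best SNP per feature
        best[feature] = (snp, p)
    features = list(best)
    # Correct across features for the number of features tested.
    qvalues = benjamini_hochberg([best[f][1] for f in features])
    return {f: best[f][0] for f, q in zip(features, qvalues) if q <= fdr}


# Made-up p-values for two features.
example = {
    "gene1": {"rs1": 0.0001, "rs2": 0.2},
    "gene2": {"rs3": 0.4, "rs4": 0.6},
}
qtls = call_qtls(example)
```

The two-level design mirrors the slide: Holm-Bonferroni handles the SNPs tested per feature, and the FDR step handles the many features.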
Removing Principal Components
technical, environmental, and biological covariates can swamp out QTL effects
correct by removing principal components
number of peaks with a QTL plateaus at 10 PCs, while genes and CpGs continue to increase
for this analysis, removed 10 PCs from all data
[Figure: three panels (genes, peaks, CpGs) plotted against PCs removed (0 to 20); y-axis ranges 4000 to 6000 for genes, 3000 to 4000 for peaks, 75000 to 95000 for CpGs]
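One way to remove principal components is to centre the data matrix and project out its leading axes one at a time. The slides do not say how this was implemented, so the sketch below is an assumption: rows are individuals, columns are features, and the leading axis is found by power iteration so that only the standard library is needed. The function names are hypothetical.

```python
import math


def center_columns(matrix):
    """Subtract each column's mean (rows are individuals, columns are features)."""
    n = len(matrix)
    means = [sum(column) / n for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def leading_axis(matrix, iterations=200):
    """Leading principal axis of a column-centred matrix, by power iteration."""
    p = len(matrix[0])
    # Asymmetric start vector; power iteration only fails if the start is
    # exactly orthogonal to the leading axis.
    start = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(s * s for s in start))
    axis = [s / norm for s in start]
    for _ in range(iterations):
        scores = [sum(v * a for v, a in zip(row, axis)) for row in matrix]
        updated = [sum(s * row[j] for s, row in zip(scores, matrix)) for j in range(p)]
        norm = math.sqrt(sum(u * u for u in updated))
        if norm == 0.0:  # no variance left along this direction
            break
        axis = [u / norm for u in updated]
    return axis


def remove_pcs(matrix, k):
    """Return the centred matrix with its top k principal components projected out."""
    residual = center_columns(matrix)
    for _ in range(k):
        axis = leading_axis(residual)
        deflated = []
        for row in residual:
            score = sum(v * a for v, a in zip(row, axis))
            deflated.append([v - score * a for v, a in zip(row, axis)])
        residual = deflated
    return residual


def total_ss(matrix):
    """Total sum of squares, used here to see how much variation remains."""
    return sum(v * v for row in matrix for v in row)


# Made-up data in which both features follow one shared hidden factor.
data = [[2.0, 1.9], [-2.0, -2.1], [1.0, 1.2], [-1.0, -1.0]]
before = total_ss(remove_pcs(data, 0))
after = total_ss(remove_pcs(data, 1))
```

In this toy matrix almost all variation lies along one shared axis, so removing a single component leaves very little behind. For matrices of realistic size a numerical library's SVD would be the usual choice.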
Identifying multi-QTLs
By intersecting QTL sets, found 240 gene, CpG, and peak triples which shared the same QTL
[Figure: three-way Venn diagram of eQTL, meQTL, and aceQTL sets; region counts in extraction order: 2984, 1799, 50981, 127, 240, 1604, 2129, with 240 in the three-way intersection]
Also assessed QTL overlap using π0 approach
[Figure: 3 × 3 heatmap of pairwise QTL overlap among eQTLs, aceQTLs, and meQTLs; values in extraction order: 100 %, 46 %, 14 %; 31 %, 100 %, 11 %; 83 %, 84 %, 100 %]
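Both overlap measures can be sketched in a few lines. The intersection groups features by their lead SNP and keeps SNPs present in all three sets. The π0 function is a single-threshold Storey-style estimate of the null proportion, so 1 − π0 estimates the fraction of true signals among a set of p-values. How exactly the authors applied π0 is not stated, so the function names, the threshold `lam`, and the example identifiers are assumptions.

```python
from collections import defaultdict
from itertools import product


def invert(qtls):
    """Turn {feature: lead SNP} into {SNP: [features]}."""
    by_snp = defaultdict(list)
    for feature, snp in qtls.items():
        by_snp[snp].append(feature)
    return by_snp


def multi_qtls(eqtls, meqtls, aceqtls):
    """List (SNP, gene, CpG, peak) tuples whose three features share a lead SNP."""
    genes, cpgs, peaks = invert(eqtls), invert(meqtls), invert(aceqtls)
    shared = set(genes) & set(cpgs) & set(peaks)
    return [
        (snp, gene, cpg, peak)
        for snp in sorted(shared)
        for gene, cpg, peak in product(
            sorted(genes[snp]), sorted(cpgs[snp]), sorted(peaks[snp])
        )
    ]


def pi0_estimate(pvalues, lam=0.5):
    """Estimated proportion of null p-values: share above lam, rescaled by 1 - lam."""
    m = len(pvalues)
    return min(1.0, sum(p > lam for p in pvalues) / (m * (1.0 - lam)))


# Made-up QTL calls: feature -> lead SNP.
triples = multi_qtls(
    {"g1": "rs1", "g2": "rs2"},
    {"c1": "rs1", "c2": "rs1", "c3": "rs9"},
    {"p1": "rs1", "p2": "rs2"},
)

# Made-up p-values of one data type's QTL SNPs when tested in another data type.
pi1 = 1.0 - pi0_estimate([0.001, 0.01, 0.02, 0.03, 0.2, 0.6, 0.7, 0.9])
```

The exact intersection is strict, because two features only count as sharing a QTL when their lead SNPs are identical. The π0 estimate is softer, since it uses the whole p-value distribution rather than a significance cut-off.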
Bayesian networks
Bayesian networks are directed graphical models, where the directed edges represent causal relationships
We use conditional Gaussian networks
Score = likelihood of data given network
[Figure: two-node network temperature → precipitation, with observed values temperature = 0.7 and precipitation = 0.5]
temp ∼ N(0, 1), precip | temp ∼ N(temp, 1)
Score = Pr(N(0, 1) = 0.7) × Pr(N(0.7, 1) = 0.5)
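The score on this slide can be worked through numerically. The sketch below reads each Pr(N(μ, 1) = x) as a normal density evaluated at x and assumes, as the final expression implies, that precipitation given temperature has mean equal to the observed temperature. The network with no edge is scored alongside it for comparison; the function names are hypothetical.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def score_with_edge(temp, precip):
    """Likelihood under temperature -> precipitation: precip | temp ~ N(temp, 1)."""
    return normal_pdf(temp, 0.0, 1.0) * normal_pdf(precip, temp, 1.0)


def score_without_edge(temp, precip):
    """Likelihood when the two variables are independent N(0, 1)."""
    return normal_pdf(temp, 0.0, 1.0) * normal_pdf(precip, 0.0, 1.0)


# The single observation from the slide.
with_edge = score_with_edge(0.7, 0.5)
without_edge = score_without_edge(0.7, 0.5)
```

For this one observation the network with the edge scores higher, because 0.5 lies closer to 0.7 than to 0. With a full data set the score is the product of such terms over all individuals, usually computed as a sum of logarithms.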
Networks for QTLs
Used the deal and CGBayesNets packages to construct one Bayesian network for each multi-QTL by exhaustive search
With deal, edges into genotype were blacklisted
Most common network structure was independence
Accounted for 42% of deal networks, 29% of CGBayesNets networks
[Figure: network over genotype, expression, acetylation, and methylation]
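With only four variables, exhaustive search is cheap: every set of directed edges can be enumerated and the acyclic ones kept. The sketch below shows that enumeration with edges into genotype blacklisted. It does not use deal or CGBayesNets; the function names are hypothetical, and the scoring function is left as a parameter because the packages' scores are not reproduced here.

```python
from itertools import combinations, permutations

NODES = ("genotype", "expression", "acetylation", "methylation")


def is_acyclic(nodes, edges):
    """Check for cycles by repeatedly stripping nodes with no incoming edge."""
    remaining = set(nodes)
    edges = set(edges)
    while remaining:
        sources = {n for n in remaining if not any(child == n for _, child in edges)}
        if not sources:  # every remaining node has a parent, so there is a cycle
            return False
        remaining -= sources
        edges = {(p, c) for p, c in edges if p not in sources}
    return True


def candidate_networks(nodes=NODES, blacklist_into=("genotype",)):
    """Yield every DAG on the nodes that has no edge into a blacklisted node."""
    allowed = [(p, c) for p, c in permutations(nodes, 2) if c not in blacklist_into]
    for size in range(len(allowed) + 1):
        for edges in combinations(allowed, size):
            if is_acyclic(nodes, edges):
                yield frozenset(edges)


def best_network(score, **kwargs):
    """Return the candidate network that maximises a user-supplied score."""
    return max(candidate_networks(**kwargs), key=score)


n_blacklisted = sum(1 for _ in candidate_networks())
n_unrestricted = sum(1 for _ in candidate_networks(blacklist_into=()))
```

Blacklisting edges into genotype shrinks the search space from 543 candidate networks to 200, and it encodes the biological assumption that molecular phenotypes do not change an individual's genotype.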
Future Work
Expand the number of multi-QTLs
More than just the best SNP per feature
Identify overlapping QTLs intelligently
More rigorous criterion for number of PCs to remove
Try other packages for network learning (HyPhy)
Are QTLs enriched in SNPs identified in GWAS studies?
Correlations with phenotype (cognitive decline etc.)
Thank you!
Harvard / Broad
Philip L. De Jager
Lori Chibnik
Jishu Xu
Charles White
Cristin McCabe
Towfique Raj
Rush
David A Bennett
Chris Gaiteri
Lei Yu
Bioinformatics Training Program
All the students
Sharon Ruschkowski
Software
QTL analysis
Matrix eQTL
qvalue
Bayesian networks
deal
CGBayesNets
Slides
beamer
TikZ
tikzDevice
Plots
pheatmap
ggplot2
VennDiagram
Colour Scheme
solarized
Omics Integration

  • 1. Integrative causality analysis of genetic, epigenetic, and transcriptomic data in a large cohort Rosemary McCloskey and Sara Mostafavi rmcclosk.math@gmail.com http://slideshare.net/rmcclosk/omics-integration March 27, 2015 R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 1 / 12
  • 2. Motivation genetic, epigenetic, and transcriptomic data provide snapshots of cellular processes GATTACA gene expression methylation histone acetylation genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 2 / 12
  • 3. Motivation genetic, epigenetic, and transcriptomic data provide snapshots of cellular processes usually one data type is studied at a time, in relation to a phenotype or disease GATTACA gene expression methylation histone acetylation genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 2 / 12
  • 4. Motivation genetic, epigenetic, and transcriptomic data provide snapshots of cellular processes usually one data type is studied at a time, in relation to a phenotype or disease GATTACA ? gene expression methylation histone acetylation genotype how do these data fit together? R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 2 / 12
  • 5. The data large cohort designed to study cognitive decline and Alzheimer’s disease 2 19 1080 0 3 392 152 20 0 1 40 61 47 17 11 expression methylation acetylation genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 3 / 12
  • 6. The data large cohort designed to study cognitive decline and Alzheimer’s disease genotype, gene expression, DNA methylation, and histone acetylation (CHiP-seq) data 2 19 1080 0 3 392 152 20 0 1 40 61 47 17 11 expression methylation acetylation genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 3 / 12
  • 7. The data large cohort designed to study cognitive decline and Alzheimer’s disease genotype, gene expression, DNA methylation, and histone acetylation (CHiP-seq) data 392 individuals with all four data types were used for this analysis 2 19 1080 0 3 392 152 20 0 1 40 61 47 17 11 expression methylation acetylation genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 3 / 12
  • 8. Quantitative trait loci (QTLs) a QTL is a genetic locus correlated with a phenotype -2 -1 0 1 2 3 -2 -1 0 1 2 -1 0 1 expressionacetylationmethylation 0 1 2 genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 4 / 12
  • 9. Quantitative trait loci (QTLs) a QTL is a genetic locus correlated with a phenotype we are interested in QTLs for gene expression (eQTLs), histone acetylation (aceQTLs), and methylation (meQTLs) -2 -1 0 1 2 3 -2 -1 0 1 2 -1 0 1 expressionacetylationmethylation 0 1 2 genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 4 / 12
  • 10. Quantitative trait loci (QTLs) a QTL is a genetic locus correlated with a phenotype we are interested in QTLs for gene expression (eQTLs), histone acetylation (aceQTLs), and methylation (meQTLs) QTLs provide a tool to study interaction between other molecular phenotypes -2 -1 0 1 2 3 -2 -1 0 1 2 -1 0 1 expressionacetylationmethylation 0 1 2 genotype R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 4 / 12
  • 11. Identifying QTLs R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 5 / 12
  • 12. Identifying QTLs ↓ SNPs in 200 kb window Spearman’s ρ R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 5 / 12
  • 13. Identifying QTLs ↓ SNPs in 200 kb window Spearman’s ρ ↓ Holm-Bonferroni correction best SNP per feature R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 5 / 12
  • 14. Identifying QTLs ↓ SNPs in 200 kb window Spearman’s ρ ↓ Holm-Bonferroni correction best SNP per feature ↓ FDR correction R. McCloskey & S. Mostafavi () Omics data integration March 27, 2015 5 / 12
  • 18. Removing Principal Components: Technical, environmental, and biological covariates can swamp out QTL effects; we correct for them by removing principal components (PCs). The number of peaks with a QTL plateaus at 10 PCs, while the numbers of genes and CpGs continue to increase. For this analysis, 10 PCs were removed from all data. [Figure: number of genes, peaks, and CpGs with a QTL versus number of PCs removed (0 to 20).]
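One common way to remove principal components is to subtract the top-k terms of a singular value decomposition from the centered data matrix. The sketch below assumes a samples × features NumPy array and a hypothetical function name; the slides do not say how the removal was actually implemented.

```python
import numpy as np


def remove_top_pcs(data, n_pcs):
    """Return the centered data with its top n_pcs components projected out.

    data: array of shape (n_samples, n_features).
    """
    # Center each feature so the SVD directions are principal components.
    centered = data - data.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Rank-n_pcs reconstruction from the largest singular values.
    top = (u[:, :n_pcs] * s[:n_pcs]) @ vt[:n_pcs, :]
    return centered - top
```

Calling `remove_top_pcs(expression, 10)` on a hypothetical `expression` matrix would mirror the choice of 10 PCs described above.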
  • 20. Identifying multi-QTLs: By intersecting QTL sets, we found 240 gene, CpG, and peak triples that shared the same QTL. [Figure: Venn diagram of eQTL, meQTL, and aceQTL sets; region counts as extracted: 2984, 1799, 50981, 127, 240, 1604, 2129.] We also assessed QTL overlap using the π0 approach. [Figure: pairwise overlap matrix for eQTLs, aceQTLs, and meQTLs; values as extracted, in reading order: 100%, 46%, 14% / 31%, 100%, 11% / 83%, 84%, 100%.]
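The set intersection can be illustrated with a small sketch. It assumes each QTL table is a dictionary mapping a feature (gene, peak, or CpG) to its best SNP; the function name and all identifiers are hypothetical, and the authors' actual data structures are not shown in the slides.

```python
from collections import defaultdict
from itertools import product


def multi_qtl_triples(eqtl, aceqtl, meqtl):
    """List (snp, gene, peak, cpg) tuples whose features share one SNP.

    Each argument maps a feature ID to the ID of its best SNP.
    """
    # For every SNP, collect the genes, peaks, and CpGs it is a QTL for.
    by_snp = defaultdict(lambda: ([], [], []))
    for slot, table in enumerate((eqtl, aceqtl, meqtl)):
        for feature, snp in table.items():
            by_snp[snp][slot].append(feature)
    triples = []
    for snp, (genes, peaks, cpgs) in sorted(by_snp.items()):
        # A SNP lacking any one of the three feature types yields nothing.
        for gene, peak, cpg in product(genes, peaks, cpgs):
            triples.append((snp, gene, peak, cpg))
    return triples
```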
  • 24. Bayesian networks: Bayesian networks are directed graphical models in which the directed edges represent causal relationships. We use conditional Gaussian networks. Score = likelihood of the data given the network. Example: for the network temperature → precipitation with temp ∼ N(0, 1) and precip | temp ∼ N(temp, 1), the observations temp = 0.7 and precip = 0.5 give score = Pr(N(0, 1) = 0.7) × Pr(N(0.7, 1) = 0.5).
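The temperature and precipitation example can be worked through with Python's standard library. This sketch assumes, as the score on the slide implies, that precipitation has mean equal to temperature and unit variance; the function names are hypothetical, and an independence network is included only for comparison.

```python
from statistics import NormalDist


def chain_score(temp, precip):
    """Likelihood under temperature -> precipitation.

    Assumes temp ~ N(0, 1) and precip | temp ~ N(temp, 1).
    """
    return NormalDist(0.0, 1.0).pdf(temp) * NormalDist(temp, 1.0).pdf(precip)


def independence_score(temp, precip):
    """Likelihood when both variables are independent N(0, 1)."""
    standard = NormalDist(0.0, 1.0)
    return standard.pdf(temp) * standard.pdf(precip)
```

For temp = 0.7 and precip = 0.5, `chain_score` evaluates to roughly 0.122, which is higher than the independence score for the same observation. With real data the score is a product over all samples, usually computed on the log scale.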
  • 28. Networks for QTLs: We used the deal and CGBayesNets packages to construct one Bayesian network for each multi-QTL by exhaustive search. With deal, edges into genotype were blacklisted. The most common network structure was independence, which accounted for 42% of deal networks and 29% of CGBayesNets networks. [Figure: network over the nodes genotype, expression, acetylation, and methylation.]
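Exhaustive search is feasible here because four nodes admit only a few hundred directed acyclic graphs. The sketch below enumerates the candidate structures, with an optional blacklist on edges into one node. It does not reproduce the API of deal or CGBayesNets, the function names are hypothetical, and the scoring of each candidate is left out.

```python
from itertools import combinations

NODES = ("genotype", "expression", "acetylation", "methylation")


def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {node: 0 for node in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [node for node in nodes if indegree[node] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return visited == len(nodes)


def candidate_dags(nodes, blacklist_into=None):
    """Yield every DAG over nodes as a tuple of (parent, child) edges.

    Edges pointing into blacklist_into are never considered.
    """
    possible = [
        (a, b) for a in nodes for b in nodes
        if a != b and b != blacklist_into
    ]
    for size in range(len(possible) + 1):
        for edges in combinations(possible, size):
            if is_acyclic(nodes, edges):
                yield edges
```

With no blacklist this yields 543 structures for four nodes; forbidding edges into genotype reduces that to 200. A full search would score each structure on the data for one multi-QTL and keep the best.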
  • 29. Future Work: Expand the number of multi-QTLs; use more than just the best SNP per feature; identify overlapping QTLs intelligently. Find a more rigorous criterion for the number of PCs to remove. Try other packages for network learning (HyPhy). Are QTLs enriched in SNPs identified in GWAS studies? Examine correlations with phenotype (cognitive decline etc.).
  • 30. Thank you! Harvard / Broad: Philip L. D. Jager, Lori Chibnik, Jishu Xu, Charles White, Cristin McCabe, Towfique Raj. Rush: David A Bennett, Chris Gaiteri, Lei Yu. Bioinformatics Training Program: all the students, Sharon Ruschkowski.
  • 31. Software: QTL analysis: Matrix eQTL, qvalue. Bayesian networks: deal, CGBayesNets. Slides: beamer, TikZ, tikzDevice. Plots: pheatmap, ggplot2, VennDiagram. Colour scheme: solarized.