(Deep) learning from multi-view
data: beyond ImageNet and CelebA
Bianca Dumitrascu
SAMSI/Duke
Deep Learning Workshop
08/13/2019
Talk Goals
introduce a clinically relevant question
provide a temporary solution
(joint work with Greg Gundersen, Jordan Ash, Barbara E. Engelhardt)
suggest paths to (statistical) improvement
Modern genetics allows the collection of diverse types of data.
Each data type and data type combination comes with its own
set of computational challenges and opportunities (both
shallow and deep).
Simple additive models can relate SNPs and gene expression
with the goal of explaining variation in phenotypic differences.
$$Y = X\beta + \varepsilon,$$
$$Y \in \mathbb{R}^{n \times G}, \quad X \in \{0, 1, 2\}^{n \times p}, \quad \varepsilon \sim \mathcal{N}(0, \sigma f(X))$$
Statistical concepts: linear regression, false discovery rate,
heteroskedasticity, Bayes factors
Statistical tests for detecting variance effects in quantitative trait studies. BD, G. Darnell, J. Ayroles, and B. E Engelhardt.
In Bioinformatics, 2018
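A minimal simulation sketch of the additive model above, assuming toy sizes, made-up effect sizes, and an illustrative variance function f(X) = 1 + (genotype of the first SNP). None of these choices come from the cited paper; the block only shows how a variance effect appears in the residuals of an ordinary least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, G = 500, 3, 2                                  # samples, SNPs, genes (toy sizes)
X = rng.integers(0, 3, size=(n, p)).astype(float)    # genotypes in {0, 1, 2}
beta = np.array([[0.8, 0.0],
                 [0.0, -0.5],
                 [0.0, 0.0]])                        # p x G effect sizes (made up)

# Heteroskedastic noise: the noise scale grows with the first SNP.
# f(X) = 1 + X[:, 0] is an illustrative assumption, not the paper's choice.
sigma = 0.3
noise_sd = sigma * (1.0 + X[:, 0])
eps = rng.normal(size=(n, G)) * noise_sd[:, None]

Y = X @ beta + eps

# Ordinary least squares with an intercept column.
X_design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
beta_hat = coef[1:]                                  # drop the intercept row

# Residual variance of gene 0 within each genotype class of SNP 0.
resid = Y - X_design @ coef
var_by_genotype = [resid[X[:, 0] == g, 0].var() for g in (0, 1, 2)]
```

In this sketch `beta_hat` recovers the mean effects, while `var_by_genotype` increases with the genotype of the first SNP, which is the kind of signal a variance-effect test targets.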
Modelling gene expression can determine differences in cell
types (single cell) or can uncover information about
transcriptional programs and pathway dynamics.
Statistical concepts: embeddings, visualization, spectral
clustering, regularization, convex relaxation
netNMF-sc: Leveraging gene-gene interactions for imputation and dimensionality reduction in single-cell expression
analysis. R. Elyanow, BD, B. E Engelhardt, B. J Raphael. bioRxiv, 2019
Statistical differences at macro levels: in medical health record
data modeling covariate information can formalize
interventional challenges regarding medication and diagnostics.
Statistical concepts: Gaussian processes, hierarchical priors,
latent force models
Sparse Multi-Output Gaussian Processes for Medical Time Series Prediction. Li-Fang Cheng, Gregory Darnell, BD, Corey
Chivers, Michael E Draugelis, Kai Li, and Barbara E Engelhardt. arXiv
Causal Convolutional Gaussian Processes for Modeling Personalized Dynamics of Clinical Treatments. Li-Fang Cheng, BD,
and Barbara E Engelhardt. in preparation.
Focus: genomics and histology
Improvements in technology and data storage, together with
financial incentives, have led to a focus on medical applications
involving imaging tasks.
Example: Cancer prediction from histology slides
Google Brain: An augmented reality microscope with real-time
artificial intelligence integration for cancer diagnosis. Po-Hsuan
Cameron Chen et al., Nature Medicine 2019.
a straightforward, yet challenging, engineering approach
requires a large number of samples manually annotated by
experts (prior human knowledge of features)
Mechanism: The scientist’s dream
A few notes:
performing well on prediction tasks (e.g. cancer
classification & detection) is very important
detecting structures that are associated with cancer (by
experts) is very important
detecting what drives the formation of such structures is
difficult
is there a genetic basis to these structures?
Relating gene expression to morphology
Data: paired gene expression and histology slide images from
the GTEx project.
End-to-end Training of Deep Probabilistic CCA on Paired
Biomedical Observations. (UAI 2019) Gregory Gundersen,
Bianca Dumitrascu, Jordan T. Ash, Barbara E. Engelhardt
GTEx Consortium. Genetic effects on gene expression across
human tissues. Nature, 2017.
Scientific goals
Collecting images is cheaper than collecting gene
expression: can we predict images from gene expression?
(reconstruction)
Understand the effect of genetic variation at both molecular
and morphological levels! (good performance on
downstream tasks)
Understand which groups of genes might give rise to
observable morphology! (interpretability)
Multi-view learning
Goals of Multi-modal learning
Canonical correlation analysis
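Given $X_1 \in \mathbb{R}^{n \times d_1}$ and $X_2 \in \mathbb{R}^{n \times d_2}$, find $\Lambda_1^* \in \mathbb{R}^{d_1}$ and $\Lambda_2^* \in \mathbb{R}^{d_2}$ such that:
$$\Lambda_1^*, \Lambda_2^* = \arg\max_{\Lambda_1, \Lambda_2} \operatorname{corr}(X_1 \Lambda_1, X_2 \Lambda_2)$$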
H. Hotelling. Relations between two sets of variates.
Biometrika, 1936.
F.R. Bach and M.I. Jordan. A probabilistic interpretation of
canonical correlation analysis. 2005. (Bayesian, solved
through EM)
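A minimal NumPy sketch of classical (non-probabilistic) CCA for the first canonical pair, assuming the standard whitening-plus-SVD solution. The function name `first_canonical_pair`, the small `ridge` term, and the synthetic two-view data are illustrative assumptions, not part of the talk.

```python
import numpy as np

def first_canonical_pair(X1, X2, ridge=1e-8):
    """Return (lam1, lam2, rho) maximising corr(X1 @ lam1, X2 @ lam2)."""
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    n = X1.shape[0]
    # Sample covariance blocks; the ridge keeps the Cholesky factors stable.
    S11 = X1.T @ X1 / (n - 1) + ridge * np.eye(X1.shape[1])
    S22 = X2.T @ X2 / (n - 1) + ridge * np.eye(X2.shape[1])
    S12 = X1.T @ X2 / (n - 1)

    # Whiten each view with its Cholesky factor: M = L1^{-1} S12 L2^{-T}.
    L1 = np.linalg.cholesky(S11)
    L2 = np.linalg.cholesky(S22)
    M = np.linalg.solve(L1, S12)
    M = np.linalg.solve(L2, M.T).T

    # The top singular value of M is the first canonical correlation.
    U, s, Vt = np.linalg.svd(M)
    lam1 = np.linalg.solve(L1.T, U[:, 0])   # map back from whitened space
    lam2 = np.linalg.solve(L2.T, Vt[0])
    return lam1, lam2, s[0]

# Two synthetic views that share one latent signal z.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
X1 = np.column_stack([z + 0.5 * rng.normal(size=n), rng.normal(size=n)])
X2 = np.column_stack([rng.normal(size=n),
                      -z + 0.5 * rng.normal(size=n),
                      rng.normal(size=n)])

lam1, lam2, rho = first_canonical_pair(X1, X2)
empirical = np.corrcoef(X1 @ lam1, X2 @ lam2)[0, 1]
```

Here `rho` and the empirical correlation of the two projections agree up to the ridge term. The probabilistic formulation of Bach and Jordan recovers the same subspace as a maximum-likelihood solution of a latent variable model.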
Two related solutions: ImageCCA & DeepPCCA
ImageCCA: Ash, Darnell, Munro, Engelhardt, Joint analysis of gene expression levels and histological images
identifies genes associated with tissue morphology, bioRxiv 2018
(deep)PCCA
Input: Paired data $(x_1^i, x_2^i)$, with $i = 1, \dots, n$
Ingredients: Variational loss + Factor Analysis (Probabilistic CCA)+
Domain Knowledge (linear map for gene expression,
convolutional neural networks for image features);
z ∼ N(0; I)
z1, z2 ∼ N(0; I)
x1 ∼ N(zΛ1 + z1B1; Ψ1)
x2 ∼ N(fθ(zΛ2 + z2B2); Ψ2),
where fθ is a convolutional neural network (same architecture as
ImageNet).
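A sketch of sampling from the generative model above, assuming row-vector latents, diagonal noise covariances, and `np.tanh` as a stand-in for the convolutional network fθ. All sizes and parameter values are hypothetical and chosen only to make the structure visible.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, k1, k2 = 20000, 2, 1, 1      # samples; shared and view-specific latent sizes
d1, d2 = 5, 4                      # observed dimensions of the two views

Lam1 = rng.normal(size=(k, d1))    # shared loadings, view 1
Lam2 = rng.normal(size=(k, d2))    # shared loadings, view 2
B1 = rng.normal(size=(k1, d1))     # view-specific loadings, view 1
B2 = rng.normal(size=(k2, d2))     # view-specific loadings, view 2
psi1 = np.full(d1, 0.1)            # diagonal of Psi1 (noise variances)
psi2 = np.full(d2, 0.1)            # diagonal of Psi2

def f_theta(h):
    # Placeholder nonlinearity; the talk uses a convolutional network here.
    return np.tanh(h)

z = rng.normal(size=(n, k))        # shared latent variable
z1 = rng.normal(size=(n, k1))      # latent variable specific to view 1
z2 = rng.normal(size=(n, k2))      # latent variable specific to view 2

x1 = z @ Lam1 + z1 @ B1 + rng.normal(size=(n, d1)) * np.sqrt(psi1)
x2 = f_theta(z @ Lam2 + z2 @ B2) + rng.normal(size=(n, d2)) * np.sqrt(psi2)

# For the linear view the marginal covariance is available in closed form.
cov_model = Lam1.T @ Lam1 + B1.T @ B1 + np.diag(psi1)
cov_sample = np.cov(x1, rowvar=False)
```

The shared variable `z` is the only path through which the two views covary, which is what makes the joint embedding interpretable as common structure.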
GTEx training
Deep probabilistic CCA fits PCCA to the embeddings of two
variational autoencoders, training the model end-to-end
with backpropagation
The embeddings are inputs to a PCCA module whose outputs
are latent embeddings $z_1, z_2, z_c \sim \mathcal{N}(0_k, I_k)$, with
$y_j \sim \mathcal{N}(z_j \Lambda_j + z_c \Lambda_{jc}, \Psi_j)$
Obtain $\Lambda^*, \Psi^*$ via EM parameter updates
Backprop through the L1-penalized reconstruction loss
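$$
L = \frac{1}{n} \sum_{i=1}^{n} \lVert \mathrm{Dec}(x_1)^i - x_1^i \rVert_2^2
  + \frac{1}{n} \sum_{i=1}^{n} \lVert \mathrm{Dec}(x_2)^i - x_2^i \rVert_2^2
  + \gamma_1 \lVert \Lambda_1 \rVert_1 + \gamma_2 \lVert \Lambda_2 \rVert_1 + \gamma_3 \lVert \theta_{\mathrm{dec}} \rVert_1
$$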
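The loss above can be evaluated on a tiny hand-checkable example. This is a sketch, assuming the reconstructions are already computed and that the decoder parameters are given as a list of arrays; the function name `penalized_reconstruction_loss` and all numbers are hypothetical.

```python
import numpy as np

def penalized_reconstruction_loss(x1, x1_hat, x2, x2_hat,
                                  Lam1, Lam2, theta_dec, gammas):
    """Mean squared reconstruction error per view plus L1 penalties."""
    g1, g2, g3 = gammas
    # (1/n) * sum_i ||x_hat_i - x_i||_2^2 for each view
    recon1 = np.mean(np.sum((x1_hat - x1) ** 2, axis=1))
    recon2 = np.mean(np.sum((x2_hat - x2) ** 2, axis=1))
    # L1 norms of the PCCA loadings and of the decoder parameters
    penalty = (g1 * np.abs(Lam1).sum()
               + g2 * np.abs(Lam2).sum()
               + g3 * sum(np.abs(w).sum() for w in theta_dec))
    return recon1 + recon2 + penalty

x1 = np.array([[1.0, 2.0], [3.0, 4.0]])
x1_hat = np.array([[1.0, 1.0], [3.0, 6.0]])      # squared errors 1 and 4, mean 2.5
x2 = np.array([[0.0], [2.0]])
x2_hat = np.array([[1.0], [2.0]])                # squared errors 1 and 0, mean 0.5
Lam1 = np.array([[1.0, -2.0]])                   # L1 norm 3
Lam2 = np.array([[0.5]])                         # L1 norm 0.5
theta_dec = [np.array([[1.0, -1.0]]), np.array([2.0])]   # L1 norm 4 in total

loss = penalized_reconstruction_loss(
    x1, x1_hat, x2, x2_hat, Lam1, Lam2, theta_dec, gammas=(0.1, 0.2, 0.01))
# 2.5 + 0.5 + (0.1 * 3 + 0.2 * 0.5 + 0.01 * 4) = 3.44
```

The L1 terms push loadings and decoder weights toward zero, which is what makes the learned gene loadings sparse enough to inspect.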
A. Klami, S. Virtanen, and S. Kaski. Bayesian canonical
correlation analysis. JMLR, 2013.
D.P. Kingma and M. Welling. Auto-encoding variational Bayes.
arXiv preprint, 2013.
Toy Data
Input: paired vectors (x1, x2), where x1 represents image features
from images of 0, 1, 2, and x2 are sampled from multivariate
normal distributions with different means (not pictured below).
Latent embeddings of the Toy Data views
GTEx: Latent Space Organization
Latent joint image embeddings using DPCCA (left) show structural
organization, whereas clustering of images based on image-specific
embeddings (right) does not.
GTEx: Modality recovery
Subset gene covariance matrix (left) and recovered histology slides
(right)
GTEx: Downstream analysis
Downstream eQTL discovery:
Meaningful latent space embeddings:
Desiderata: cell painting - are we there yet?
No!
Experimental challenge (only temporary, data TBD): bulk vs
single cell
Necessary Modelling Extensions
Necessary Modelling Extensions: Interpretability!
introduce adversarial training (avoid information bottleneck)
introduce sparsity on gene expression (horseshoe, spike and
slab)
extract features that are most responsible for embedding
[e.g. adapt Deep Learning for Case-Based Reasoning through
Prototypes: A Neural Network that Explains Its Predictions
Oscar Li, Hao Liu, Chaofan Chen, Cynthia Rudin]
A parting problem
Scenario where samples are not aligned (matched):
from multi-view to domain adaptation.
Requires optimizing over a large discrete structure Π
(permutation matrix) or a different modelling formulation.
z ∼ N(0; I)
z1, z2 ∼ N(0; I)
x1 ∼ N(ΠzΛ1 + Πz1B1; Ψ1)
x2 ∼ N(fθ(zΛ2 + z2B2); Ψ2),
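A small illustration of why the unmatched setting is hard, assuming one feature per view and a uniformly random permutation Π applied to the rows of the first view. The numbers are synthetic; the point is only that the cross-view signal vanishes once the pairing is lost, while the number of candidate permutations grows as n!.

```python
import math
import numpy as np

rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)                    # shared latent variable
x1 = z + 0.3 * rng.normal(size=n)         # view 1 (one feature for simplicity)
x2 = z + 0.3 * rng.normal(size=n)         # view 2

perm = rng.permutation(n)                 # unknown row permutation Pi
x1_shuffled = x1[perm]                    # view 1 with the pairing destroyed

corr_matched = np.corrcoef(x1, x2)[0, 1]
corr_unmatched = np.corrcoef(x1_shuffled, x2)[0, 1]

# Size of the discrete search space over permutations: log10(n!).
log10_search_space = math.lgamma(n + 1) / math.log(10)
```

With matched samples the two views are strongly correlated; after shuffling, the correlation is close to zero, so a method that relies on paired rows has nothing to fit. Exhaustive search over Π is out of reach even for a few hundred samples.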
Thank You!

 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 

Recently uploaded (20)

Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 

Deep Learning Opening Workshop - Domain Adaptation Challenges in Genomics: a deep learning take on medical pathology - Bianca Dumitrascu, August 13, 2019

  • 1. (Deep) learning from multi-view data: beyond ImageNet and CelebA Bianca Dumitrascu SAMSI/Duke Deep Learning Workshop 08/13/2019
  • 2. Talk Goals: introduce a clinically relevant question; provide a temporary solution (joint work with Greg Gundersen, Jordan Ash, Barbara E. Engelhardt); suggest paths to (statistical) improvement.
  • 3. Modern genetics allows the collection of diverse types of data. personal fan art drawing for the video game Hollow Knight
  • 4. Each data type and data type combination comes with its own set of computational challenges and opportunities (both shallow and deep).
  • 5. Simple additive models can relate SNPs and gene expression with the goal of explaining variation in phenotypic differences. $Y = X\beta + \epsilon$, with $Y \in \mathbb{R}^{n \times G}$, $X \in \{0, 1, 2\}^{n \times p}$, $\epsilon \sim \mathcal{N}(0, \sigma f(X))$. Statistical concepts: linear regression, false discovery rate, heteroskedasticity, Bayes factors. Statistical tests for detecting variance effects in quantitative trait studies. BD, G. Darnell, J. Ayroles, and B. E. Engelhardt. In Bioinformatics, 2018.
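A minimal simulation sketch of the additive model on slide 5, for a single SNP and a single gene. The variance function f(x) = 1 + x, the parameter values, and both function names are assumptions made only for illustration; the slide does not specify f, and this is not the test from the cited paper.

```python
import numpy as np

def simulate_variance_effect(n=2000, beta=0.5, sigma2=1.0, seed=0):
    """Draw genotypes in {0, 1, 2} and expression y = beta * x + eps,
    where Var(eps | x) = sigma2 * f(x) with the assumed f(x) = 1 + x."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 3, size=n)            # minor-allele counts 0, 1, 2
    var = sigma2 * (1.0 + x)                  # hypothetical f(X)
    eps = rng.normal(0.0, np.sqrt(var))       # heteroskedastic noise
    y = beta * x + eps
    return x, y

def fit_and_group_residual_sd(x, y):
    """Fit ordinary least squares, then report the residual standard
    deviation within each genotype group."""
    design = np.column_stack([np.ones_like(x, dtype=float), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sds = np.array([resid[x == g].std(ddof=1) for g in (0, 1, 2)])
    return coef, sds

x, y = simulate_variance_effect()
coef, sds = fit_and_group_residual_sd(x, y)
print("intercept, slope:", coef)
print("residual sd by genotype:", sds)
```

Under this assumed f, the slope estimate tracks the mean effect while the per-genotype residual spread grows with allele count, which is the kind of variance effect that heteroskedasticity tests target.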
  • 6. Modelling gene expression can determine differences in cell types (single cell) or can uncover information about transcriptional programs and pathway dynamics. Statistical concepts: embeddings, visualization, spectral clustering, regularization, convex relaxation. netNMF-sc: Leveraging gene-gene interactions for imputation and dimensionality reduction in single-cell expression analysis. R. Elyanow, BD, B. E. Engelhardt, B. J. Raphael. bioRxiv, 2019.
  • 7. Statistical differences at macro levels: in medical health record data, modeling covariate information can formalize interventional challenges regarding medication and diagnostics. Statistical concepts: Gaussian processes, hierarchical priors, latent force models. Sparse Multi-Output Gaussian Processes for Medical Time Series Prediction. Li-Fang Cheng, Gregory Darnell, BD, Corey Chivers, Michael E. Draugelis, Kai Li, and Barbara E. Engelhardt. arXiv. Causal Convolutional Gaussian Processes for Modeling Personalized Dynamics of Clinical Treatments. Li-Fang Cheng, BD, and Barbara E. Engelhardt. In preparation.
  • 8. Focus: genomics and histology. Improvements in technology, data storage, and financial incentives have led to a focus on medical applications involving imaging tasks.
  • 9. Example: Cancer prediction from histology slides Google Brain: An augmented reality microscope with real-time artificial intelligence integration for cancer diagnosis. Po-Hsuan Cameron Chen et al., Nature Medicine 2019.
  • 10. Example: Cancer prediction from histology slides. A straightforward, yet challenging engineering approach; required a large number of samples manually annotated by experts (prior human knowledge of features).
  • 11. Mechanism: The scientist’s dream. A few notes: performing well on prediction tasks (i.e. for cancer classification & detection) is very important; detecting structures that are associated with cancer (by experts) is very important; detecting what drives the formation of such structures is difficult; is there a genetic basis to structures?
  • 12. Relating gene expression to morphology. Data: paired gene expression and histology slide images from the GTEx project. End-to-end Training of Deep Probabilistic CCA on Paired Biomedical Observations. (UAI 2019) Gregory Gundersen, Bianca Dumitrascu, Jordan T. Ash, Barbara E. Engelhardt
  • 13. Relating gene expression to morphology. Data: paired gene expression and histology slide images from the GTEx project. GTEx Consortium. Genetic effects on gene expression across human tissues. Nature, 2017.
  • 14. Scientific goals. Collecting images is cheaper than collecting gene expression: can we predict images from gene expression? (reconstruction) Understand the effect of genetic variation at both molecular and morphological levels! (good performance on downstream tasks) Understand what group of genes might give rise to observable morphology! (interpretability)
  • 17. Canonical correlation analysis. Given $X_1 \in \mathbb{R}^{n \times d_1}$ and $X_2 \in \mathbb{R}^{n \times d_2}$, find $\Lambda_1^* \in \mathbb{R}^{d_1}$ and $\Lambda_2^* \in \mathbb{R}^{d_2}$ such that $(\Lambda_1^*, \Lambda_2^*) = \arg\max_{\Lambda_1, \Lambda_2} \mathrm{corr}(X_1 \Lambda_1, X_2 \Lambda_2)$. H. Hotelling. Relations between two sets of variates. Biometrika, 1936. F.R. Bach and M.I. Jordan. A probabilistic interpretation of canonical correlation analysis. 2005. (Bayesian, solved through EM)
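The CCA objective on slide 17 can be sketched with plain NumPy: whiten each view, then take the top singular vectors of the whitened cross-covariance. This is the standard classical solution, not the probabilistic or deep variant discussed later in the talk; the function name and the small `ridge` term (added for numerical stability) are assumptions of this sketch.

```python
import numpy as np

def first_canonical_pair(x1, x2, ridge=1e-8):
    """Return (lam1, lam2, rho): the first pair of canonical directions
    and the correlation they achieve between x1 @ lam1 and x2 @ lam2."""
    x1 = x1 - x1.mean(axis=0)
    x2 = x2 - x2.mean(axis=0)
    n = x1.shape[0]
    s11 = x1.T @ x1 / (n - 1) + ridge * np.eye(x1.shape[1])
    s22 = x2.T @ x2 / (n - 1) + ridge * np.eye(x2.shape[1])
    s12 = x1.T @ x2 / (n - 1)
    # Whitening maps: with S = L L^T, W = L^{-T} satisfies W^T S W = I.
    w1 = np.linalg.inv(np.linalg.cholesky(s11)).T
    w2 = np.linalg.inv(np.linalg.cholesky(s22)).T
    u, s, vt = np.linalg.svd(w1.T @ s12 @ w2)
    lam1 = w1 @ u[:, 0]      # direction for view 1
    lam2 = w2 @ vt[0]        # direction for view 2
    return lam1, lam2, s[0]  # s[0] is the first canonical correlation

# Toy usage: two views driven by one shared latent variable.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
view1 = z @ np.ones((1, 4)) + 0.5 * rng.normal(size=(500, 4))
view2 = z @ np.ones((1, 3)) + 0.5 * rng.normal(size=(500, 3))
lam1, lam2, rho = first_canonical_pair(view1, view2)
print("first canonical correlation:", rho)
```

The largest singular value equals the maximal correlation because, after whitening, both projections have unit variance, so the correlation reduces to a bilinear form maximized by the top singular vector pair.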
  • 18. Two related solutions: ImageCCA & DeepPCCA. ImageCCA: Ash, Darnell, Munro, Engelhardt. Joint analysis of gene expression levels and histological images identifies genes associated with tissue morphology. bioRxiv, 2018.
  • 19. (deep)PCCA. Input: paired data $(x_1^i, x_2^i)$, with $i = 1, \dots, n$. Ingredients: variational loss + factor analysis (probabilistic CCA) + domain knowledge (linear map for gene expression, convolutional neural networks for image features). $z \sim \mathcal{N}(0, I)$; $z_1, z_2 \sim \mathcal{N}(0, I)$; $x_1 \sim \mathcal{N}(z\Lambda_1 + z_1 B_1, \Psi_1)$; $x_2 \sim \mathcal{N}(f_\theta(z\Lambda_2 + z_2 B_2), \Psi_2)$, where $f_\theta$ is a convolutional neural network (same architecture as ImageNet).
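To make the generative model on slide 19 concrete, here is a sampling sketch in which the network f_θ is replaced by the identity map. That substitution is an assumption made purely to keep the example self-contained (it reduces the model to linear probabilistic CCA with view-specific factors); the function name and the choice of diagonal noise variances `psi1`, `psi2` are likewise illustrative.

```python
import numpy as np

def sample_linear_pcca(n, lam1, lam2, b1, b2, psi1, psi2, seed=0):
    """Sample n paired observations from
        z, z1, z2 ~ N(0, I)
        x1 ~ N(z lam1 + z1 b1, diag(psi1))
        x2 ~ N(z lam2 + z2 b2, diag(psi2))
    using row-vector latents; f_theta is taken to be the identity."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, lam1.shape[0]))    # shared latent
    z1 = rng.normal(size=(n, b1.shape[0]))     # view-1 specific latent
    z2 = rng.normal(size=(n, b2.shape[0]))     # view-2 specific latent
    noise1 = rng.normal(size=(n, lam1.shape[1])) * np.sqrt(psi1)
    noise2 = rng.normal(size=(n, lam2.shape[1])) * np.sqrt(psi2)
    x1 = z @ lam1 + z1 @ b1 + noise1
    x2 = z @ lam2 + z2 @ b2 + noise2
    return x1, x2

# Toy usage with hypothetical loadings: 2 shared factors, 1 private factor per view.
rng = np.random.default_rng(1)
lam1, lam2 = 0.5 * rng.normal(size=(2, 4)), 0.5 * rng.normal(size=(2, 3))
b1, b2 = 0.5 * rng.normal(size=(1, 4)), 0.5 * rng.normal(size=(1, 3))
x1, x2 = sample_linear_pcca(100_000, lam1, lam2, b1, b2,
                            psi1=np.full(4, 0.1), psi2=np.full(3, 0.1))
cross_cov = x1.T @ x2 / x1.shape[0]
print(np.round(cross_cov - lam1.T @ lam2, 2))  # entries should be near zero
```

In this linear case only the shared latent z couples the two views, so the cross-covariance between x1 and x2 is lam1ᵀ lam2, while b1 and b2 contribute only to within-view covariance.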
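  • 20. GTEx training. Deep probabilistic CCA fits PCCA to the embeddings of two variational autoencoders, training the model end-to-end with backpropagation. The embeddings are inputs to the PCCA module, whose outputs are latent embeddings $z_1, z_2, z_c \sim \mathcal{N}(0_k, I_k)$, with $y_j \sim \mathcal{N}(z_j \Lambda_j + z_c \Lambda_{jc}, \Psi_j)$. Obtain $\Lambda^*, \Psi^*$ via EM parameter updates. Backprop through the $L_1$-penalized reconstruction loss $\mathcal{L} = \frac{1}{n} \sum_{i=1}^{n} \| \mathrm{Dec}(x^1)_i - x^1_i \|_2^2 + \frac{1}{n} \sum_{i=1}^{n} \| \mathrm{Dec}(x^2)_i - x^2_i \|_2^2 + \gamma_1 \| \Lambda_1 \|_1 + \gamma_2 \| \Lambda_2 \|_1 + \gamma_3 \| \theta_{\mathrm{dec}} \|_1$.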
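The penalized reconstruction loss on slide 20 can be written out term by term. The sketch below only evaluates the loss for given reconstructions and parameters in NumPy; it does not implement the encoders, decoders, EM updates, or backpropagation. The function name, argument names, and default γ values are assumptions for illustration.

```python
import numpy as np

def penalized_reconstruction_loss(x1, x1_hat, x2, x2_hat,
                                  lam1, lam2, theta_dec,
                                  gammas=(1e-3, 1e-3, 1e-3)):
    """Mean squared reconstruction error for both views plus L1 penalties
    on the loadings and on the decoder parameters.

    x1_hat and x2_hat play the role of Dec(x^1) and Dec(x^2);
    theta_dec is a list of decoder parameter arrays."""
    n = x1.shape[0]
    # Flatten each sample so images and expression vectors are handled alike.
    rec1 = np.sum((x1_hat.reshape(n, -1) - x1.reshape(n, -1)) ** 2, axis=1).mean()
    rec2 = np.sum((x2_hat.reshape(n, -1) - x2.reshape(n, -1)) ** 2, axis=1).mean()
    g1, g2, g3 = gammas
    penalty = (g1 * np.abs(lam1).sum()
               + g2 * np.abs(lam2).sum()
               + g3 * sum(np.abs(p).sum() for p in theta_dec))
    return rec1 + rec2 + penalty

# Tiny worked example with hand-checkable numbers.
loss = penalized_reconstruction_loss(
    x1=np.zeros((2, 2)), x1_hat=np.ones((2, 2)),   # squared error 2 per sample
    x2=np.zeros((2, 3)), x2_hat=np.zeros((2, 3)),  # perfect reconstruction
    lam1=np.array([[1.0, -1.0]]), lam2=np.array([[2.0]]),
    theta_dec=[np.array([1.0, -3.0])],
    gammas=(0.5, 0.25, 0.1),
)
print(loss)  # 2 + 0 + 0.5*2 + 0.25*2 + 0.1*4 = 3.9
```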
  • 21. GTEx training. $z \sim \mathcal{N}(0, I)$; $z_1, z_2 \sim \mathcal{N}(0, I)$; $x_1 \sim \mathcal{N}(z\Lambda_1 + z_1 B_1, \Psi_1)$; $x_2 \sim \mathcal{N}(f_\theta(z\Lambda_2 + z_2 B_2), \Psi_2)$. A. Klami, S. Virtanen, and S. Kaski. Bayesian canonical correlation analysis. JMLR, 2013. D.P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv preprint, 2013.
  • 22. Toy Data Input: paired vectors (x1, x2), where x1 represents image features from images of 0, 1, 2, and x2 are sampled from multivariate normal distributions with different means (not pictured below).
  • 23. Toy Data Latent embeddings of the Toy Data views
  • 24. GTEx: Latent Space Organization. Latent joint image embeddings using DPCCA (left) show structural organization, whereas clustering of images based on image-specific embeddings (right) does not.
  • 25. GTEx: Modality recovery Subset gene covariance matrix (left) and recovered histology slides (right)
  • 26. GTEx: Downstream analysis. Figure panels: downstream eQTL discovery; meaningful latent space embeddings.
  • 28. Desiderata: cell painting - are we there yet?
  • 29. No! Experimental challenge (only temporary, data TBD): bulk vs single cell
  • 31. Necessary Modelling Extensions: Interpretability! Introduce adversarial training (avoid information bottleneck); introduce sparsity on gene expression (horseshoe, spike and slab); extract features that are most responsible for embedding [e.g. adapt Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions. Oscar Li, Hao Liu, Chaofan Chen, Cynthia Rudin].
  • 32. A parting problem. Scenario where samples are not aligned (matched): from multi-view to domain adaptation. Requires optimizing over a large discrete structure $\Pi$ (permutation matrix) or a different modelling formulation. $z \sim \mathcal{N}(0, I)$; $z_1, z_2 \sim \mathcal{N}(0, I)$; $x_1 \sim \mathcal{N}(\Pi z\Lambda_1 + \Pi z_1 B_1, \Psi_1)$; $x_2 \sim \mathcal{N}(f_\theta(z\Lambda_2 + z_2 B_2), \Psi_2)$.
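A small sketch of why the unaligned setting on slide 32 is hard: even with all other parameters held fixed, recovering Π by exhaustive search means scoring n! candidate row orderings. The brute-force search below is an illustration of that discrete structure only, not a method proposed in the talk; the function name and the toy data are assumptions.

```python
import itertools
import math
import numpy as np

def best_permutation(observed, predicted):
    """Exhaustively search for the row ordering `perm` that minimizes
    sum_i ||observed[i] - predicted[perm[i]]||^2.
    Cost grows as n!, so this is only feasible for very small n."""
    n = observed.shape[0]
    best_perm, best_cost = None, np.inf
    for perm in itertools.permutations(range(n)):
        cost = np.sum((observed - predicted[list(perm)]) ** 2)
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return np.array(best_perm), best_cost

# Toy usage: shuffle the rows of a "predicted" view and try to undo it.
rng = np.random.default_rng(0)
predicted = rng.normal(size=(6, 3))          # stands in for z @ lam1 + z1 @ b1
true_perm = rng.permutation(6)
observed = predicted[true_perm] + 1e-3 * rng.normal(size=(6, 3))
perm, cost = best_permutation(observed, predicted)
print("candidates scored:", math.factorial(6))
print("recovered ordering matches:", np.array_equal(perm, true_perm))
```

Six samples already require 720 evaluations; at realistic sample sizes the search space is astronomically large, which is why the slide points to either a relaxation over Π or a different modelling formulation.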