Deep Learning based
Multi-omics integration
A survey
Deep Learning in Bioinformatics
Min, Seonwoo, Byunghan Lee, and Sungroh Yoon. "Deep learning in bioinformatics." Briefings in Bioinformatics (2016)
Outline
• Summarize three related works on deep learning based
feature extraction / survival prediction on omics data
• Unsupervised feature construction and knowledge extraction from
genome-wide assays of breast cancer with denoising autoencoders
• A deep learning approach for cancer detection and relevant gene
identification
• Deep Learning based multi-omics integration robustly predicts
survival in liver cancer
Unsupervised feature construction and knowledge
extraction from genome-wide assays of breast cancer
with denoising autoencoders
Pacific Symposium on Biocomputing, 2015
Denoising Auto-Encoder (DAE)
• Build features that
reconstruct initial input data
from corrupted data
• Generate robust features
• Unsupervised learning
• Extract features in the non-
linear space
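A minimal sketch of the idea on this slide, assuming a one-layer DAE with tied weights, masking noise, sigmoid units and a cross-entropy loss. The function names, layer size, corruption rate and learning rate are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dae(X, n_hidden=4, corruption=0.2, lr=0.5, epochs=300, seed=0):
    """Train a one-layer denoising autoencoder with tied weights.

    X: (n_samples, n_genes) array scaled to [0, 1].
    Returns (W, b_hidden, b_visible).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    W = rng.normal(0.0, 0.1, size=(n_genes, n_hidden))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_genes)
    for _ in range(epochs):
        # Corrupt the input: randomly zero a fraction of the entries.
        mask = rng.random(X.shape) >= corruption
        X_noisy = X * mask
        H = sigmoid(X_noisy @ W + b_h)      # encode the corrupted input
        X_hat = sigmoid(H @ W.T + b_v)      # decode with the same weights
        # Cross-entropy gradient, measured against the CLEAN input.
        d_out = (X_hat - X) / n_samples
        d_hid = (d_out @ W) * H * (1.0 - H)
        W -= lr * (X_noisy.T @ d_hid + d_out.T @ H)
        b_h -= lr * d_hid.sum(axis=0)
        b_v -= lr * d_out.sum(axis=0)
    return W, b_h, b_v

def reconstruction_mse(X, W, b_h, b_v):
    """Mean squared error when reconstructing uncorrupted data."""
    H = sigmoid(X @ W + b_h)
    return float(np.mean((sigmoid(H @ W.T + b_v) - X) ** 2))
```

The hidden activations `sigmoid(X @ W + b_h)` play the role of the constructed features; each column of `W` holds the per-gene weights of one hidden node.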
Data
• The two largest breast cancer datasets
• Train DAEs and identify predictive features with the METABRIC
dataset
• 2137 samples, 3000 2520 genes
• Gene expression data from the European Genome-phenome Archive
• Evaluate with TCGA dataset independently
• 547 samples, 2520 genes
Features to clinical characteristics
• Genes are not linked to their neighbors
• Genes are linked by transcription
factors, pathway memberships
• Are constructed features linked to
clinical and molecular features of the
samples?
• Categorize tumor / normal samples
• Categorize ER+/- samples
• Categorize samples into molecular
subtypes (Luminal A/B, Basal-like,
HER2-enriched, Normal-like)
Features to clinical characteristics
• Classifying tumor from
normal samples
• Classifying ER+ from ER-
samples
Robust performance across datasets
Features to transcription factor
• Breast cancer related transcription factors are linked to these high-
weight features (Node58)
• It contained genes that reflect activity of key ER-associated TFs
Most genes gave zero or low weight to a hidden node
[Figure panels: high positive weight, high negative weight]
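Since most genes give zero or low weight to a node, the high-weight genes can be picked out with a simple cutoff. A small sketch, assuming a cutoff of two standard deviations from the mean weight; the cutoff and the function name are assumptions for illustration.

```python
from statistics import mean, pstdev

def high_weight_genes(weights, gene_names, n_std=2.0):
    """Return (high_positive, high_negative) gene lists for one hidden node.

    weights: one weight per gene for a single node.
    n_std: assumed cutoff, in standard deviations from the mean weight.
    """
    mu = mean(weights)
    sd = pstdev(weights)
    positive = [g for g, w in zip(gene_names, weights) if w > mu + n_std * sd]
    negative = [g for g, w in zip(gene_names, weights) if w < mu - n_std * sd]
    return positive, negative
```

The two returned lists correspond to the high positive and high negative weight groups of a node.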
Features to patient survival
• Node whose activity best separated
the high and low survival groups
(Node5)
• Highly predictive of patient survival
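One way to check how well a node separates survival groups is to split samples by node activity and compare the groups with a log-rank test. The sketch below assumes a median split and a standard two-group log-rank statistic; the slides do not state which split or test was used, so both are assumptions.

```python
from statistics import median

def split_by_activity(activity):
    """Indices of samples above the median activity, and of the rest."""
    cut = median(activity)
    high = [i for i, a in enumerate(activity) if a > cut]
    low = [i for i, a in enumerate(activity) if a <= cut]
    return high, low

def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times_*: follow-up times; events_*: 1 if death observed, 0 if censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n_a = sum(1 for tt, _, g in data if tt >= t and g == 0)  # at risk in A
        n_b = sum(1 for tt, _, g in data if tt >= t and g == 1)  # at risk in B
        d_a = sum(1 for tt, e, g in data if tt == t and e and g == 0)
        d_b = sum(1 for tt, e, g in data if tt == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var if var > 0 else 0.0
```

A statistic above about 3.84 corresponds to p < 0.05 for one degree of freedom.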
Features to Biological pathways
• Pathways significantly associated with genes that consistently
gave high weights to a node
PID pathways enriched in Node5 (5th feature)
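Pathway association of a node's high-weight genes is commonly tested with a hypergeometric (one-sided Fisher) test. A minimal sketch under that assumption; the slides do not name the test, and the argument names are illustrative.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without
    replacement from n_universe genes, n_pathway of which are in the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

With many pathways tested per node, the resulting p-values would normally need a multiple-testing correction.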
Summary
• Unsupervised feature construction based on DAEs and
interpretation
• Apply to a breast cancer gene expression data
• Consistent results across different datasets
• In the future...
• Multiple layers of stacked DAEs
• Consistency across datasets will be useful for data integration
• Limitations for large-scale data integration
A deep learning approach for cancer
detection and relevant gene identification
Pacific Symposium on Biocomputing, 2016
Overview
[Figure: workflow diagram. Labels: TCGA RNA-seq samples (healthy, cancer), train / test split, SDAE, features, weights, DCGs, model, validation]
• Supervised classification (cancer detection)
• Highly interactive genes identification
• 1210 breast cancer samples
Stacked Denoising Auto-Encoder
• Extract functional features from high dimensional, noisy gene expression
profiles with reduced loss of information
• Select a layer that has both low dimension and low validation error
Classification result
• Classify cancer samples from
healthy control samples
• Feature extraction
• SDAE
• Differentially expressed genes
(DIFFEXP)
• PCA
• KPCA (RBF kernel)
• Classification model
• SVM
• SVM (RBF kernel)
• single-layer ANN
Deeply connected genes
• Genes with the largest weights in W (the product of the
weight matrices for each layer) are the most strongly
connected to the extracted and highly predictive features
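The matrix W described above can be sketched as a chain of matrix products. Ranking each gene by its largest absolute weight to any final feature is an assumption made here for illustration; the function name and `top_k` parameter are hypothetical.

```python
import numpy as np

def deeply_connected_genes(weight_matrices, gene_names, top_k=3):
    """Multiply the per-layer weight matrices and rank genes by the result.

    weight_matrices: list of arrays; the first has shape (n_genes, h1),
    the next (h1, h2), and so on.
    Returns (top gene names, W) where W has shape (n_genes, last layer size).
    """
    W = np.asarray(weight_matrices[0], dtype=float)
    for W_next in weight_matrices[1:]:
        W = W @ np.asarray(W_next, dtype=float)
    # Assumed score: largest absolute weight from a gene to any final feature.
    scores = np.abs(W).max(axis=1)
    order = np.argsort(scores)[::-1][:top_k]
    return [gene_names[i] for i in order], W
```

Note that a plain product ignores the non-linear activations between layers, so W is only a linear summary of how genes connect to the final features.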
But lower performance than SDAE features
….
Summary
• SDAE to transform high-dimensional, noisy gene expression data
to a lower dimensional, meaningful representation
• Classify breast cancer samples from the healthy control samples
using new compact features
• Identify a set of highly interactive genes critical for the diagnosis
of breast cancer
• In the future...
• Need to improve the extraction of DCGs
• Limitation on the requirement for large data sets
• Identify cross-cancer biomarkers through the analysis of aggregated
heterogeneous cancer data
Deep Learning based multi-omics integration
robustly predicts survival in liver cancer
preprint, 2017
[Figure: workflow. Labels: 360 tumor samples; 15629 genes, 365 miRNAs, 19883 genes; 100 features; 37 features; high/poor survival]
Why Autoencoders?
• Produce features linked to
clinical outcomes
• Analyze high-dimensional gene
expression data
• Integrate heterogeneous data
• Interpret the biological
functions (aggregate genes
sharing similar pathways)
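A simple way to feed heterogeneous omics data into one autoencoder is to scale each data type separately and concatenate the features per sample. The sketch below assumes per-feature z-scoring and plain concatenation; the block names are hypothetical and the preprint's exact preprocessing may differ.

```python
import numpy as np

def stack_omics(blocks):
    """Z-score each omics matrix per feature, then concatenate along features.

    blocks: dict mapping a name to an (n_samples, n_features) array; all
    arrays must list the same samples in the same row order.
    """
    scaled = []
    for name, X in blocks.items():
        X = np.asarray(X, dtype=float)
        sd = X.std(axis=0)
        sd[sd == 0] = 1.0  # constant features become zeros instead of NaN
        scaled.append((X - X.mean(axis=0)) / sd)
    return np.hstack(scaled)
```

The stacked matrix has one row per sample and can be used as the autoencoder input, so that the bottleneck features mix information from all data types.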
Classification result
• PCA
Classification result
• Single-omics based DL models
Validation in five cohorts
• Robustness of the model at predicting survival outcomes
Adding clinical information
• Age, Stage, Grade, Race, Risk factors (HBV, HCV, Alcohol, …)
• DL-based multi-omics model performs sufficiently well even
without clinical features
Functional analysis of the survival-subgroups
• KEGG pathway analysis to
pinpoint the pathways enriched
in two subtypes
• Two subtypes have different and
disjoint active pathways
Enriched pathway-gene analysis for upregulated genes
• S1 aggressive tumor
sub-group
• Enriched with cancer
related pathways
Enriched pathway-gene analysis for upregulated genes
• S2 less aggressive tumor
sub-group
• Activated metabolism
related pathways
Summary
• Contributions
• Identified two subtypes at the molecular level
• Consistent performance implying the reliability and robustness of
the model
• Sufficient performance without adding clinical features
• AE is much more efficient at inferring features linked to survival
• Validated in five additional cohorts
• Challenges
• The absence of cluster label information in original reports
• Lack of survival data in some cases
Conclusion
• Feature extraction with SDAE
• Robust to noisy datasets
• Extract meaningful features and reflect both linear and non-linear
relationships
• Consistent performance, good for multi-omics integration
• Multi-omics integration
• More sophisticated strategy to combine multiple features
• May incorporate pathways, handle overlapping genes
Thank you!
Q & A

More Related Content

What's hot

Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Manikhandan Mudaliar
 
Genomic Data Analysis
Genomic Data AnalysisGenomic Data Analysis
Genomic Data Analysis
Data Driven Innovation
 
Protein-protein interaction networks
Protein-protein interaction networksProtein-protein interaction networks
Protein-protein interaction networks
Bioinformatics and Computational Biosciences Branch
 
Integrative omics approches
Integrative omics approches   Integrative omics approches
Integrative omics approches
Sayali Magar
 
RNA-seq differential expression analysis
RNA-seq differential expression analysisRNA-seq differential expression analysis
RNA-seq differential expression analysis
mikaelhuss
 
Next Generation Sequencing and its Applications in Medical Research - Frances...
Next Generation Sequencing and its Applications in Medical Research - Frances...Next Generation Sequencing and its Applications in Medical Research - Frances...
Next Generation Sequencing and its Applications in Medical Research - Frances...
Sri Ambati
 
Assembly and gene_prediction
Assembly and gene_predictionAssembly and gene_prediction
Assembly and gene_prediction
Bas van Breukelen
 
Machine learning in biology
Machine learning in biologyMachine learning in biology
Machine learning in biology
Pranavathiyani G
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
Pawan Kumar
 
Overview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seqOverview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seq
Alireza Doustmohammadi
 
Comparative genomics 2
Comparative genomics 2Comparative genomics 2
Comparative genomics 2
GCUF
 
Introduction to Data Mining / Bioinformatics
Introduction to Data Mining / BioinformaticsIntroduction to Data Mining / Bioinformatics
Introduction to Data Mining / Bioinformatics
Gerald Lushington
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomicsAmol Kunde
 
Microarray Analysis
Microarray AnalysisMicroarray Analysis
Microarray Analysis
James McInerney
 
Multi Omics Approach in Medicine
Multi Omics Approach in MedicineMulti Omics Approach in Medicine
Multi Omics Approach in Medicine
Shreya Gupta
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
Bioinformatics and Computational Biosciences Branch
 
Role of transcriptomics in gene expression studies and
Role of transcriptomics in gene expression studies andRole of transcriptomics in gene expression studies and
Role of transcriptomics in gene expression studies andSarla Rao
 
2 whole genome sequencing and analysis
2 whole genome sequencing and analysis2 whole genome sequencing and analysis
2 whole genome sequencing and analysis
saberhussain9
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
Pragya Pai
 
The Cancer Genome Atlas Update
The Cancer Genome Atlas UpdateThe Cancer Genome Atlas Update
The Cancer Genome Atlas Update
Melanoma Research Foundation
 

What's hot (20)

Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
 
Genomic Data Analysis
Genomic Data AnalysisGenomic Data Analysis
Genomic Data Analysis
 
Protein-protein interaction networks
Protein-protein interaction networksProtein-protein interaction networks
Protein-protein interaction networks
 
Integrative omics approches
Integrative omics approches   Integrative omics approches
Integrative omics approches
 
RNA-seq differential expression analysis
RNA-seq differential expression analysisRNA-seq differential expression analysis
RNA-seq differential expression analysis
 
Next Generation Sequencing and its Applications in Medical Research - Frances...
Next Generation Sequencing and its Applications in Medical Research - Frances...Next Generation Sequencing and its Applications in Medical Research - Frances...
Next Generation Sequencing and its Applications in Medical Research - Frances...
 
Assembly and gene_prediction
Assembly and gene_predictionAssembly and gene_prediction
Assembly and gene_prediction
 
Machine learning in biology
Machine learning in biologyMachine learning in biology
Machine learning in biology
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
 
Overview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seqOverview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seq
 
Comparative genomics 2
Comparative genomics 2Comparative genomics 2
Comparative genomics 2
 
Introduction to Data Mining / Bioinformatics
Introduction to Data Mining / BioinformaticsIntroduction to Data Mining / Bioinformatics
Introduction to Data Mining / Bioinformatics
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Microarray Analysis
Microarray AnalysisMicroarray Analysis
Microarray Analysis
 
Multi Omics Approach in Medicine
Multi Omics Approach in MedicineMulti Omics Approach in Medicine
Multi Omics Approach in Medicine
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
 
Role of transcriptomics in gene expression studies and
Role of transcriptomics in gene expression studies andRole of transcriptomics in gene expression studies and
Role of transcriptomics in gene expression studies and
 
2 whole genome sequencing and analysis
2 whole genome sequencing and analysis2 whole genome sequencing and analysis
2 whole genome sequencing and analysis
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
 
The Cancer Genome Atlas Update
The Cancer Genome Atlas UpdateThe Cancer Genome Atlas Update
The Cancer Genome Atlas Update
 

Similar to Deep learning based multi-omics integration, a survey

Bioinformatics-R program의 실례
Bioinformatics-R program의 실례Bioinformatics-R program의 실례
Bioinformatics-R program의 실례
mothersafe
 
A Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge SystemA Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge System
Warren Kibbe
 
Artificial Intelligence in pathology
Artificial Intelligence in pathologyArtificial Intelligence in pathology
Artificial Intelligence in pathology
nehaSingh1543
 
Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...
Malachi Griffith
 
Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017
Warren Kibbe
 
Evolution of molecular prognostic testing in ER positive breast cancer
Evolution of molecular prognostic testing in ER positive breast cancerEvolution of molecular prognostic testing in ER positive breast cancer
Evolution of molecular prognostic testing in ER positive breast cancer
Bell Symposium & MSP Seminar
 
A Method to facilitate cancer detection and type classification from gene exp...
A Method to facilitate cancer detection and type classification from gene exp...A Method to facilitate cancer detection and type classification from gene exp...
A Method to facilitate cancer detection and type classification from gene exp...
Xi Chen
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
DataScienceConferenc1
 
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
Interactive Technologies and Games: Education, Health and Disability
 
01-14 Analysis of Liquid Biopsies - Ibrahim.pdf
01-14 Analysis of Liquid Biopsies - Ibrahim.pdf01-14 Analysis of Liquid Biopsies - Ibrahim.pdf
01-14 Analysis of Liquid Biopsies - Ibrahim.pdf
OCRE | Open Clouds for Research Environments
 
Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...
Enrique Moreno Gonzalez
 
NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016
Warren Kibbe
 
Marker Assisted Selection
Marker Assisted SelectionMarker Assisted Selection
Marker Assisted Selection
Khushbu
 
Big data sharing
Big data sharingBig data sharing
Big data sharing
Warren Kibbe
 
EBI Industry programme TCGA Warren KIbbe November 2013
EBI Industry programme TCGA Warren KIbbe November 2013EBI Industry programme TCGA Warren KIbbe November 2013
EBI Industry programme TCGA Warren KIbbe November 2013
Warren Kibbe
 
Cancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsCancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data Commons
Warren Kibbe
 
Digital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineDigital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision Medicine
Joel Saltz
 
Extreme Computing, Clinical Medicine and GPUs or Can GPUs Cure Cancer
Extreme Computing, Clinical Medicine and GPUs or Can GPUs Cure CancerExtreme Computing, Clinical Medicine and GPUs or Can GPUs Cure Cancer
Extreme Computing, Clinical Medicine and GPUs or Can GPUs Cure Cancer
Joel Saltz
 

Similar to Deep learning based multi-omics integration, a survey (20)

DREAM Challenge
DREAM ChallengeDREAM Challenge
DREAM Challenge
 
Bioinformatics-R program의 실례
Bioinformatics-R program의 실례Bioinformatics-R program의 실례
Bioinformatics-R program의 실례
 
A Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge SystemA Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge System
 
Artificial Intelligence in pathology
Artificial Intelligence in pathologyArtificial Intelligence in pathology
Artificial Intelligence in pathology
 
Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...Bioinformatics tools for development, analysis, and preclinical testing of in...
Bioinformatics tools for development, analysis, and preclinical testing of in...
 
Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017
 
Evolution of molecular prognostic testing in ER positive breast cancer
Evolution of molecular prognostic testing in ER positive breast cancerEvolution of molecular prognostic testing in ER positive breast cancer
Evolution of molecular prognostic testing in ER positive breast cancer
 
A Method to facilitate cancer detection and type classification from gene exp...
A Method to facilitate cancer detection and type classification from gene exp...A Method to facilitate cancer detection and type classification from gene exp...
A Method to facilitate cancer detection and type classification from gene exp...
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
 
priya brca1
priya brca1priya brca1
priya brca1
 
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
 
01-14 Analysis of Liquid Biopsies - Ibrahim.pdf
01-14 Analysis of Liquid Biopsies - Ibrahim.pdf01-14 Analysis of Liquid Biopsies - Ibrahim.pdf
01-14 Analysis of Liquid Biopsies - Ibrahim.pdf
 
Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...
 
NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016
 
Marker Assisted Selection
Marker Assisted SelectionMarker Assisted Selection
Marker Assisted Selection
 
Big data sharing
Big data sharingBig data sharing
Big data sharing
 
EBI Industry programme TCGA Warren KIbbe November 2013
EBI Industry programme TCGA Warren KIbbe November 2013EBI Industry programme TCGA Warren KIbbe November 2013
EBI Industry programme TCGA Warren KIbbe November 2013
 
Cancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsCancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data Commons
 
Digital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineDigital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision Medicine
 
Extreme Computing, Clinical Medicine and GPUs or Can GPUs Cure Cancer
Extreme Computing, Clinical Medicine and GPUs or Can GPUs Cure CancerExtreme Computing, Clinical Medicine and GPUs or Can GPUs Cure Cancer
Extreme Computing, Clinical Medicine and GPUs or Can GPUs Cure Cancer
 

More from SOYEON KIM

Network-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal dataNetwork-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal data
SOYEON KIM
 
Revealing disease-associated pathways by network integration of untargeted me...
Revealing disease-associated pathways by network integration of untargeted me...Revealing disease-associated pathways by network integration of untargeted me...
Revealing disease-associated pathways by network integration of untargeted me...
SOYEON KIM
 
Systems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traitsSystems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traits
SOYEON KIM
 
Network embedding
Network embeddingNetwork embedding
Network embedding
SOYEON KIM
 
Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...
Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...
Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...
SOYEON KIM
 
DeepWalk: Online Learning of Social Representations
DeepWalk: Online Learning of Social RepresentationsDeepWalk: Online Learning of Social Representations
DeepWalk: Online Learning of Social Representations
SOYEON KIM
 
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Convolutional Neural Networks on Graphs with Fast Localized Spectral FilteringConvolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
SOYEON KIM
 
Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search
Visual-Textual Joint Relevance Learning for Tag-Based Social Image SearchVisual-Textual Joint Relevance Learning for Tag-Based Social Image Search
Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search
SOYEON KIM
 
Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...
Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...
Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...
SOYEON KIM
 
A survey of heterogeneous information network analysis
A survey of heterogeneous information network analysisA survey of heterogeneous information network analysis
A survey of heterogeneous information network analysis
SOYEON KIM
 
Translated learning
Translated learningTranslated learning
Translated learning
SOYEON KIM
 
Self taught clustering
Self taught clusteringSelf taught clustering
Self taught clustering
SOYEON KIM
 
Semi-automatic ground truth generation using unsupervised clustering and limi...
Semi-automatic ground truth generation using unsupervised clustering and limi...Semi-automatic ground truth generation using unsupervised clustering and limi...
Semi-automatic ground truth generation using unsupervised clustering and limi...
SOYEON KIM
 
Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...
Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...
Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...
SOYEON KIM
 
Text extraction from natural scene image, a survey
Text extraction from natural scene image, a surveyText extraction from natural scene image, a survey
Text extraction from natural scene image, a survey
SOYEON KIM
 
Opinion Fraud Detection in Online Reviews by Network Effects
Opinion Fraud Detection in Online Reviews by Network EffectsOpinion Fraud Detection in Online Reviews by Network Effects
Opinion Fraud Detection in Online Reviews by Network Effects
SOYEON KIM
 
Evaluating color descriptors for object and scene recognition
Evaluating color descriptors for object and scene recognitionEvaluating color descriptors for object and scene recognition
Evaluating color descriptors for object and scene recognition
SOYEON KIM
 
Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...
SOYEON KIM
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clustering
SOYEON KIM
 
Sentiwordnet: A publicly available lexical resource for opinion mining
Sentiwordnet: A publicly available lexical resource for opinion miningSentiwordnet: A publicly available lexical resource for opinion mining
Sentiwordnet: A publicly available lexical resource for opinion mining
SOYEON KIM
 

More from SOYEON KIM (20)

Network-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal dataNetwork-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal data
 
Revealing disease-associated pathways by network integration of untargeted me...
Revealing disease-associated pathways by network integration of untargeted me...Revealing disease-associated pathways by network integration of untargeted me...
Revealing disease-associated pathways by network integration of untargeted me...
 
Systems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traitsSystems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traits
 
Network embedding
Network embeddingNetwork embedding
Network embedding
 
Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...
Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...
Integrative Pathway-based Survival Prediction utilizing the Interaction betwe...
 
DeepWalk: Online Learning of Social Representations
DeepWalk: Online Learning of Social RepresentationsDeepWalk: Online Learning of Social Representations
DeepWalk: Online Learning of Social Representations
 
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Convolutional Neural Networks on Graphs with Fast Localized Spectral FilteringConvolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
 
Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search
Visual-Textual Joint Relevance Learning for Tag-Based Social Image SearchVisual-Textual Joint Relevance Learning for Tag-Based Social Image Search
Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search
 
Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...
Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...
Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated wi...
 
A survey of heterogeneous information network analysis
A survey of heterogeneous information network analysisA survey of heterogeneous information network analysis
A survey of heterogeneous information network analysis
 
Translated learning
Translated learningTranslated learning
Translated learning
 
Self taught clustering
Self taught clusteringSelf taught clustering
Self taught clustering
 
Semi-automatic ground truth generation using unsupervised clustering and limi...
Semi-automatic ground truth generation using unsupervised clustering and limi...Semi-automatic ground truth generation using unsupervised clustering and limi...
Semi-automatic ground truth generation using unsupervised clustering and limi...
 
Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...
Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...
Mobile Phone Spam Image Detection based on Graph Partitioning with Pyramid H...
 
Text extraction from natural scene image, a survey
Text extraction from natural scene image, a surveyText extraction from natural scene image, a survey
Text extraction from natural scene image, a survey
 
Opinion Fraud Detection in Online Reviews by Network Effects
Opinion Fraud Detection in Online Reviews by Network EffectsOpinion Fraud Detection in Online Reviews by Network Effects
Opinion Fraud Detection in Online Reviews by Network Effects
 
Evaluating color descriptors for object and scene recognition
Evaluating color descriptors for object and scene recognitionEvaluating color descriptors for object and scene recognition
Evaluating color descriptors for object and scene recognition
 
Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clustering
 
Sentiwordnet: A publicly available lexical resource for opinion mining
Sentiwordnet: A publicly available lexical resource for opinion miningSentiwordnet: A publicly available lexical resource for opinion mining
Sentiwordnet: A publicly available lexical resource for opinion mining
 

Deep learning based multi-omics integration, a survey

  • 1. Deep Learning based Multi-omics integration: A survey
  • 2. Deep Learning in Bioinformatics Min, Seonwoo, Byunghan Lee, and Sungroh Yoon. "Deep learning in bioinformatics." Briefings in Bioinformatics (2016)
  • 3. Outline • Summarize three related works on deep learning based feature extraction / survival prediction on omics data • Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders • A deep learning approach for cancer detection and relevant gene identification • Deep Learning based multi-omics integration robustly predicts survival in liver cancer
  • 4. Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders Pacific Symposium on Biocomputing, 2015
  • 5. Denoising Auto-Encoder (DAE) • Build features that reconstruct initial input data from corrupted data • Generate robust features • Unsupervised learning • Extract features in the non-linear space
  • 6. Data • The two largest breast cancer datasets • Train DAs and identify predictive features with the METABRIC dataset • 2137 samples, 3000 2520 genes • Gene expression data from the European Genome-phenome Archive • Evaluate independently with the TCGA dataset • 547 samples, 2520 genes
  • 7. Features to clinical characteristics • Genes are not linked to their neighbors • Genes are linked by transcription factors, pathway memberships • Are constructed features linked to clinical and molecular features of the samples? • Categorize tumor / normal samples • Categorize ER+/- samples • Categorize samples into molecular subtypes (Luminal A/B, Basal-like, HER2-enriched, Normal-like)
  • 8. Features to clinical characteristics • Classifying tumor from normal samples • Classifying ER+ from ER- samples • Robust performance across datasets
  • 9. Features to transcription factor • Breast cancer related transcription factors are linked to these high-weight features (Node58) • It contained genes that reflect activity of key ER-associated TFs • Most genes gave zero or low weight to a hidden node • Figure labels: high positive weight, high negative weight
  • 10. Features to patient survival • Node whose activities best separated two high / low survival groups (Node5) • Highly predictive of patient survival
  • 11. Features to Biological pathways • Pathways significantly associated with genes that consistently gave high weights to a node • Figure: PID pathways enriched in Node5 (5th feature)
  • 12. Summary • Unsupervised feature construction based on DAEs and interpretation • Applied to breast cancer gene expression data • Consistent results across different datasets • In the future: • Multiple layers of stacked DAEs • Consistency across datasets will be useful for data integration • Limitations for large-scale data integration
  • 13. A deep learning approach for cancer detection and relevant gene identification Pacific Symposium on Biocomputing, 2016
  • 15. Stacked Denoising Auto-Encoder • Extract functional features from high-dimensional, noisy gene expression profiles with reduced loss of information • Select a layer that has both low dimension and low validation error
  • 16. Classification result • Classify cancer samples from healthy control samples • Feature extraction • SDAE • Differentially expressed genes (DIFFEXP) • PCA • KPCA (RBF kernel) • Classification model • SVM • SVM (RBF kernel) • single-layer ANN
  • 17. Deeply connected genes • Genes with the largest weights in W (the product of the weight matrices for each layer) are the most strongly connected to the extracted and highly predictive features • But lower performance than SDAE features
  • 18. Summary • SDAE transforms high-dimensional, noisy gene expression data into a lower-dimensional, meaningful representation • Classify breast cancer samples from healthy control samples using the new compact features • Identify a set of highly interactive genes critical for the diagnosis of breast cancer • In the future: • Need to improve the extraction of DCGs • Limitation on the requirement for large data sets • Identify cross-cancer biomarkers through the analysis of aggregated heterogeneous cancer data
  • 19. Deep Learning based multi-omics integration robustly predicts survival in liver cancer preprint, 2017
  • 20. Workflow (figure labels): 360 tumor samples • 15629 genes • 365 miRNAs • 19883 genes • 100 features • 37 features • high/poor survival
  • 21. Why Autoencoders? • Produce features linked to clinical outcomes • Analyze high-dimensional gene expression data • Integrate heterogeneous data • Interpret the biological functions (aggregate genes sharing similar pathways)
  • 24. Validation in five cohorts • Robustness of the model at predicting survival outcomes
  • 25. Adding clinical information • Age, Stage, Grade, Race, Risk factors (HBV, HCV, Alcohol, …) • DL-based multi-omics model performs sufficiently well even without clinical features
  • 26. Functional analysis of the survival-subgroups • KEGG pathway analysis to pinpoint the pathways enriched in two subtypes • Two subtypes have different and disjoint active pathways
  • 27. Enriched pathway-gene analysis for upregulated genes • S1 aggressive tumor sub-group • Enriched with cancer related pathways
  • 28. Enriched pathway-gene analysis for upregulated genes • S2 less aggressive tumor sub-group • Activated metabolism related pathways
  • 29. Summary • Contributions • Identified two subtypes at the molecular level • Consistent performance, implying the reliability and robustness of the model • Sufficient performance without adding clinical features • The AE is much more efficient at inferring features linked to survival • Validated in five additional cohorts • Challenges • The absence of cluster label information in original reports • Lack of survival data in some cases
  • 30. Conclusion • Feature extraction with SDAE • Robust to noisy datasets • Extract meaningful features and reflect both linear and non-linear relationships • Consistent performance, good for multi-omics integration • Multi-omics integration • More sophisticated strategy to combine multiple features • May incorporate pathways, handle overlapping genes
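
The "deeply connected genes" idea from slide 17 (rank genes by their weights in W, the product of the per-layer weight matrices) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the gene names and weight values are made-up toy data, the helper names (`matmul`, `chain_product`, `top_genes`) are hypothetical, and ranking by the largest absolute weight per gene is an assumption, since the slides only say "largest weights".

```python
# Toy sketch: collapse stacked-autoencoder weight matrices into one
# genes x features matrix W and rank genes by their largest weight.
# All numbers and names below are hypothetical illustration data.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def chain_product(layers):
    """Multiply the per-layer weight matrices in order (input layer first)."""
    w = layers[0]
    for layer in layers[1:]:
        w = matmul(w, layer)
    return w

def top_genes(w, gene_names, k):
    """Return the k genes with the largest absolute weight in any feature.

    Using the absolute value is an assumption made for this sketch.
    """
    scores = {g: max(abs(v) for v in row) for g, row in zip(gene_names, w)}
    return sorted(scores, key=scores.get, reverse=True)[:k]

genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
w1 = [[0.9, 0.0, 0.1],    # 4 genes -> 3 hidden units
      [0.0, 0.1, 0.0],
      [0.2, 0.8, 0.0],
      [0.0, 0.0, 0.05]]
w2 = [[1.0, 0.0],         # 3 hidden units -> 2 final features
      [0.0, -1.0],
      [0.5, 0.5]]

w = chain_product([w1, w2])
print(top_genes(w, genes, 2))
```

Note that multiplying the weight matrices ignores the non-linear activations between layers, so W is only a linear summary of how strongly each gene feeds into the final features.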