Machine Learning and Network Analysis Approaches for
Predicting Clinically Relevant Outcomes
Francisco Azuaje, PhD
Head of Bioinformatics
Luxembourg Institute of Health (LIH)
304 citations (Source: Google Scholar)
Graphics by M. Fraiture
Mission
To enable patient-oriented research and biological
understanding through advanced computational approaches
Bioinformatics research and support @ LIH
F. Azuaje
(PI)
P. Nazarov
(Scientist)
T. Kaoma
(Bioinformatician)
S.Y Kim
(Bioinformatician)
A. Muller
(Bioinformatician)
K. Baum
(Postdoc. Fellow, part-time)
Y. Zhang
(PhD. Candidate)
Members:
(April 2019)
+ MSc research
students
Our research activities
Questions
• Diagnostic
• Prognostic
• Predictive (drug response)
• Other descriptive/modeling
Data
• Multiple sources/technologies
• Multi-omics
• Clinically relevant
• Cells, animals, patients
Approaches
• Statistical models
• Machine learning
• Network-based models
• Their combinations
Outcomes
• Biological understanding
• Candidate biomarkers, drugs and targets
• Software, workflows
Collaborations
National and international
Leading and non-leading partner
Funding targets
FNR
EU
Our collaborations
External
Focus on Luxembourg
(Topol, 2014, Cell)
(Eisenstein, 2015, Nature)
Biomedical research: larger and diverse datasets
High inter-individual variability
Datasets change in time and space
High intra-individual variability
Key challenges in the field
Heterogeneity: Data, events, states,
within and between individuals…
Data not always “big”: relative lack of
labelled data, curse of dimensionality
Data: multi-layered, hierarchical
For same data type/layer: multiple
measurement platforms
Shared, key challenges in the field (2)
Interpretability, understandability:
Global and local, novelty and consistency
with prior knowledge
Reproducibility:
Crucial requirement
“Gold standards”/“ground truth”:
Lack, limitations
Complexity of pattern recurrence,
regularities
Addressing key challenges through combination of ML and
biological network models
Why networks?
• Networks are intuitive and biologically-meaningful representations of
biological data
• Networks can be used to encode and visualize data, and more
importantly: to extract features and make predictions about the data
• Network-based models can address different predictive modelling
challenges, including: multi-modal/-layered data analysis applications
and interpretable models
A biological network can be represented as a graph that is
biologically meaningful
From: McGillivray et al., 2018, Annu. Rev.
Biomed. Data Sci.
Using biological networks and machine learning for multi-omics
patient stratification
Hypothesis: information encoded in graphs is biologically relevant.
Protein-protein network
Jeong et al., Nature (2001)
Patient similarity network
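A patient similarity network of the kind named above can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how similarity was measured or how edges were chosen, so the Pearson correlation, the 0.8 threshold, the two centrality measures and the toy expression values below are all hypothetical choices.

```python
from collections import deque
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def similarity_network(profiles, threshold=0.8):
    """Connect two patients when their profiles correlate above the threshold."""
    ids = list(profiles)
    graph = {p: set() for p in ids}
    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            if pearson(profiles[p], profiles[q]) > threshold:
                graph[p].add(q)
                graph[q].add(p)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) if n > 1 else 0.0 for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the sum of BFS shortest-path distances."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Toy expression profiles (made-up numbers, one list of gene values per patient).
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [1.1, 2.1, 2.9, 4.2],
    "P3": [4.0, 3.0, 2.0, 1.0],
    "P4": [3.9, 3.2, 1.8, 1.1],
}
graph = similarity_network(profiles)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
# One feature vector per patient, ready to feed a classifier.
features = {p: (degree[p], closeness[p]) for p in graph}
```

Each patient ends up with a small vector of topological features, which is the form of input the following slides use for classification.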
Using biological networks and machine learning for multi-omics
patient stratification (cont.)
Global strategy
Examples of centrality features
• 4 categories of topological features: Centrality (12 measures), modularity
features (from 7 to 153 features), diffusion features (1000), Node2Vec-
derived features (256).
• Each category generates a model
• Integrated models (weighted voting) also investigated
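The weighted-voting integration mentioned in the last bullet can be illustrated as follows. This is a minimal sketch, not the scheme used in the study: the slides do not say how the weights were set, so the per-model weights here (imagined as validation scores) and the tie-breaking rule are assumptions.

```python
def weighted_vote(predictions, weights):
    """Combine class labels from several models by summing each model's weight per label."""
    totals = {}
    for model, label in predictions.items():
        totals[label] = totals.get(label, 0.0) + weights[model]
    # Highest total wins; ties go to the smallest label so the result is deterministic.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical predictions for one patient from the four feature-category models.
predictions = {"centrality": 1, "modularity": 0, "diffusion": 1, "node2vec": 0}
# Hypothetical weights, e.g. each model's score on a validation set.
weights = {"centrality": 0.80, "modularity": 0.65, "diffusion": 0.75, "node2vec": 0.60}

combined = weighted_vote(predictions, weights)
```

Here class 1 collects a total weight of 1.55 against 1.25 for class 0, so the integrated model predicts class 1 even though the raw vote is split two against two.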
Application example (1): neuroblastoma multi-omic datasets
from the CAMDA challenge
Dataset 1 (498 patients,
2 omic datasets)
Dataset 2 (142 patients,
3 omic datasets)
Focus on Data 1
6,300 classification models
• Models based on graph topology features outperform models based on “classical” approach
• Among topological features, centrality metrics are most predictive (followed by diffusion-based features)
Application example (2): Neuroblastoma multi-omics datasets
from the CAMDA challenge, a deep learning approach*
Global strategy
• Network features from each dataset: centrality (12) and modularity (30 to 47) features.
• Models based on each feature category, and their combination
• Data: 498 patients (2 omic datasets, gene expression data)
• Training (50% of total data), validation and test datasets
• DNNs: multiple architectures, Rectified Linear Units (ReLU), Softmax function (2 outputs)

Prediction performance on test dataset (top models)
Algorithm   Parameters                              Balanced accuracy
Death from disease, Fischer-M
DNN         h=[8,8,8,2], o=Adam, lr=1e-3, d=0.3     87.3% *
SVM         t=RBF, c=64, g=0.25                     75.4%
RF          n=100                                   75.1% *
Disease progression, Fischer
DNN         h=[4,2,2,2], o=Adam, lr=1e-3, d=0.3     84.7% *
SVM         t=RBF, c=16, g=0.0625                   81.8%
RF          n=100                                   78.1% *
Top DNN: Input features are graph centrality measures
Fischer-M: 1 dataset only (microarrays)
Fischer: Combination of 2 datasets (microarrays and RNA-Seq)
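The results on this slide are reported as balanced accuracy, which is the mean of the per-class recalls and is therefore less flattering than plain accuracy when one outcome is rare. A minimal sketch of the metric, using made-up labels rather than study data:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy example: 8 patients in class 0 and 2 in class 1 (made-up labels).
y_true = [0] * 8 + [1] * 2
# The classifier gets every class-0 patient right but misses one class-1 patient.
y_pred = [0] * 8 + [0, 1]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
```

Plain accuracy is 0.9 in this toy case, while balanced accuracy is (1.0 + 0.5) / 2 = 0.75, because the missed minority-class patient counts for half of that class's recall.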
* Article submitted in
cooperation with:
Global strategy
• Additional Independent dataset (Versteeg, 88 patients,
microarray dataset)
• Network centrality features
• 3000 DNNs / classification task
• DNNs: Rectified Linear Units (ReLU), Softmax function (2
outputs)
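The network shape described in these bullets (ReLU units feeding a two-output softmax) can be sketched as a plain forward pass. Several things here are assumptions: the layer sizes [8, 8, 8, 2] are one reading of the "h=[8,8,8,2]" parameter string in the results table, the 12 inputs stand for the 12 centrality measures, and the weights are random. Training with Adam and dropout is deliberately left out.

```python
import math
import random


def relu(v):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, x) for x in v]


def softmax(v):
    """Numerically stable softmax: outputs are positive and sum to 1."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]


def dense(v, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, biases)]


def init_network(n_inputs, layer_sizes, seed=0):
    """Random untrained weights, only to make the forward pass runnable."""
    rng = random.Random(seed)
    layers, fan_in = [], n_inputs
    for size in layer_sizes:
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(fan_in)] for _ in range(size)]
        layers.append((weights, [0.0] * size))
        fan_in = size
    return layers


def forward(layers, x):
    """ReLU after every layer except the last, which goes through softmax."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


# Hypothetical patient described by 12 centrality values (made-up numbers).
network = init_network(n_inputs=12, layer_sizes=[8, 8, 8, 2])
probs = forward(network, [0.1 * i for i in range(12)])
```

The two softmax outputs can be read as class probabilities for the binary outcome, for example death from disease versus survival.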
Death from disease, centralities
Train       Test        DNN     SVM     RF
Fischer-M   Fischer-M   87.3%   75.4%   75.1%
Fischer-M   Fischer-R   82.1%   53.5%   66.8%
Fischer-M   Versteeg    75.0%   53.3%   67.5%
Fischer-R   Fischer-R   85.8%   66.0%   62.4%
Fischer-R   Fischer-M   81.5%   75.4%   61.2%
Fischer-R   Versteeg    70.8%   68.3%   67.5%
Further evaluation using independent datasets
Deep neural nets using graph centrality-based
input features offer best prediction performance
* Article submitted in
cooperation with:
Example 2: Linking gene network centrality to anti-cancer drug response
• Biological relevance of central genes/proteins previously determined in several model organisms and phenotypes.
• Their predictive capability in gene co-expression networks, in the specific context of cancer-related drug response, remains to be
investigated in depth.
Hubs in a pan-cancer cell line co-expression
network are biologically meaningful and
predictive of drug responses
• A (linear) model based on the expression
of 47 hubs shows accurate drug sensitivity
prediction capability (CCLE and GDSC
datasets)
• Independent of expression platform
technology (microarrays, RNA-Seq, qPCR)
• Comparable performance to published
models
• Relatively accurate predictions in other
independent cell lines and drugs
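The hub-based idea above can be illustrated with a small sketch. It rests on assumptions the slides do not confirm: hubs are taken to be the genes with the highest degree in a co-expression network built by thresholding absolute Pearson correlation at 0.7, and the expression values, coefficients and intercept are invented for illustration. In practice the coefficients would be fitted on training data such as CCLE or GDSC.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_degrees(expression, threshold=0.7):
    """For each gene, count the genes whose |correlation| with it exceeds the threshold."""
    genes = list(expression)
    degree = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expression[g], expression[h])) > threshold:
                degree[g] += 1
                degree[h] += 1
    return degree


def top_hubs(degree, k):
    """The k highest-degree genes; ties are broken alphabetically."""
    return sorted(degree, key=lambda g: (-degree[g], g))[:k]


def predict_sensitivity(sample, hubs, coefficients, intercept):
    """Linear model restricted to hub genes: intercept + sum of coefficient * expression."""
    return intercept + sum(coefficients[g] * sample[g] for g in hubs)


# Toy expression matrix: gene -> values across five cell lines (made-up numbers).
expression = {
    "G1": [1, 2, 3, 4, 5],
    "G2": [2, 4, 6, 8, 10],
    "G3": [5, 4, 3, 2, 1],
    "G4": [3, 1, 4, 1, 5],
    "G5": [2, 7, 1, 8, 2],
}
degree = coexpression_degrees(expression)
hubs = top_hubs(degree, k=3)

# Hypothetical coefficients; a real model would estimate them by regression.
coefficients = {"G1": 0.5, "G2": -0.25, "G3": 0.1}
new_cell_line = {"G1": 1.0, "G2": 2.0, "G3": 5.0}
predicted = predict_sensitivity(new_cell_line, hubs, coefficients, intercept=1.0)
```

Restricting the model to a short list of hubs keeps it small and easy to inspect, which is consistent with the platform-independence point made on the slide, although the sketch itself does not demonstrate that property.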
Linking gene centrality to anti-cancer drug response (cont.)
Predicted vs. actual drug
sensitivity in the CCLE dataset
Expression of autophagy-related genes accurately predicts anti-cancer drug response.
Example 3: Biological pathway-focused prediction of drug sensitivity
Tests on a leukemia patient dataset
Prediction accuracy in GDSC dataset
Patients treated with Cytarabine (Data from Farge et al., Cancer Disc 2017)
Article in
preparation in
cooperation with:
Takeaways:
• Many ML challenges in biomedical research are shared with other application domains, but
this field also poses unique challenges of its own.
• Supervised learning, including deep learning, will meet many of these needs;
however, unbiased exploration, hypothesis generation and interpretation (incl.
“mechanistic”) remain crucial.
• The use of graphs/networks to represent data, extract predictive features and
integrate datasets together with ML will continue enabling new discoveries and
applications closer to the patient.
Thanks to:
Funding from:
Bioinformatics team Our research partners in Luxembourg and abroad

NTU-2019

  • 1. Machine Learning and Network Analysis Approaches for Predicting Clinically Relevant Outcomes. Francisco Azuaje, PhD, Head of Bioinformatics, Luxembourg Institute of Health (LIH)
  • 2. 304 citations (Source: Google Scholar)
  • 3. Graphics by M. Fraiture
  • 4. Bioinformatics research and support @ LIH. Mission: to enable patient-oriented research and biological understanding through advanced computational approaches. Members (April 2019): F. Azuaje (PI), P. Nazarov (Scientist), T. Kaoma (Bioinformatician), S.Y. Kim (Bioinformatician), A. Muller (Bioinformatician), K. Baum (Postdoc. Fellow, part-time), Y. Zhang (PhD Candidate), plus MSc research students
  • 5. Our research activities. Questions: diagnostic; prognostic; predictive (drug response); other descriptive/modeling. Data: multiple sources/technologies; multi-omics; clinically relevant; cells, animals, patients. Approaches: statistical models; machine learning; network-based models; their combinations. Outcomes: biological understanding; candidate biomarkers, drugs and targets; software, workflows. Collaborations: national and international; leading and non-leading partner. Funding targets: FNR, EU
  • 7. Biomedical research: larger and diverse datasets (Topol, 2014, Cell; Eisenstein, 2015, Nature). Datasets change in time and space. High inter-individual variability. High intra-individual variability
  • 8. Key challenges in the field Heterogeneity: Data, events, states, within and between individuals… Data not always “big”: relative lack of labelled data, curse of dimensionality Data: multi-layered, hierarchical For same data type/layer: multiple measurement platforms
  • 9. Shared, key challenges in the field (2). Interpretability, understandability: global and local, novelty and consistency with prior knowledge. Reproducibility: crucial requirement. “Gold standards”/“ground truth”: lack, limitations. Complexity of pattern recurrence, regularities
  • 10. Addressing key challenges through combination of ML and biological network models Why networks? • Networks are intuitive and biologically-meaningful representations of biological data • Networks can be used to encode and visualize data, and more importantly: to extract features and make predictions about the data • Network-based models can address different predictive modelling challenges, including: multi-modal/-layered data analysis applications and interpretable models
  • 11. A biological network can be represented as a graph that is biologically meaningful From: McGillivray et al., 2018, Annu. Rev. Biomed. Data Sci.
  • 13. Using biological networks and machine learning for multi-omics patient stratification Hypothesis: information encoded in graphs is biologically relevant. Protein-protein network Jeong et al., Nature (2001) Patient similarity network
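The idea of a patient similarity network on slide 13 can be illustrated with a minimal sketch. The slides do not state how the network is built, so everything here is an assumption made for illustration: Pearson correlation as the similarity measure, the 0.8 threshold, the function names, and the toy profiles are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / (sx * sy)


def patient_similarity_network(profiles, threshold=0.8):
    """Link two patients when their profiles correlate at or above threshold.

    Returns an undirected graph as a dict: patient id -> set of neighbours.
    Similarity measure and threshold are illustrative assumptions.
    """
    ids = sorted(profiles)
    graph = {pid: set() for pid in ids}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph


# Made-up expression profiles: one list of gene values per patient.
toy_profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [1.1, 2.1, 2.9, 4.2],
    "P3": [4.0, 3.0, 2.0, 1.0],
    "P4": [3.9, 3.2, 1.8, 1.1],
}
network = patient_similarity_network(toy_profiles, threshold=0.8)
```

In this toy case the graph splits into two pairs of similar patients (P1 with P2, P3 with P4), which is the kind of structure that topological features are then computed on.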
  • 14. Using biological networks and machine learning for multi-omics patient stratification (cont.) Global strategy. Examples of centrality features. • 4 categories of topological features: centrality (12 measures), modularity features (from 7 to 153 features), diffusion features (1000), Node2Vec-derived features (256). • Each category generates a model • Integrated models (weighted voting) also investigated
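As a rough illustration of the centrality category on slide 14, the sketch below computes two standard measures, degree and closeness centrality, for each node of a small graph and stacks them into a per-node feature vector. The slides list 12 centrality measures without naming them, so the choice of these two, the simple closeness variant used, and the toy graph are assumptions.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def bfs_distances(graph, source):
    """Shortest-path (hop) distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of their distances (0 if isolated)."""
    scores = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        scores[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical toy graph: A is a hub, E hangs off D.
toy_graph = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}
degree = degree_centrality(toy_graph)
closeness = closeness_centrality(toy_graph)

# One feature vector per node; in a patient similarity network each node
# would be a patient and the vector would feed a classifier.
features = {node: [degree[node], closeness[node]] for node in toy_graph}
```

Here the hub A gets the highest value on both measures, so the feature vectors carry information about each node's position in the graph rather than its raw measurements.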
  • 15. Application example (1): neuroblastoma multi-omic datasets from the CAMDA challenge Dataset 1 (498 patients, 2 omic datasets) Dataset 2 (142 patients, 3 omic datasets) Focus on Data 1 6,300 classification models • Models based on graph topology features outperform models based on “classical” approach • Among topological features, centrality metrics are most predictive (followed by diffusion-based features)
  • 16. Application example (2): Neuroblastoma multi-omics datasets from the CAMDA challenge, a deep learning approach*
    Global strategy:
    • Network features from each dataset: centrality (12) and modularity (30 to 47) features.
    • Models based on each feature category, and their combination
    • Data: 498 patients (2 omic datasets, gene expression data)
    • Training (50% of total data), validation and test datasets
    • DNNs: multiple architectures, Rectified Linear Units (ReLU), Softmax function (2 outputs)
    Prediction performance on test dataset (top models):
    Task, dataset                   Algorithm   Parameters                             Balanced accuracy
    Death from disease, Fischer-M   DNN         h=[8,8,8,2], o=Adam, lr=1e-3, d=0.3    87.3% *
                                    SVM         t=RBF, c=64, g=0.25                    75.4%
                                    RF          n=100                                  75.1% *
    Disease progression, Fischer    DNN         h=[4,2,2,2], o=Adam, lr=1e-3, d=0.3    84.7% *
                                    SVM         t=RBF, c=16, g=0.0625                  81.8%
                                    RF          n=100                                  78.1% *
    Top DNN: input features are graph centrality measures.
    Fischer-M: 1 dataset only (microarrays). Fischer: combination of 2 datasets (microarrays and RNA-Seq).
    * Article submitted in cooperation with:
  • 17. Further evaluation using independent datasets
    Global strategy:
    • Additional independent dataset (Versteeg, 88 patients, microarray dataset)
    • Network centrality features
    • 3000 DNNs / classification task
    • DNNs: Rectified Linear Units (ReLU), Softmax function (2 outputs)
    Death from disease, centralities:
    Train       Test        DNN     SVM     RF
    Fischer-M   Fischer-M   87.3%   75.4%   75.1%
                Fischer-R   82.1%   53.5%   66.8%
                Versteeg    75.0%   53.3%   67.5%
    Fischer-R   Fischer-R   85.8%   66.0%   62.4%
                Fischer-M   81.5%   75.4%   61.2%
                Versteeg    70.8%   68.3%   67.5%
    Deep neural nets using graph centrality-based input features offer best prediction performance.
    * Article submitted in cooperation with:
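The results on slide 16 are reported as balanced accuracy, which is the mean of the per-class recalls and is therefore less flattering than plain accuracy when one outcome is rare. A minimal sketch of the metric follows; the function name and the toy labels are made up for illustration and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of the samples that truly belong to this class.
        idx = [i for i, label in enumerate(y_true) if label == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Plain fraction of correct predictions, for comparison."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, imbalanced example: 2 positive cases out of 10, one of them missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain = accuracy(y_true, y_pred)              # 9 of 10 correct
balanced = balanced_accuracy(y_true, y_pred)  # mean of recalls 0.5 and 1.0
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, because missing half of the rare class is weighted as heavily as the whole majority class.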
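The kind of network described on slide 17 (ReLU hidden units, a two-output softmax, centrality features as input) can be sketched as a plain forward pass. This is only an illustration under assumptions: the layer sizes [12, 8, 8, 8, 2] read the 12 centrality measures as inputs and pick hidden widths for the example, the weights are random rather than trained, and training details mentioned in the deck (Adam, dropout) are left out.

```python
import math
import random


def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Numerically stable softmax: outputs are positive and sum to 1."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layers(sizes, seed=0):
    """Random (untrained) weights for consecutive layer sizes; illustrative only."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(features, layers):
    """ReLU on every hidden layer, softmax on the final two-unit layer."""
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(hidden, weights, biases))


# Assumed shape: 12 centrality inputs, three hidden layers of 8, 2 outputs.
layers = init_layers([12, 8, 8, 8, 2])
toy_centralities = [0.1 * i for i in range(12)]  # made-up feature values
probs = forward(toy_centralities, layers)
```

The two values in `probs` can be read as class probabilities for a binary outcome such as death from disease; with untrained weights they carry no predictive meaning.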
  • 18. Example 2: Linking gene network centrality to anti-cancer drug response • Biological relevance of central genes/proteins previously determined in several model organisms and phenotypes. • Their predictive capability in gene co-expression networks in the specific context of cancer-related drug response remains to be deeply investigated. Hubs in a pan-cancer cell line co-expression network are biologically meaningful and predictive of drug responses
  • 19. Linking gene centrality to anti-cancer drug response (cont.) • A (linear) model based on the expression of 47 hubs shows accurate drug sensitivity prediction capability (CCLE and GDSC datasets) • Independent of expression platform technology (microarrays, RNA-Seq, qPCR) • Comparable performance to published models • Relatively accurate predictions in other independent cell lines and drugs. Figure: predicted vs. actual drug sensitivity in the CCLE dataset
  • 20. Example 3: Biological pathway-focused prediction of drug sensitivity. Expression of autophagy-related genes accurately predicts anti-cancer drug response. Prediction accuracy in GDSC dataset. Tests on a leukemia patient dataset: patients treated with Cytarabine (data from Farge et al., Cancer Disc 2017). Article in preparation in cooperation with:
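To make the "linear model on hub expression" idea from slide 19 concrete, here is a minimal ordinary least-squares sketch that predicts a drug sensitivity score from the expression of a few hub genes. The slides only say the model is linear; the fitting procedure (normal equations), the three hypothetical hub genes, and all numbers are assumptions for illustration, not the published model or data.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_linear(X, y):
    """Least-squares fit via the normal equations; returns [intercept, coefs...]."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, x):
    """Predicted sensitivity for one expression vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


# Made-up data: 6 cell lines x 3 hypothetical hub genes. The sensitivity
# values were generated as 2 + 1.0*g1 - 0.5*g2 + 0*g3, so the fit is exact.
hub_expression = [
    [1, 0, 2], [0, 1, 1], [2, 1, 0],
    [1, 2, 1], [3, 0, 1], [0, 3, 2],
]
sensitivity = [3.0, 1.5, 3.5, 2.0, 5.0, 0.5]

coef = fit_linear(hub_expression, sensitivity)
new_cell_line = predict(coef, [2, 2, 2])
```

With 47 hubs and real data, a regularized fit and a held-out evaluation would normally be needed; this sketch only shows the shape of the computation.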
  • 22. Takeaways: • Many ML challenges in BM research are shared by different application domains, but this field poses its own unique challenges. • Supervised learning, including, e.g., deep learning, will meet many of these needs; however, unbiased exploration, hypothesis generation and interpretation (incl. “mechanistic”) are crucial. • The use of graphs/networks to represent data, extract predictive features and integrate datasets, together with ML, will continue enabling new discoveries and applications closer to the patient.
  • 23. Thanks to: Funding from: Bioinformatics team Our research partners in Luxembourg and abroad