Machine learning for molecular biology and omics data analysis
Nathalie Vialaneix
nathalie.vialaneix@inrae.fr
http://www.nathalievialaneix.eu
Unit scientific day (Journée scientifique d’unité)
3 October 2022
This presentation covers topics of interest for...
Main topics related to INRAE scientific priorities
INRAE2030
OS5: “Mobilising data science and digital technologies to support transitions”
with various applications in OS 1.3, OS 2.2, OS 2.3, OS 3.3, OS 4.2 (adaptation of species, sustainable farming, biomass treatment, ...)
SSD MathNum
GOS 1: “Mastering methods to acquire, manage and integrate data and knowledge in the face of multiplying information sources”
in interaction with
GOS 2: “Developing modelling and analysis methods to understand and anticipate the trajectories of complex systems” (to be developed?)
A scientific activity focused on the molecular level
Main scientific questions
▶ Exploratory analysis: learn regulation from data, integrate multiple omics, ...
From expression data → to gene network (regulatory?)
Example: SubtilNet, SUNRISE (PIA), PROBITY (ANR)
Example: PANORAMICS (ANR), AgroEnv (PEPR)
▶ Predictive biology: biomarker discovery, phenotype prediction
From data → to phenotype
Example: Piglet survival, differential analysis of chromatin conformation (Hi-C)
pig picture from https://dessin.fun
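The step "from expression data to gene network" can be illustrated with a minimal sketch. It is not the method used in the projects listed above: it assumes a Gaussian graphical model view, in which two genes are linked only if their partial correlation (their correlation once the other genes are accounted for) is non-zero. The gene names and values are hypothetical toy data, constructed so that the result is exact.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

def partial_corr(x, y, z):
    """Correlation between x and y once the linear effect of z is removed."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy "expression" profiles over 8 samples (hypothetical values):
# gene_z drives both gene_x and gene_y, and the two noise terms are
# built to be exactly uncorrelated with each other and with gene_z.
gene_z = [1, 1, -1, -1, 1, 1, -1, -1]
noise_x = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
noise_y = [0.5, 0.5, 0.5, 0.5, -0.5, -0.5, -0.5, -0.5]
gene_x = [z + e for z, e in zip(gene_z, noise_x)]
gene_y = [z + e for z, e in zip(gene_z, noise_y)]

marginal = pearson(gene_x, gene_y)             # large: x and y share the driver z
direct = partial_corr(gene_x, gene_y, gene_z)  # close to 0: no direct x-y edge
```

Here the marginal correlation between gene_x and gene_y is high, yet the partial correlation vanishes, so a network built from partial correlations would keep the edges x-z and y-z and drop x-y. With thousands of genes and few samples, the same idea needs regularised estimation, which this sketch does not cover.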
Scientific challenges
▶ very large dimensionality and big data (both scaling and statistical issues)
n ∼ 5 to 1000, p ∼ 10^3 to 10^5
▶ missing values and incomplete designs
▶ highly non-Gaussian data: skewed distributions, count data, zero-inflated data, ...
▶ non-Euclidean data: compositional data (metagenomics), similarity matrices (Hi-C), spectra (metabolomics), ...
(*) image from [Dumuid et al., 2020]; (**) image courtesy of Gaëlle Lefort
... used to capture a weak genotype (or omics) / phenotype signal.
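To make the "non-Euclidean data" challenge concrete for compositional data, here is a minimal sketch of the centred log-ratio (CLR) transform, a standard tool in compositional data analysis. It is offered as an illustration, not as the approach taken in the work presented here. The counts and the `pseudocount` parameter are assumptions of the sketch.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of one compositional sample.

    `pseudocount` is an illustrative way to avoid log(0) on zero counts;
    it is an assumption of this sketch, not a recommendation.
    """
    shifted = [c + pseudocount for c in counts]
    logs = [math.log(v) for v in shifted]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [lv - mean_log for lv in logs]

# Two hypothetical samples with identical relative abundances but a
# tenfold difference in sequencing depth.
sample_a = clr([10, 30, 60], pseudocount=0)
sample_b = clr([100, 300, 600], pseudocount=0)

# A sample containing a zero count still transforms thanks to the pseudocount.
sample_with_zero = clr([0, 5, 5])
```

Both samples map to the same CLR coordinates, which sum to zero: only the ratios between components carry information, and this is why ordinary Euclidean distances on raw counts can be misleading for such data.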
Favorite methods
▶ Kernel methods
Example: Structure learning with kernel
▶ Graphical models
From expression data → to gene network (regulatory?)
Example: Structure learning with BN
▶ Neural networks
Examples: combining sequence and orthology information for automatic annotation, predicting RNA modifications from sequencing data
▶ Random Forest, ...
picture from Wikimedia Commons, Cburnett
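As a small illustration of why kernel methods suit the setting described earlier (few samples, many variables, possibly non-Euclidean data), the sketch below builds a Gaussian kernel matrix. Everything in it is an assumption made for illustration: the sample values, the function names and the `gamma` value are hypothetical, and the kernel choice is generic rather than the one used in the work presented.

```python
import math

def gaussian_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2); gamma is an arbitrary illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(samples, kernel=gaussian_kernel):
    """n x n matrix of pairwise similarities between samples."""
    return [[kernel(s, t) for t in samples] for s in samples]

# Three hypothetical samples described by two features each.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = kernel_matrix(samples)
```

The matrix `K` has one row and one column per sample, so its size depends on n and not on p. Any other valid similarity function can be passed as `kernel`, which is how such methods can be adapted to data that are not plain numeric vectors.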
Collaborators
References
Dumuid, D., Pedišić, Ž., Palarea-Albaladejo, J., Martín-Fernández, J. A., Hron, K., and Olds, T. (2020).
Compositional data analysis in time-use epidemiology: what, why, how.
International Journal of Environmental Research and Public Health, 17(7):2220.

Apprentissage pour la biologie moléculaire et l’analyse de données omiques

  • 1. Apprentissage pour la biologie moléculaire et l’analyse de données omiques Nathalie Vialaneix nathalie.vialaneix@inrae.fr http://www.nathalievialaneix.eu Journée scientifique d’unité 3 octobre 2022
  • 2. This presentation covers topics of interest for... Journée scientifique d’unité 3 octobre 2022 / Nathalie Vialaneix p. 2
  • 3. Main topics related to INRAE scientific priorities. INRAE2030 OS 5: “Harnessing data science and digital technologies to support transitions”, with various applications in OS 1.3, OS 2.2, OS 2.3, OS 3.3, OS 4.2 (adaptation of species, sustainable farming, biomass treatment, ...)
  • 4. Main topics related to INRAE scientific priorities. INRAE2030 OS 5: “Harnessing data science and digital technologies to support transitions”, with various applications in OS 1.3, OS 2.2, OS 2.3, OS 3.3, OS 4.2 (adaptation of species, sustainable farming, biomass treatment, ...). SSD MathNum GOS 1: “Mastering methods for acquiring, managing and integrating data and knowledge as information sources multiply”, in interaction with GOS 2: “Developing modelling and analysis methods to understand and anticipate the trajectories of complex systems” (to be developed?)
  • 5. A scientific activity focused on the molecular level
  • 6. Main scientific questions ▶ Exploratory analysis: learn regulation from data, integrate multiple omics, ... From expression data → to gene network (regulatory?) Example: SubtilNet, SUNRISE (PIA), PROBITY (ANR)
  • 7. Main scientific questions ▶ Exploratory analysis: learn regulation from data, integrate multiple omics, ... Example: PANORAMICS (ANR), AgroEnv (PEPR)
  • 8. Main scientific questions ▶ Exploratory analysis: learn regulation from data, integrate multiple omics, ... ▶ Predictive biology: biomarker discovery, phenotype prediction. From data → to phenotype. Example: piglet survival, differential analysis of chromatin conformation (Hi-C) (pig picture from https://dessin.fun)
  • 9. Scientific challenges ▶ very large dimensionality and big data (both scaling and statistical issues): n ∼ 5–1000, p ∼ 10^3–10^5
  • 10. Scientific challenges ▶ very large dimensionality and big data (both scaling and statistical issues) ▶ missing values and incomplete designs
  • 11. Scientific challenges ▶ very large dimensionality and big data (both scaling and statistical issues) ▶ missing values and incomplete designs ▶ highly non-Gaussian data: skewed distributions, count data, zero-inflated data, ...
  • 12. Scientific challenges ▶ very large dimensionality and big data (both scaling and statistical issues) ▶ missing values and incomplete designs ▶ highly non-Gaussian data: skewed distributions, count data, zero-inflated data, ... ▶ non-Euclidean data: compositional data (metagenomics), similarity matrices (Hi-C), spectra (metabolomics), ... Images: (*) from [Dumuid et al., 2020]; (**) courtesy of Gaëlle Lefort
  • 13. Scientific challenges ▶ very large dimensionality and big data (both scaling and statistical issues) ▶ missing values and incomplete designs ▶ highly non-Gaussian data: skewed distributions, count data, zero-inflated data, ... ▶ non-Euclidean data: compositional data (metagenomics), similarity matrices (Hi-C), spectra (metabolomics), ... ... used to capture a weak genotype (or omics) / phenotype signal.
  • 14. Favorite methods ▶ Kernel methods. Example: structure learning with kernels
  • 15. Favorite methods ▶ Kernel methods ▶ Graphical models. From expression data → to gene network (regulatory?) Example: structure learning with Bayesian networks (BN)
  • 16. Favorite methods ▶ Kernel methods ▶ Graphical models ▶ Neural networks. Example: combining sequence and orthology information for automatic annotation, predicting RNA modifications from sequencing data ▶ Random Forest, ... (picture from Wikimedia Commons, Cburnett)
  • 17. Collaborators
  • 19. References: Dumuid, D., Pedišić, Ž., Palarea-Albaladejo, J., Martín-Fernández, J. A., Hron, K., and Olds, T. (2020). Compositional data analysis in time-use epidemiology: what, why, how. International Journal of Environmental Research and Public Health, 17(7):2220.