Representation of metabolomic data with wavelets
Nathalie Villa-Vialaneix
http://www.nathalievilla.org
Toulouse School of Economics
Workgroup BioPuces, INRA de Castanet
June 5th, 2009
BioPuces (05/06/09) Nathalie Villa Metabolomic data 1 / 16
Outline
1 Database presentation
2 Wavelet representation
3 Perspective of work
Database presentation
Basics about the database
The database was provided by Alain Paris (INRA) and consists of
metabolomic recordings (1H NMR spectra) of mouse urine.
Each spectrum has 950 variables, from 0.505 ppm to 9.995 ppm.
Baseline has been removed and peaks have been aligned.
Purpose of the work
Study the effects of ingesting Hypochoeris radicata (HR) on the
metabolism: the inflorescences of this plant are known to be responsible
for a horse disease, Australian stringhalt.
As it is hard to obtain several dozen horses to sacrifice, the
experiments were conducted on 72 mice.
Description of the experiment
72 mice, split by:
• 2 sexes: 36 males, 36 females
• 3 HR doses: 0% (control): 24 mice; 3%: 24 mice; 9%: 24 mice
• 3 sacrifice dates: 8th day: 24 mice; 15th day: 24 mice; 21st day: 24 mice
⇒ 2 × 3 × 3 = 18 groups.
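The group count follows from crossing the three factors. A minimal sketch that enumerates the combinations; the level labels are taken from the slide, and the variable names are illustrative only:

```python
from itertools import product

# Factor levels as listed on the slide.
sexes = ["male", "female"]
doses = ["0% (control)", "3%", "9%"]
sacrifice_days = [8, 15, 21]

# Each combination of sex, dose and sacrifice day defines one group.
groups = list(product(sexes, doses, sacrifice_days))

print(len(groups))  # 2 * 3 * 3 = 18
```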
Measurement days
The urine was collected on the following days:
Day                   0   1   4   8  11  15  18  21
Nb of observations   68  68  68  66  46  44  19  18
For each mouse, between 2 and 22 measurements were made.
In total, 397 observations of 950 variables.
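The total can be checked from the per-day counts. A short sketch using only the numbers on the slide; the variable names are illustrative:

```python
# Collection days and number of observations per day, as listed on the slide.
days = [0, 1, 4, 8, 11, 15, 18, 21]
n_obs = [68, 68, 68, 66, 46, 44, 19, 18]

total = sum(n_obs)   # total number of spectra
n_variables = 950    # chemical-shift variables per spectrum

# Shape of the data matrix: one row per observation, one column per variable.
shape = (total, n_variables)
print(shape)  # (397, 950)
```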
Wavelet representation
Basic principle of wavelets
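For a given integer J, each spectrum f can be expressed at level J as:
$$
f(x) = \sum_{k} \alpha_k \, 2^{-J/2} \, \Psi\left(2^{-J} x - k\right) + \sum_{j=1}^{J} \sum_{k} \beta_{jk} \, 2^{-j/2} \, \Phi\left(2^{-j} x - k\right)
$$
• First sum: trend, based on the father wavelet Ψ
• Second sum: details at levels 1,...,J, based on the mother wavelet Φ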
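As an illustration of the trend-plus-details decomposition, here is a minimal sketch using the Haar wavelet. The Haar choice is an assumption made for simplicity, since the slides do not name the wavelet family, and the function names and toy signal are hypothetical. The returned trend plays the role of the α_k and the detail lists the role of the β_jk.

```python
import math

def haar_step(values):
    """One Haar analysis step: scaled pairwise sums (trend) and differences (details)."""
    s = math.sqrt(2.0)
    trend = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return trend, detail

def haar_decompose(signal, J):
    """Return (trend at level J, [details at level 1, ..., details at level J])."""
    trend, details = list(signal), []
    for _ in range(J):
        trend, d = haar_step(trend)
        details.append(d)
    return trend, details

def haar_reconstruct(trend, details):
    """Invert haar_decompose by undoing the steps from level J down to level 1."""
    s = math.sqrt(2.0)
    current = list(trend)
    for d in reversed(details):
        rebuilt = []
        for a, b in zip(current, d):
            rebuilt.extend([(a + b) / s, (a - b) / s])
        current = rebuilt
    return current

# Toy signal of dyadic length 8 (hypothetical values).
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
trend, details = haar_decompose(signal, J=2)
rebuilt = haar_reconstruct(trend, details)

print(len(trend), [len(d) for d in details])  # 2 [4, 2]
```

Trend and details together hold as many coefficients as the signal has points, and the reconstruction recovers the signal up to rounding.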
Hierarchical decomposition
We add 74 zero values at the end of each spectrum (950 + 74 = 1024 = 2^10) to
obtain a dyadic discrete sampling.
Original data: f observed at t_1, ..., t_1024, equally spaced
↓
Level 1: Trend, Details
↓
Level 2: Trend, Details
. . .
↓
Level 9: Trend, Details
⇒ At level 9 (the maximum level for a discrete sampling of length 1024), we
obtain 1025 coefficients.
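The padding step can be sketched as follows. The flat spectrum is a hypothetical stand-in for real data, and the per-level sizes assume a decimated transform that halves the trend at each level. The exact coefficient count depends on the wavelet family and boundary handling of the software used, which the slides do not specify, so the sketch only shows the halving pattern.

```python
n_variables = 950  # spectrum length stated on the slides

# Smallest power of two that can hold the spectrum.
padded_length = 1 << (n_variables - 1).bit_length()
n_zeros = padded_length - n_variables

def pad_with_zeros(spectrum, target_length):
    """Append zeros at the end of the spectrum up to target_length."""
    return list(spectrum) + [0.0] * (target_length - len(spectrum))

# Hypothetical flat spectrum, used only to show the padding.
spectrum = [1.0] * n_variables
padded = pad_with_zeros(spectrum, padded_length)

# Assumed decimated transform: the trend length is halved at each level.
sizes = {level: padded_length // 2 ** level for level in range(1, 10)}

print(padded_length, n_zeros)  # 1024 74
print(sizes[1], sizes[9])      # 512 2
```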
Examples
[Figure: example decomposition; panels: Trend, Details]
Denoising
For detail coefficients at levels greater than J (with J large enough), a
hard thresholding is applied:
$$
c^* = \begin{cases} 0 & \text{if } |c| < 2 \log 10 \, \hat{\sigma} \\ c & \text{if } |c| \geq 2 \log 10 \, \hat{\sigma} \end{cases}
$$
(Donoho and Johnstone)
Two parameters have to be tuned:
• Which wavelet should be used?
• Which J should be used?
The aim is a trade-off between the quality of the reconstruction (how close
are the functions rebuilt from the filtered coefficients to the original
ones?) and the number of non-zero coefficients.
This is done by minimizing an empirical (self-created) quality criterion:
$$
\frac{1}{n} \sum_{i} \frac{1}{D} \sum_{j} \left( f_i(t_j) - \hat{f}_i(t_j) \right)^2 + \frac{\text{Nb of non-zero coefficients}}{\text{Nb of coefficients}}
$$
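The thresholding rule and the criterion can be sketched as below, under several stated assumptions. The noise level is estimated by the median absolute coefficient divided by 0.6745, and the threshold is the Donoho and Johnstone universal threshold, sigma_hat * sqrt(2 log n); both are swapped in for illustration and may differ from the constant used on the slide. Function names and toy values are hypothetical.

```python
import math
import statistics

def estimate_sigma(detail_coefficients):
    """Robust noise estimate (assumption): median absolute value / 0.6745."""
    return statistics.median([abs(c) for c in detail_coefficients]) / 0.6745

def hard_threshold(coefficients, threshold):
    """Set to zero every coefficient whose absolute value is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coefficients]

def quality_criterion(originals, reconstructions, n_nonzero, n_coefficients):
    """Mean squared reconstruction error plus the fraction of coefficients kept."""
    n = len(originals)
    error = 0.0
    for f, f_hat in zip(originals, reconstructions):
        D = len(f)  # number of sampling points of this function
        error += sum((a - b) ** 2 for a, b in zip(f, f_hat)) / D
    return error / n + n_nonzero / n_coefficients

# Toy detail coefficients: two large values among small, noise-like ones.
details = [0.1, -0.2, 3.0, 0.05, -4.0, 0.15, 0.1, -0.05]
sigma_hat = estimate_sigma(details)
# Universal threshold, used here as an assumption.
threshold = sigma_hat * math.sqrt(2 * math.log(len(details)))
filtered = hard_threshold(details, threshold)

print(filtered)  # [0.0, 0.0, 3.0, 0.0, -4.0, 0.0, 0.0, 0.0]
print(quality_criterion([[1.0, 2.0]], [[1.0, 1.0]], n_nonzero=1, n_coefficients=4))  # 0.75
```

Tuning would then amount to evaluating the criterion for each candidate wavelet and each J and keeping the pair with the smallest value.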
Final reconstruction of the data
274 non-zero coefficients
Boxplots
[Figure: boxplots of the original coefficients]
[Figure: boxplots of the scaled coefficients (centered by the mean and divided by the standard deviation)]
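The scaling step can be sketched as a column-wise standardization. This is a minimal version that assumes one row per observation and one column per coefficient; the sample standard deviation is used, constant columns are mapped to zero by choice, and the function name and toy values are hypothetical.

```python
import statistics

def scale_columns(rows):
    """Center each column by its mean and divide by its sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in rows
    ]

# Toy coefficient matrix: 3 observations, 2 coefficients on very different scales.
coefficients = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = scale_columns(coefficients)

print(scaled)  # [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```

After scaling, coefficients that originally lived on very different scales become comparable, which is what the second set of boxplots shows.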
Perspective of work
Using random forests
The idea is to use random forests to make predictions and also to extract the
main coefficients that explain the target variables.
Proposed regression: the scaled coefficients will be the explanatory
variables. The variable of interest could be:
• the dose (either as a number or as a class, leading to a classification
problem);
• the total dose ingested (i.e., the dose multiplied by the number of
days of ingestion);
• any other interesting idea?
The idea is then to rebuild the individuals from the main coefficients (setting
the others to zero) to see which peaks differ from one group to the
others.
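The last two steps can be sketched without fitting a forest. The importance scores below are hypothetical and hard-coded; in practice they would come from a fitted random forest (for instance the `feature_importances_` attribute in scikit-learn). The function names and `n_keep` are illustrative assumptions.

```python
def total_dose(dose_percent, days_of_ingestion):
    """Candidate target variable: dose multiplied by the number of days of ingestion."""
    return dose_percent * days_of_ingestion

def keep_main_coefficients(coefficients, importances, n_keep):
    """Zero every coefficient except the n_keep with the largest importance."""
    ranked = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coefficients)]

# Hypothetical coefficients of one individual and hypothetical importance scores.
coefficients = [0.5, -1.2, 3.0, 0.1]
importances = [0.1, 0.4, 0.3, 0.2]

sparse = keep_main_coefficients(coefficients, importances, n_keep=2)
print(sparse)            # [0.0, -1.2, 3.0, 0.0]
print(total_dose(3, 8))  # 24
```

Applying the inverse wavelet transform to the sparse coefficient vector would then give the rebuilt spectrum, in which only the peaks carried by the main coefficients remain.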
BioPuces (05/06/09) Nathalie Villa Metabolomic data 16 / 16

More Related Content

Viewers also liked

Multiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing MapsMultiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing Mapstuxette
 
Mining co-expression network
Mining co-expression networkMining co-expression network
Mining co-expression networktuxette
 
Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"tuxette
 
Graph mining with kernel self-organizing map
Graph mining with kernel self-organizing mapGraph mining with kernel self-organizing map
Graph mining with kernel self-organizing maptuxette
 
Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional datatuxette
 
Several nonlinear models and methods for FDA
Several nonlinear models and methods for FDASeveral nonlinear models and methods for FDA
Several nonlinear models and methods for FDAtuxette
 
Large network analysis : visualization and clustering
Large network analysis : visualization and clusteringLarge network analysis : visualization and clustering
Large network analysis : visualization and clusteringtuxette
 
FDA and Statistical learning theory
FDA and Statistical learning theoryFDA and Statistical learning theory
FDA and Statistical learning theorytuxette
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learningtuxette
 
Topographic graph clustering with kernel and dissimilarity methods
Topographic graph clustering with kernel and dissimilarity methodsTopographic graph clustering with kernel and dissimilarity methods
Topographic graph clustering with kernel and dissimilarity methodstuxette
 

Viewers also liked (12)

Multiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing MapsMultiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing Maps
 
Mining co-expression network
Mining co-expression networkMining co-expression network
Mining co-expression network
 
Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"
 
Graph mining with kernel self-organizing map
Graph mining with kernel self-organizing mapGraph mining with kernel self-organizing map
Graph mining with kernel self-organizing map
 
Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional data
 
Several nonlinear models and methods for FDA
Several nonlinear models and methods for FDASeveral nonlinear models and methods for FDA
Several nonlinear models and methods for FDA
 
Large network analysis : visualization and clustering
Large network analysis : visualization and clusteringLarge network analysis : visualization and clustering
Large network analysis : visualization and clustering
 
FDA and Statistical learning theory
FDA and Statistical learning theoryFDA and Statistical learning theory
FDA and Statistical learning theory
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learning
 
Topographic graph clustering with kernel and dissimilarity methods
Topographic graph clustering with kernel and dissimilarity methodsTopographic graph clustering with kernel and dissimilarity methods
Topographic graph clustering with kernel and dissimilarity methods
 
Classroom arrangement
Classroom arrangementClassroom arrangement
Classroom arrangement
 
Classroom arrangement
Classroom arrangementClassroom arrangement
Classroom arrangement
 

Similar to Representation of metabolomic data with wavelets

Metabolomic data: combining wavelet representation with learning approaches
Metabolomic data: combining wavelet representation with learning approachesMetabolomic data: combining wavelet representation with learning approaches
Metabolomic data: combining wavelet representation with learning approachestuxette
 
Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...
Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...
Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...tuxette
 
GMI proficiency testing- Progress report 2016
GMI proficiency testing- Progress report 2016GMI proficiency testing- Progress report 2016
GMI proficiency testing- Progress report 2016ExternalEvents
 
Inference and informatics in a 'sequenced' world
Inference and informatics in a 'sequenced' worldInference and informatics in a 'sequenced' world
Inference and informatics in a 'sequenced' worldJoe Parker
 
When Biology Meets Computer Science
When Biology Meets Computer ScienceWhen Biology Meets Computer Science
When Biology Meets Computer ScienceGeeks Anonymes
 
Workflows supporting drug discovery against malaria
Workflows supporting drug discovery against malariaWorkflows supporting drug discovery against malaria
Workflows supporting drug discovery against malariaBarry Hardy
 
ESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction Challenge
ESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction ChallengeESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction Challenge
ESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction ChallengeFrancisco Zamora-Martinez
 
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...Claudia Vitolo
 
Open Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysisOpen Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysisAntica Culina
 
Open Data and Ecological and Evolutionary synthesis
Open Data and Ecological and Evolutionary synthesisOpen Data and Ecological and Evolutionary synthesis
Open Data and Ecological and Evolutionary synthesisAntica Culina
 
Computing Bayesian posterior with empirical likelihood in population genetics
Computing Bayesian posterior with empirical likelihood in population geneticsComputing Bayesian posterior with empirical likelihood in population genetics
Computing Bayesian posterior with empirical likelihood in population geneticsPierre Pudlo
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetstuxette
 
Hybrid bat-ant colony optimization algorithm for rule-based feature selection...
Hybrid bat-ant colony optimization algorithm for rule-based feature selection...Hybrid bat-ant colony optimization algorithm for rule-based feature selection...
Hybrid bat-ant colony optimization algorithm for rule-based feature selection...IJECEIAES
 
Internal examination 3rd semester disaster
Internal examination 3rd semester disasterInternal examination 3rd semester disaster
Internal examination 3rd semester disasterMahendra Poudel
 
American Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk UniversityAmerican Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk Universitymcdonadt
 
Integrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learningIntegrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learningtuxette
 
On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...
On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...
On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...Juan Luis Jiménez Laredo
 

Similar to Representation of metabolomic data with wavelets (20)

Metabolomic data: combining wavelet representation with learning approaches
Metabolomic data: combining wavelet representation with learning approachesMetabolomic data: combining wavelet representation with learning approaches
Metabolomic data: combining wavelet representation with learning approaches
 
Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...
Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...
Étude du pathobiome respiratoire chez les jeunes bovins atteints de bronchopn...
 
GMI proficiency testing- Progress report 2016
GMI proficiency testing- Progress report 2016GMI proficiency testing- Progress report 2016
GMI proficiency testing- Progress report 2016
 
Inference and informatics in a 'sequenced' world
Inference and informatics in a 'sequenced' worldInference and informatics in a 'sequenced' world
Inference and informatics in a 'sequenced' world
 
When Biology Meets Computer Science
When Biology Meets Computer ScienceWhen Biology Meets Computer Science
When Biology Meets Computer Science
 
Workflows supporting drug discovery against malaria
Workflows supporting drug discovery against malariaWorkflows supporting drug discovery against malaria
Workflows supporting drug discovery against malaria
 
ESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction Challenge
ESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction ChallengeESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction Challenge
ESAI-CEU-UCH solution for American Epilepsy Society Seizure Prediction Challenge
 
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
 
Open Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysisOpen Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysis
 
Open Data and Ecological and Evolutionary synthesis
Open Data and Ecological and Evolutionary synthesisOpen Data and Ecological and Evolutionary synthesis
Open Data and Ecological and Evolutionary synthesis
 
Computing Bayesian posterior with empirical likelihood in population genetics
Computing Bayesian posterior with empirical likelihood in population geneticsComputing Bayesian posterior with empirical likelihood in population genetics
Computing Bayesian posterior with empirical likelihood in population genetics
 
ERVA-NMR
ERVA-NMRERVA-NMR
ERVA-NMR
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasets
 
Hybrid bat-ant colony optimization algorithm for rule-based feature selection...
Hybrid bat-ant colony optimization algorithm for rule-based feature selection...Hybrid bat-ant colony optimization algorithm for rule-based feature selection...
Hybrid bat-ant colony optimization algorithm for rule-based feature selection...
 
Internal examination 3rd semester disaster
Internal examination 3rd semester disasterInternal examination 3rd semester disaster
Internal examination 3rd semester disaster
 
C04821220
C04821220C04821220
C04821220
 
American Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk UniversityAmerican Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk University
 
Integrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learningIntegrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learning
 
On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...
On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...
On the Modeling of the Three Types of Non-Spiking Neurons of the Caenorhabdit...
 
LiveSense
LiveSenseLiveSense
LiveSense
 

More from tuxette

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathstuxette
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquestuxette
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-Ctuxette
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?tuxette
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquestuxette
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeantuxette
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation datatuxette
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?tuxette
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysistuxette
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricestuxette
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Predictiontuxette
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random foresttuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 

More from tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
Representation of metabolomic data with wavelets

  • 1. Representation of metabolomic data with wavelets Nathalie Villa-Vialaneix http://www.nathalievilla.org Toulouse School of Economics Workgroup BioPuces, INRA de Castanet June 5th, 2009 BioPuces (05/06/09) Nathalie Villa Metabolomic data 1 / 16
  • 2. Outline: 1. Database presentation; 2. Wavelet representation; 3. Perspective of work
  • 6. Database presentation: Basics about the database. The database was provided by Alain Paris (INRA) and consists of metabolomic recordings (1H NMR spectra) of mouse urine. Each spectrum has 950 variables, from 0.505 ppm to 9.995 ppm. The baseline has been removed and the peaks have been aligned.
  • 8. Database presentation: Purpose of the work. Study the effects of ingesting Hypochoeris radicata (HR) on metabolism: the inflorescences of this plant are known to cause a horse disease, Australian stringhalt. Because it is hard to obtain several dozen horses to sacrifice, the experiments were conducted on 72 mice.
  • 12. Database presentation: Description of the experiment. 72 mice from: 2 sexes (36 males, 36 females); 3 HR doses (0, i.e. control: 24 mice; 3%: 24 mice; 9%: 24 mice); 3 sacrifice dates (day 8: 24 mice; day 15: 24 mice; day 21: 24 mice) ⇒ 2 × 3 × 3 = 18 groups.
  • 15. Database presentation: Measurement days. Urine was collected on the following days:
        Day                  0   1   4   8  11  15  18  21
        Nb of observations  68  68  68  66  46  44  19  18
    For each mouse, between 2 and 22 measurements were made. In total: 397 observations for 950 variables.
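The two totals on this slide can be checked with a few lines of Python. This is only a sanity check: the observation counts are copied from the table above, and the regular 0.01 ppm spacing of the variables is an assumption (it is not stated on the slides, but it is consistent with 950 values between 0.505 and 9.995 ppm).

```python
# Observation counts per collection day, copied from the slide.
days = [0, 1, 4, 8, 11, 15, 18, 21]
n_obs = [68, 68, 68, 66, 46, 44, 19, 18]

# Total number of spectra in the database.
total_obs = sum(n_obs)

# Assumption: variables lie on a regular grid with a 0.01 ppm step.
ppm_min, ppm_max, step = 0.505, 9.995, 0.01
n_variables = round((ppm_max - ppm_min) / step) + 1

print(total_obs, n_variables)
```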
  • 19. Wavelet representation: Basic principle of wavelets. For a given integer J, each spectrum can be expressed at level J as: $f(x) = \sum_k \alpha_k 2^{-J/2} \Psi(2^{-J} x - k) + \sum_{j=1}^{J} \sum_k \beta_{jk} 2^{-j/2} \Phi(2^{-j} x - k)$. The first sum is the trend, based on the father wavelet $\Psi$; the second sum gives the details at levels 1, ..., J, based on the mother wavelet $\Phi$.
  • 24. Wavelet representation: Hierarchical decomposition. We append 74 zero values to the end of each spectrum to obtain a dyadic discrete sampling (950 + 74 = 1024 = 2^10). Original data: f observed at t_1, ..., t_1024, equally spaced → Level 1: trend + details → Level 2: trend + details → ... → Level 9: trend + details. ⇒ At level 9 (the maximum level with a discrete sampling of length 1024), we obtain 1025 coefficients.
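The padding and the level-by-level decomposition described on this slide can be sketched as follows. This is a minimal illustration, not the implementation used in the talk: the Haar wavelet is an assumption (the slides leave the choice of wavelet open), the function names are hypothetical, and the input is placeholder data rather than real spectra. With this orthogonal Haar version the total number of coefficients equals the padded length; counts can differ with other wavelets, boundary handling, or software conventions.

```python
import math

def pad_to_dyadic(signal):
    """Append zeros until the length is a power of two."""
    n = len(signal)
    target = 1 << (n - 1).bit_length()
    return list(signal) + [0.0] * (target - n)

def haar_step(values):
    """One level: pairwise scaled sums (trend) and differences (details)."""
    s = math.sqrt(2.0)
    trend = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return trend, details

def haar_decompose(signal, levels):
    """Return the final trend and the details for levels 1..levels."""
    trend = list(signal)
    all_details = []
    for _ in range(levels):
        # Only the trend is decomposed again at the next level.
        trend, details = haar_step(trend)
        all_details.append(details)
    return trend, all_details

# Placeholder spectrum with 950 values (not real NMR data).
spectrum = [float(i % 7) for i in range(950)]
padded = pad_to_dyadic(spectrum)
trend, details = haar_decompose(padded, levels=9)

print(len(padded) - len(spectrum), len(padded))
print(len(trend), [len(d) for d in details])
```

Each level halves the length of the trend, so the details at level 1 capture the finest variations and those at level 9 the coarsest.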
  • 25. Wavelet representation: Examples. [Figure: trend and details]
  • 28. Wavelet representation: Denoising. For the coefficients corresponding to details at levels greater than J (with J large enough), a filtering is applied (Donoho and Johnstone): $c^* = 0$ if $|c| < 2 \log 10\, \hat{\sigma}$, and $c^* = c$ if $|c| \geq 2 \log 10\, \hat{\sigma}$. Two parameters have to be tuned: which wavelet to use, and which J to use. They are chosen as a trade-off between the quality of the reconstruction (how close are the functions rebuilt from the filtered coefficients to the original ones?) and the number of nonzero coefficients, by minimizing an empirical (self-created) quality criterion: $\frac{1}{n} \sum_i \frac{1}{D} \sum_j \left( f_i(t_j) - \hat{f}_i(t_j) \right)^2 + \frac{\text{Nb of nonzero coefficients}}{\text{Nb of coefficients}}$.
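The filtering rule and the quality criterion above can be sketched in Python. This is an illustration under assumptions: the function names are hypothetical, the threshold is passed in as a plain number (how $\hat{\sigma}$ is estimated is not given on the slides), and the second term is read as the fraction of retained (nonzero) coefficients.

```python
def hard_threshold(coeffs, threshold):
    """Set to zero every coefficient whose absolute value is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

def quality_criterion(originals, reconstructions, filtered_coeffs):
    """Mean squared reconstruction error plus the fraction of nonzero coefficients.

    originals, reconstructions: one list of D sampled values per observation.
    filtered_coeffs: one list of thresholded coefficients per observation.
    """
    n = len(originals)
    mse = 0.0
    for f, f_hat in zip(originals, reconstructions):
        mse += sum((a - b) ** 2 for a, b in zip(f, f_hat)) / len(f)
    mse /= n
    kept = sum(1 for row in filtered_coeffs for c in row if c != 0.0)
    total = sum(len(row) for row in filtered_coeffs)
    return mse + kept / total

# Toy values, chosen only to exercise the two functions.
coeffs = [0.05, -1.2, 0.3, -0.01, 2.0]
filtered = hard_threshold(coeffs, 0.25)
score = quality_criterion([[1.0, 2.0]], [[1.0, 1.0]], [filtered])
print(filtered, score)
```

Evaluating this score for each candidate wavelet and each candidate J, and keeping the pair with the smallest value, is one way to carry out the tuning described on the slide.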
  • 29. Wavelet representation: Final reconstruction of the data, with 274 nonzero coefficients. [Figure]
  • 30. Wavelet representation: Boxplots of the original coefficients. [Figure]
  • 31. Wavelet representation: Boxplots of the scaled coefficients (centered by the mean and divided by the standard deviation). [Figure]
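The scaling mentioned on this slide can be sketched as a column-wise standardization. The function name is hypothetical, and two details are assumptions not stated on the slides: the sample standard deviation (n - 1 denominator) is used, and constant columns are mapped to zero.

```python
import statistics

def standardize_columns(rows):
    """Center each column by its mean and divide by its standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [
        # A constant column has zero standard deviation; map it to 0.0.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in rows
    ]

# Toy matrix: 3 observations (rows) by 2 coefficients (columns).
rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = standardize_columns(rows)
print(scaled)
```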
  • 35. Perspective of work: Using random forests. The idea is to use random forests to make predictions and also to extract the main coefficients responsible for explaining the target variables. Proposed regression: the scaled coefficients will be the explanatory variables. The variable of interest could be: the dose (either as a number or as a class, leading to a classification problem); the total dose ingested (i.e., the dose multiplied by the number of days of ingestion); any other interesting idea? The individuals would then be rebuilt from the main coefficients (setting the others to zero) to see which peaks differ from one group to another.
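The last step, rebuilding a spectrum from a few selected coefficients, can be sketched as follows. This is only an illustration under assumptions: the Haar wavelet is used for simplicity, the function names are hypothetical, and the `selected` indices stand in for the coefficients that a random forest would rank as most important (the forest itself is not fitted here).

```python
import math

def haar_forward(signal):
    """Full Haar transform of a power-of-two-length signal, as a flat list."""
    out = list(signal)
    n = len(out)
    s = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        trend = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        out[:n] = trend + detail
        n = half
    return out

def haar_inverse(coeffs):
    """Invert haar_forward, from the coarsest level to the finest."""
    out = list(coeffs)
    n = 1
    s = math.sqrt(2.0)
    while n < len(out):
        merged = []
        for t, d in zip(out[:n], out[n:2 * n]):
            merged += [(t + d) / s, (t - d) / s]
        out[:2 * n] = merged
        n *= 2
    return out

def keep_selected(coeffs, selected):
    """Zero every coefficient whose index is not in `selected`."""
    return [c if i in selected else 0.0 for i, c in enumerate(coeffs)]

signal = [4.0, 2.0, 5.0, 5.0, 1.0, 1.0, 0.0, 8.0]  # toy data
coeffs = haar_forward(signal)
selected = {0, 1, 7}  # hypothetical "important" coefficient indices
rebuilt = haar_inverse(keep_selected(coeffs, selected))
print(rebuilt)
```

Comparing such rebuilt spectra across groups shows where the selected coefficients act along the ppm axis, which is the comparison of peaks that the slide proposes.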