Metabolomic data: combining wavelet
representation with learning approaches
Nathalie Villa-Vialaneix
http://www.nathalievilla.org
In collaboration with Noslen Hernández (CENATAV, Havana, Cuba) & Philippe Besse
IUT de Carcassonne (UPVD)
& Institut de Mathématiques de Toulouse
BioPuces working group, INRA Castanet
May 19th, 2010
Overview
1 Presentation of the data
2 Wavelet preprocessing and normalization
3 Learning methods
4 Identification of relevant metabolites
Presentation of the data
The data were provided by Alain Paris (INRA): they are metabolomic
spectra (¹H NMR) of mouse urine, each made of 950 variables
(chemical shifts from 0.50 ppm to 9.99 ppm).
Peaks have been aligned and the baseline has been removed.
Biological question
Study the effects of Hypochoeris radicata (HR) ingestion on
metabolism: HR flowers are responsible for a fatal disease in
horses, “Australian stringhalt” (attack on the nervous system,
trembling...)
Experiments were carried out on 72 mice.
Description of the experiments
Mice are divided into groups according to:
gender: 36 males; 36 females
daily HR dose ingested: 0% (control): 24 mice; 3%: 24 mice; 9%: 24 mice
sacrifice date (3 dates): day 8: 24 mice; day 15: 24 mice; day 21: 24 mice
⇒ 2 × 3 × 3 = 18 groups (but the groups defined by the sacrifice dates
are irrelevant for the biological question).
Days of measurement
Urine was collected on the following days:
Day          0   1   4   8   11  15  18  21
Nb of obs.   68  68  68  66  46  44  19  18
For each mouse, between 1 and 8 measurements were made.
In total, 397 observations of 950 variables.
Wavelet preprocessing and normalization
Basics about wavelets
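For a given integer J, a spectrum f can be expressed at level J by:
$$
f(x) = \underbrace{\sum_{k} \alpha_k \, 2^{-J/2} \, \Psi\left(2^{-J} x - k\right)}_{\text{trend, based on father wavelet } \Psi}
     + \underbrace{\sum_{j=1}^{J} \sum_{k} \beta_{jk} \, 2^{-j/2} \, \Phi\left(2^{-j} x - k\right)}_{\text{details of levels } 1, \ldots, J \text{, based on mother wavelet } \Phi}
$$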
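To illustrate the decomposition into a trend and J levels of details, here is a minimal Python sketch of the Haar case. It assumes an orthonormal Haar transform computed by repeated pairwise sums and differences. The names `haar_decompose` and `haar_reconstruct` and the toy `spectrum` are hypothetical, and the sketch requires a length divisible by 2^J, so a 950-point spectrum would first need padding or truncation (not shown).

```python
import math

def haar_decompose(signal, levels):
    """Return (trend, details) for an orthonormal Haar transform.

    details[j - 1] holds the detail coefficients of level j (level 1 is the finest).
    Assumption of this sketch: len(signal) is divisible by 2**levels.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    trend = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(trend[0::2], trend[1::2]))
        # Scaled differences of neighbours: close to a discrete derivative.
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        # Scaled sums of neighbours: the coarser trend.
        trend = [(a + b) / math.sqrt(2) for a, b in pairs]
    return trend, details

def haar_reconstruct(trend, details):
    """Invert haar_decompose: rebuild the signal from the trend and the details."""
    signal = list(trend)
    for detail in reversed(details):
        rebuilt = []
        for s, d in zip(signal, detail):
            rebuilt.append((s + d) / math.sqrt(2))
            rebuilt.append((s - d) / math.sqrt(2))
        signal = rebuilt
    return signal

# Toy spectrum of length 8, decomposed at level J = 3.
spectrum = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
trend, details = haar_decompose(spectrum, levels=3)
restored = haar_reconstruct(trend, details)
```

In this sketch `trend` plays the role of the coefficients α_k and `details[j - 1]` the role of the β_jk at level j. The sign of the detail coefficients is a convention and may differ between implementations.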
Example of a hierarchical decomposition for a metabolomic spectrum
[Figure: a spectrum and its successive decomposition, down to details of levels 1 to 8.]
Several strategies
Several wavelet bases:
Haar wavelets (easily interpretable because they are close to
discrete derivatives);
D4 Daubechies wavelets (smoother representation but not
directly interpretable).
Several preprocessings:
Use all wavelet coefficients as input data;
Use thresholded wavelet coefficients as input data (i.e., set
the smallest coefficients to zero with an automatic method called
“soft thresholding”);
Use only the detail coefficients (and the detail coefficients
of the shifted spectra) as input data.
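The thresholding option can be sketched as follows in Python. The slides do not say which automatic rule selects the threshold, so this sketch assumes the classical universal threshold σ·sqrt(2 log n), with σ estimated from the median absolute coefficient. That rule is a stand-in, not necessarily the one used in the study, and the function names and the toy coefficients are hypothetical.

```python
import math
from statistics import median

def soft_threshold(coefficients, threshold):
    """Set coefficients below the threshold to zero and shrink the others toward zero."""
    shrunk = []
    for c in coefficients:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk

def universal_threshold(coefficients):
    """Assumed rule: sigma * sqrt(2 log n), with sigma = median(|c|) / 0.6745."""
    sigma = median(abs(c) for c in coefficients) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(len(coefficients)))

# Toy detail coefficients: two large values among small, noise-like ones.
toy_details = [0.1, -0.2, 0.05, 3.0, -0.15, 0.1, -2.5, 0.05]
threshold = universal_threshold(toy_details)
thresholded = soft_threshold(toy_details, threshold)
```

Only the two large coefficients survive here, each reduced in magnitude by the threshold, which is what distinguishes soft thresholding from simply deleting small coefficients.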
Scaling of wavelet coefficients (e.g., Haar detail coefficients)
[Figure: values of selected Haar detail coefficients (D.1, D.57, D.125, D.297, D.370, D.443, D2.41, D2.120, D2.304, D2.389, D2.474), in two panels: "Before scaling" (vertical axis from about −40 to 40) and "After scaling" (vertical axis from about −15 to 15).]
Normalization issue
[Figure: PCA of the wavelet coefficients, shown as six scatter plots (PC1 vs. PC2, PC1 vs. PC3, PC1 vs. PC4, PC2 vs. PC3, PC2 vs. PC4, PC3 vs. PC4); the legend distinguishes the days of measurement (Day 0, 1, 4, 8, 11, 15, 18, 21).]
PCA of the coefficients: for the control group, the day of measurement
stands out on axes 2 and 4.
Normalization
Compute the median and variance of each coefficient for each day of
measurement, using the control group only.
Use these values to normalize all the observations
(according to their day of measurement).
[Figure: wavelet coefficients D2.444, D.78, D.332 and D2.289 plotted against the day of measurement, before and after normalization.]
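The two normalization steps can be sketched in Python for a single wavelet coefficient. The slides only say that the median and variance of the control group are used, so this sketch assumes centering by the control median and dividing by the control standard deviation (square root of the variance), separately for each day. The function name and the toy values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median, variance

def normalize_by_control(days, is_control, values):
    """Normalize one coefficient, day by day, with statistics of the control mice.

    Assumption of this sketch: (value - control median) / control standard
    deviation, with at least two control observations for every day.
    """
    control_by_day = defaultdict(list)
    for day, control, value in zip(days, is_control, values):
        if control:
            control_by_day[day].append(value)
    stats = {
        day: (median(observed), math.sqrt(variance(observed)))
        for day, observed in control_by_day.items()
    }
    return [(value - stats[day][0]) / stats[day][1]
            for day, value in zip(days, values)]

# Toy example: two days, three control mice and one treated mouse per day.
days = [0, 0, 0, 0, 8, 8, 8, 8]
is_control = [True, True, True, False, True, True, True, False]
values = [1.0, 2.0, 3.0, 5.0, 11.0, 12.0, 13.0, 20.0]
normalized = normalize_by_control(days, is_control, values)
```

After this step the control mice are centered on zero for every day, so a day effect shared by all mice no longer separates the observations.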
PCA after normalization
[Figure: PCA of the normalized wavelet coefficients, shown as six scatter plots (PC1 vs. PC2, PC1 vs. PC3, PC1 vs. PC4, PC2 vs. PC3, PC2 vs. PC4, PC3 vs. PC4); the legend distinguishes the days of measurement (Day 0, 1, 4, 8, 11, 15, 18, 21).]
Learning methods
Motivations
Purpose: validate the impact of HR ingestion on metabolism
by predicting the total HR dose ingested from the spectra. If
the prediction is accurate, the impact is not an artefact of the data
and the biological dependency is validated.
Compared methods:
random forest (R package randomForest)
ridge regression (R package glmnet)
LASSO (R package glmnet)
Elastic net (R package glmnet)
Partial Least Squares (PLS) (R package mixOmics)
sparse PLS (R package mixOmics)
Methodology
Split the data into training and test sets that are balanced according
to the groups;
Preprocess (or not) the data with wavelets, then scale and normalize;
Train each of the 6 methods (for each of the 7 kinds of
preprocessing) on the training set, tuning the parameters by
cross-validation;
Compute the mean squared error on the test set.
Repeat the previous scheme 250 times.
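The evaluation scheme above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: a one-feature ridge regression stands in for the six R methods, the data are synthetic, and the test fraction (25%), the 3 cross-validation folds, the penalty grid and the 20 repetitions are arbitrary choices. All function names are hypothetical, and the wavelet preprocessing step is left out.

```python
import random
from collections import defaultdict
from statistics import mean

def stratified_split(groups, test_fraction, rng):
    """Return (train, test) index lists with every group represented in both."""
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    train, test = [], []
    for members in by_group.values():
        rng.shuffle(members)
        n_test = max(1, round(test_fraction * len(members)))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

def fit_ridge_1d(x, y, penalty):
    """One-feature ridge regression (stand-in model); returns a prediction function."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + penalty)
    return lambda value: my + slope * (value - mx)

def mse(model, x, y):
    return mean([(model(a) - b) ** 2 for a, b in zip(x, y)])

def cv_error(x, y, penalty, n_folds):
    """Cross-validated error on the training set, used to tune the penalty."""
    errors = []
    for fold in range(n_folds):
        held = set(range(fold, len(x), n_folds))
        model = fit_ridge_1d([a for i, a in enumerate(x) if i not in held],
                             [b for i, b in enumerate(y) if i not in held],
                             penalty)
        errors.append(mse(model, [x[i] for i in held], [y[i] for i in held]))
    return mean(errors)

def repeated_evaluation(x, y, groups, penalties, n_repeats, seed=0):
    """Split, tune on the training set, score on the test set, and repeat."""
    rng = random.Random(seed)
    test_errors = []
    for _ in range(n_repeats):
        train, test = stratified_split(groups, 0.25, rng)
        x_train, y_train = [x[i] for i in train], [y[i] for i in train]
        best = min(penalties, key=lambda p: cv_error(x_train, y_train, p, n_folds=3))
        model = fit_ridge_1d(x_train, y_train, best)
        test_errors.append(mse(model, [x[i] for i in test], [y[i] for i in test]))
    return test_errors

# Synthetic data: 3 dose groups of 8 mice, one noisy feature linked to the dose.
data_rng = random.Random(1)
groups = [dose for dose in (0, 3, 9) for _ in range(8)]
y = [float(dose) for dose in groups]
x = [2.0 * dose + data_rng.gauss(0.0, 1.0) for dose in groups]
errors = repeated_evaluation(x, y, groups, penalties=[0.0, 1.0, 10.0], n_repeats=20)
```

The list `errors` holds one test mean squared error per repetition. Averaging it (and taking its spread) gives the kind of summary used to compare methods and preprocessings.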
Mean performance on the test sets
Method    Original        Daubechies      Daubechies      Daubechies      Haar            Haar            Haar
                          Details         Full            Threshold       Details         Full            Threshold
ELN 0.5   16.29 (1.03)    15.38 (0.9)     14.33 (1.07)    42.94 (52.25)   15.39 (1.04)    14.49 (1.03)    30.98 (16.43)
ELN 0.25  16.12 (1.03)    15.28 (0.9)     14.35 (0.94)    44.62 (61.3)    15.2 (1)        14.47 (0.98)    32.54 (17.31)
ELN 0.1   15.81 (0.98)    15.14 (0.77)    14.38 (0.84)    42.58 (53.83)   15.15 (0.87)    14.58 (0.92)    35.41 (19.43)
ELN 0.75  16.31 (1.1)     15.48 (0.9)     14.43 (1.1)     42.62 (51.59)   15.44 (1.06)    14.5 (1.01)     30.31 (15.92)
Lasso     16.37 (1.27)    15.56 (1.01)    14.45 (1.14)    41.82 (50.86)   15.56 (1.1)     14.49 (1.01)    30.8 (17.01)
Ridge     16.82 (0.83)    16.22 (0.67)    15.56 (0.74)    41.75 (25.09)   16.16 (0.7)     15.66 (0.8)     37.58 (16.07)
PLS       16.83 (1.1)     16.25 (0.79)    15.61 (0.87)    81.56 (116.21)  16.09 (0.87)    15.87 (0.91)    42.6 (25.14)
RF        16.69 (0.91)    16.33 (1.36)    16.2 (1.16)     18.91 (1.66)    16.24 (1.06)    16.11 (1.09)    18.8 (1.32)
SPLS 5    19.71 (1.63)    19.25 (1.25)    16.55 (1.18)    36.54 (31.88)   19.1 (1.63)     17.24 (1.4)     34.25 (24.99)
SPLS 10   19.25 (1.65)    19.22 (1.23)    16.74 (1.15)    79.35 (110.56)  18.66 (1.36)    17.14 (1.25)    42.46 (23.76)
SPLS 20   18.41 (1.5)     18.81 (1.18)    17.55 (1.2)     76.05 (104.74)  18.55 (1.2)     17.11 (1.13)    42.38 (23.74)
Boxplot for the full Daubechies representation
[Figure: boxplots of the test error for each method (Lasso, Ridge, ELN 0.1, ELN 0.25, ELN 0.5, ELN 0.75, PLS, SPLS 5, SPLS 10, SPLS 20, RF), titled "Daubechies wavelets, Full"; vertical axis from 12 to 20.]
Full Daubechies representation and ELN: accuracy (on test)
[Figure: predicted values against true values on the test sets, both axes from 0 to 150.]
The mean R² on the test sets is 89.0% (minimum 83.1%, maximum 92.8%).
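For reference, an R² of this kind can be computed as in the short Python sketch below. It assumes the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. The slides do not state which definition was used, and the function name and toy values are hypothetical.

```python
def r_squared(true_values, predicted_values):
    """Coefficient of determination: 1 - SS_residual / SS_total (assumed definition)."""
    mean_true = sum(true_values) / len(true_values)
    ss_residual = sum((t - p) ** 2 for t, p in zip(true_values, predicted_values))
    ss_total = sum((t - mean_true) ** 2 for t in true_values)
    return 1.0 - ss_residual / ss_total

# Toy example on the same 0 to 150 scale as the plot.
true_doses = [0.0, 50.0, 100.0, 150.0]
predicted_doses = [10.0, 40.0, 110.0, 140.0]
score = r_squared(true_doses, predicted_doses)
```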
Identification of relevant metabolites
Identification issue
The full learning process is the following:
Spectra → Wavelet preprocessing → Learning → HR dose prediction
Hence, because of the preprocessing step, the coefficients selected
by ELN are not directly related to metabolites (or to locations
on the spectra).
Adaptation of the importance measure
for each of the 950 variables, v, of the original data set do
    Randomize the observations of the variable v
    Compute the full Daubechies wavelet representation
        with the randomized observations for v
    Scale and normalize according to the mean, median or variance
        of the true values
    for each test set, i, do
        Compute new predictions with the false values of v
            and the corresponding mse: $mse_{v,i}$
        Compute the decrease in accuracy for the test set:
            $DA_i = 1 - \frac{mse_i}{mse_{v,i}}$
    end for
    Average $DA_i$ over i to obtain the importance of v
end for
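The loop above can be sketched in Python as follows. This is a simplified illustration, not the study's code: `preprocess` stands for the whole wavelet, scaling and normalization chain (an identity function in the toy example), `models[i]` is the prediction function associated with test set i, and all names and data are hypothetical. The sketch also assumes that no mean squared error is exactly zero.

```python
import random
from statistics import mean

def mse(model, features, targets):
    return mean([(model(f) - t) ** 2 for f, t in zip(features, targets)])

def importance(rows, targets, test_sets, models, preprocess, seed=0):
    """Importance of each original variable by permutation.

    rows: observations as lists of original variables.
    test_sets: lists of row indices; models[i] predicts from preprocessed rows.
    """
    rng = random.Random(seed)
    features = [preprocess(row) for row in rows]
    baseline = [mse(model, [features[i] for i in idx], [targets[i] for i in idx])
                for idx, model in zip(test_sets, models)]
    scores = []
    for v in range(len(rows[0])):
        # Randomize the observations of variable v, then redo the preprocessing.
        column = [row[v] for row in rows]
        rng.shuffle(column)
        permuted = [preprocess(row[:v] + [c] + row[v + 1:])
                    for row, c in zip(rows, column)]
        decreases = []
        for idx, model, mse_i in zip(test_sets, models, baseline):
            mse_vi = mse(model, [permuted[i] for i in idx], [targets[i] for i in idx])
            decreases.append(1.0 - mse_i / mse_vi)  # DA_i = 1 - mse_i / mse_{v,i}
        scores.append(mean(decreases))
    return scores

# Toy data: variable 0 drives the target, variable 1 is irrelevant.
rows = [[float(k), float((7 * k) % 5)] for k in range(12)]
targets = [3.0 * row[0] + (0.1 if k % 2 == 0 else -0.1) for k, row in enumerate(rows)]
test_sets = [list(range(0, 6)), list(range(6, 12))]
models = [lambda f: 3.0 * f[0], lambda f: 3.0 * f[0]]
scores = importance(rows, targets, test_sets, models, preprocess=lambda row: list(row))
```

In the toy example the importance of variable 1 is exactly zero, because permuting it does not change any prediction, while the importance of variable 0 is close to 1.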
Values of importance
[Figure: importance of the variables plotted against their rank (rank from 0 to about 600 on the horizontal axis, importance from 0 to 0.8 on the vertical axis).]
Identification of important metabolites
[Figure: important variables located along the ppm axis (from 2 to 10), highlighted in color.]
Some have already been identified: the most important is scyllo-inositol;
one of the orange ones is probably valine; one of the light yellow ones is
probably trimethylamine. The others are new.
What next?
Identify the metabolites and study the correlation between
those found here and those previously highlighted.
Questions? Suggestions?
23 / 23
Nathalie Villa-Vialaneix

More Related Content

Similar to Metabolomic data: combining wavelet representation with learning approaches

Integrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learningIntegrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learningtuxette
 
An introduction to neural network
An introduction to neural networkAn introduction to neural network
An introduction to neural networktuxette
 
All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...
All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...
All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...Lawrence Livermore National Laboratory
 
OS16 - 6.G4.a Evaluation of Oral Swabs for FMDV Surveillance - P. Kirkland
OS16 - 6.G4.a   Evaluation of Oral Swabs for FMDV Surveillance - P. KirklandOS16 - 6.G4.a   Evaluation of Oral Swabs for FMDV Surveillance - P. Kirkland
OS16 - 6.G4.a Evaluation of Oral Swabs for FMDV Surveillance - P. KirklandEuFMD
 
Dr. Jeff Zimmerman - Developments in infectious disease surveillance
Dr. Jeff Zimmerman - Developments in infectious disease surveillanceDr. Jeff Zimmerman - Developments in infectious disease surveillance
Dr. Jeff Zimmerman - Developments in infectious disease surveillanceJohn Blue
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOtuxette
 
Using In Silico Tools in Repurposing Drugs for Neglected and Orphan Diseases
Using In Silico Tools in Repurposing Drugs for Neglected and Orphan DiseasesUsing In Silico Tools in Repurposing Drugs for Neglected and Orphan Diseases
Using In Silico Tools in Repurposing Drugs for Neglected and Orphan DiseasesSean Ekins
 
Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...Torsten Seemann
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOtuxette
 

Similar to Metabolomic data: combining wavelet representation with learning approaches (9)

Integrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learningIntegrating Tara Oceans datasets using unsupervised multiple kernel learning
Integrating Tara Oceans datasets using unsupervised multiple kernel learning
 
An introduction to neural network
An introduction to neural networkAn introduction to neural network
An introduction to neural network
 
All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...
All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...
All About that Bayes: Probability, Statistics, and the Quest to Quantify Unce...
 
OS16 - 6.G4.a Evaluation of Oral Swabs for FMDV Surveillance - P. Kirkland
OS16 - 6.G4.a   Evaluation of Oral Swabs for FMDV Surveillance - P. KirklandOS16 - 6.G4.a   Evaluation of Oral Swabs for FMDV Surveillance - P. Kirkland
OS16 - 6.G4.a Evaluation of Oral Swabs for FMDV Surveillance - P. Kirkland
 
Dr. Jeff Zimmerman - Developments in infectious disease surveillance
Dr. Jeff Zimmerman - Developments in infectious disease surveillanceDr. Jeff Zimmerman - Developments in infectious disease surveillance
Dr. Jeff Zimmerman - Developments in infectious disease surveillance
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSO
 
Using In Silico Tools in Repurposing Drugs for Neglected and Orphan Diseases
Using In Silico Tools in Repurposing Drugs for Neglected and Orphan DiseasesUsing In Silico Tools in Repurposing Drugs for Neglected and Orphan Diseases
Using In Silico Tools in Repurposing Drugs for Neglected and Orphan Diseases
 
Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...Bioinformatics tools for the diagnostic laboratory -  T.Seemann - Antimicrobi...
Bioinformatics tools for the diagnostic laboratory - T.Seemann - Antimicrobi...
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSO
 

More from tuxette

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathstuxette
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquestuxette
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-Ctuxette
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?tuxette
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquestuxette
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeantuxette
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation datatuxette
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?tuxette
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysistuxette
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricestuxette
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Predictiontuxette
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random foresttuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 

More from tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...



Metabolomic data: combining wavelet representation with learning approaches

  • 1. Metabolomic data: combining wavelet representation with learning approaches. Nathalie Villa-Vialaneix, http://www.nathalievilla.org. In collaboration with Noslen Hernández (CENATAV, Havana, Cuba) & Philippe Besse. IUT de Carcassonne (UPVD) & Institut de Mathématiques de Toulouse. BioPuces working group, INRA de Castanet, May 19th, 2010. 1 / 23 Nathalie Villa-Vialaneix
  • 2. Overview: 1. Presentation of the data; 2. Wavelet preprocessing and normalization; 3. Learning methods; 4. Identification of relevant metabolites.
  • 5. Presentation of the data. The data were provided by Alain Paris (INRA): they are metabolomic spectra (¹H NMR) from mouse urine and consist of 950 variables (from 0.50 ppm to 9.99 ppm). Peaks have been aligned and the baseline has been removed.
  • 7. Presentation of the data: Biological question. Study the effects of Hypochoeris radicata (HR) ingestion on metabolism: HR flowers are responsible for a fatal disease in horses, "Australian stringhalt" (nervous system attack, trembling...). Experiments were carried out on 72 mice.
  • 11. Presentation of the data: Description of the experiments. Mice are divided into several groups according to:
      gender: 36 males; 36 females
      daily HR dose ingested: 0 (control): 24 mice; 3%: 24 mice; 9%: 24 mice
      sacrifice date (3 dates): 8th day: 24 mice; 15th day: 24 mice; 21st day: 24 mice
      ⇒ 18 groups (but the groups coming from the sacrifice dates are irrelevant for the biological question).
  • 14. Presentation of the data: Days of measurement. Urine was collected on the following days:
      Day          0    1    4    8    11   15   18   21
      Nb of obs.   68   68   68   66   46   44   19   18
      For each mouse, from 1 to 8 measurements were made. In total: 397 observations with 950 variables.
  • 17. Wavelet preprocessing and normalization: Basics about wavelets. For a given integer J, a spectrum f can be expressed at level J by:
      f(x) = \sum_k \alpha_k 2^{-J/2} \Psi(2^{-J} x - k) + \sum_{j=1}^{J} \sum_k \beta_{jk} 2^{-j/2} \Phi(2^{-j} x - k)
      The first sum is the trend, based on the father wavelet Ψ; the second sum gathers the details of levels 1, ..., J, based on the mother wavelet Φ.
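To make the level-J decomposition concrete, here is a minimal sketch of a discrete Haar transform in plain Python. It is an illustration only, not the code used in the talk: the function names (`haar_step`, `haar_decompose`, `haar_reconstruct`) are hypothetical, the toy signal is invented, and the sketch assumes the signal length is divisible by 2^J. The returned `approx` list plays the role of the trend coefficients α_k and `details[j-1]` the role of the detail coefficients β_jk.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return the level-`levels` trend and the details of levels 1..levels."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # details[0] is level 1 (finest scale)
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose, going from the coarsest level back to the signal."""
    s = 1 / math.sqrt(2)
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.append((a + d) * s)
            rebuilt.append((a - d) * s)
        current = rebuilt
    return current

# Toy "spectrum" of length 8, decomposed at level J = 3 (values are made up).
toy_spectrum = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
trend, details = haar_decompose(toy_spectrum, 3)
rebuilt = haar_reconstruct(trend, details)
```

Because the transform is orthonormal, reconstructing from the trend and all details recovers the original signal, which is what makes it possible to feed the coefficients to a learning method without losing information.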
  • 21. Wavelet preprocessing and normalization: Example of a hierarchical decomposition for a metabolomic spectrum. [Figure: the spectrum is decomposed step by step, down to details 1 to 8.]
  • 23. Wavelet preprocessing and normalization: Several strategies.
      Several wavelet bases: Haar wavelets (easily interpretable because they are close to discrete derivatives); D4 Daubechies wavelets (smoother representation but not directly interpretable).
      Several preprocessings: use all wavelet coefficients as input data; use thresholded wavelet coefficients as input data (i.e., delete the smallest coefficients with an automatic method called "soft thresholding"); use only the detail coefficients (and the detail coefficients of the shifted spectra) as input data.
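As an illustration of soft thresholding, the sketch below shrinks every coefficient towards zero by a threshold and sets the smallest ones exactly to zero. The slides do not say how the threshold is chosen automatically; the `universal_threshold` helper uses the Donoho-Johnstone universal threshold with a median-based noise estimate purely as an assumption, and both function names are hypothetical.

```python
import math
import statistics

def soft_threshold(coeffs, lam):
    """Shrink each coefficient towards zero by lam; |c| <= lam becomes 0."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]

def universal_threshold(detail_coeffs):
    """ASSUMPTION: universal threshold sigma * sqrt(2 log n), with sigma
    estimated from the median absolute detail coefficient (not from the slides)."""
    sigma = statistics.median(abs(c) for c in detail_coeffs) / 0.6745
    return sigma * math.sqrt(2 * math.log(len(detail_coeffs)))

# Toy detail coefficients (made up): mostly small values plus one large one.
toy_details = [0.1, -0.2, 0.05, 3.0, -0.15, 0.1, -0.05, 0.2]
lam = universal_threshold(toy_details)
shrunk = soft_threshold(toy_details, lam)
```

Unlike hard thresholding, which keeps surviving coefficients unchanged, soft thresholding also reduces the magnitude of the coefficients it keeps.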
  • 24. Wavelet preprocessing and normalization: Scaling of wavelet coefficients (example: Haar detail coefficients). [Figure: boxplots of coefficients D.1 to D2.474, before scaling (axis from −40 to 40) and after scaling (axis from −15 to 15).]
  • 25. Wavelet preprocessing and normalization: Normalization issue. [Figure: PCA of the coefficients, pairwise scatter plots of PC1 to PC4, with points labelled by day of measurement (days 0, 1, 4, 8, 11, 15, 18, 21).] PCA of the coefficients: the day of measurement for the control group stands out on axes 2 and 4.
  • 27. Wavelet preprocessing and normalization: Normalization. Find the median and variance of the coefficients for each day of measurement, based on the control group. Use these values to normalize all the observations (according to their day of measurement). [Figure: wavelet coefficients D2.444, D.78, D.332 and D2.289 by day, before and after normalization.]
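The normalization step can be sketched as follows. The slides only state that the per-day median and variance are computed on the control group and then used for all observations; subtracting the median and dividing by the standard deviation is an assumption made here, as are the record layout (`day`, `control`, `coeffs`) and the function names.

```python
import math
import statistics
from collections import defaultdict

def control_reference(observations):
    """Per-day, per-coefficient median and variance, computed on controls only."""
    by_day = defaultdict(list)
    for obs in observations:
        if obs["control"]:
            by_day[obs["day"]].append(obs["coeffs"])
    reference = {}
    for day, rows in by_day.items():
        columns = list(zip(*rows))
        medians = [statistics.median(col) for col in columns]
        variances = [statistics.variance(col) if len(col) > 1 else 0.0
                     for col in columns]
        reference[day] = (medians, variances)
    return reference

def normalize(observations, reference):
    """ASSUMPTION: centre on the control median, divide by the control std."""
    normalized = []
    for obs in observations:
        medians, variances = reference[obs["day"]]
        scaled = [(c - m) / math.sqrt(v) if v > 0 else c - m
                  for c, m, v in zip(obs["coeffs"], medians, variances)]
        normalized.append({**obs, "coeffs": scaled})
    return normalized

# Toy data (made up): one coefficient per observation, two days.
toy = [
    {"day": 0, "control": True, "coeffs": [1.0]},
    {"day": 0, "control": True, "coeffs": [2.0]},
    {"day": 0, "control": True, "coeffs": [3.0]},
    {"day": 0, "control": False, "coeffs": [5.0]},
    {"day": 1, "control": True, "coeffs": [10.0]},
    {"day": 1, "control": True, "coeffs": [14.0]},
    {"day": 1, "control": False, "coeffs": [12.0]},
]
normalized = normalize(toy, control_reference(toy))
```

Using only the control group for the reference values removes the day effect without erasing the dose effect carried by the treated animals.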
  • 28. Wavelet preprocessing and normalization: PCA after normalization. [Figure: pairwise scatter plots of PC1 to PC4 after normalization, with points labelled by day of measurement (days 0, 1, 4, 8, 11, 15, 18, 21).]
  • 30. Learning methods: Motivations. Purpose: validate the impact of HR ingestion on metabolism by predicting the total HR dose ingested from the spectra. If the prediction is accurate, the impact is not an artefact of the data and the biological dependency is validated.
      Compared methods: random forest (R package randomForest); ridge regression (R package glmnet); LASSO (R package glmnet); elastic net (R package glmnet); Partial Least Squares (PLS) (R package mixOmics); sparse PLS (R package mixOmics).
  • 32. Learning methods: Methodology.
      1. Split the data into training and test sets that are balanced according to the groups;
      2. Preprocess (or not), scale and normalize the data with wavelets;
      3. Train each of the 6 methods (for each of the 7 kinds of preprocessing) on the training set, with a cross-validation strategy to tune the parameters;
      4. Calculate the mean squared error on the test set.
      Repeat the previous scheme 250 times.
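The evaluation scheme (balanced split, cross-validated tuning on the training set, test MSE, repeated many times) can be sketched in plain Python. Everything model-specific here is a stand-in: a one-dimensional ridge regression without intercept replaces the six methods of the talk, and the toy data, the penalty grid, the 25% test fraction, the 3 folds and all function names are assumptions.

```python
import random
import statistics

def stratified_split(groups, test_fraction, rng):
    """Split indices so that every group contributes the same test fraction."""
    by_group = {}
    for idx, g in enumerate(groups):
        by_group.setdefault(g, []).append(idx)
    train, test = [], []
    for members in by_group.values():
        members = members[:]
        rng.shuffle(members)
        n_test = max(1, round(test_fraction * len(members)))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge slope for the no-intercept model y = b * x."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(xs, ys, slope):
    return statistics.fmean([(y - slope * x) ** 2 for x, y in zip(xs, ys)])

def tune_lambda(xs, ys, grid, n_folds):
    """Return the penalty with the lowest K-fold cross-validated MSE."""
    def cv_error(lam):
        errors = []
        for fold in range(n_folds):
            held = [i for i in range(len(xs)) if i % n_folds == fold]
            kept = [i for i in range(len(xs)) if i % n_folds != fold]
            slope = fit_ridge_1d([xs[i] for i in kept], [ys[i] for i in kept], lam)
            errors.append(mse([xs[i] for i in held], [ys[i] for i in held], slope))
        return statistics.fmean(errors)
    return min(grid, key=cv_error)

def repeated_evaluation(xs, ys, groups, n_repeats, seed=0):
    """Repeat split / tune / fit / test and return the mean test MSE."""
    rng = random.Random(seed)
    test_errors = []
    for _ in range(n_repeats):
        train, test = stratified_split(groups, 0.25, rng)
        x_tr, y_tr = [xs[i] for i in train], [ys[i] for i in train]
        lam = tune_lambda(x_tr, y_tr, grid=[0.01, 0.1, 1.0, 10.0], n_folds=3)
        slope = fit_ridge_1d(x_tr, y_tr, lam)
        test_errors.append(mse([xs[i] for i in test], [ys[i] for i in test], slope))
    return statistics.fmean(test_errors), test_errors

# Toy data (made up): 40 observations in 4 balanced groups, y close to 3 * x.
noise = random.Random(1)
toy_x = [i / 10 for i in range(40)]
toy_y = [3 * x + noise.gauss(0, 0.1) for x in toy_x]
toy_groups = [i % 4 for i in range(40)]
mean_mse, all_mse = repeated_evaluation(toy_x, toy_y, toy_groups, n_repeats=5)
```

Tuning the penalty only on the training part keeps the test error an honest estimate, and repeating the split gives a distribution of errors rather than a single number.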
  • 33. Learning methods: Mean performances on the test sets.
      Method     Original        Daubechies      Daubechies      Daubechies      Haar            Haar            Haar
                                 Details         Full            Threshold       Details         Full            Threshold
      ELN 0.5    16.29 (1.03)    15.38 (0.9)     14.33 (1.07)    42.94 (52.25)   15.39 (1.04)    14.49 (1.03)    30.98 (16.43)
      ELN 0.25   16.12 (1.03)    15.28 (0.9)     14.35 (0.94)    44.62 (61.3)    15.2 (1)        14.47 (0.98)    32.54 (17.31)
      ELN 0.1    15.81 (0.98)    15.14 (0.77)    14.38 (0.84)    42.58 (53.83)   15.15 (0.87)    14.58 (0.92)    35.41 (19.43)
      ELN 0.75   16.31 (1.1)     15.48 (0.9)     14.43 (1.1)     42.62 (51.59)   15.44 (1.06)    14.5 (1.01)     30.31 (15.92)
      Lasso      16.37 (1.27)    15.56 (1.01)    14.45 (1.14)    41.82 (50.86)   15.56 (1.1)     14.49 (1.01)    30.8 (17.01)
      Ridge      16.82 (0.83)    16.22 (0.67)    15.56 (0.74)    41.75 (25.09)   16.16 (0.7)     15.66 (0.8)     37.58 (16.07)
      PLS        16.83 (1.1)     16.25 (0.79)    15.61 (0.87)    81.56 (116.21)  16.09 (0.87)    15.87 (0.91)    42.6 (25.14)
      RF         16.69 (0.91)    16.33 (1.36)    16.2 (1.16)     18.91 (1.66)    16.24 (1.06)    16.11 (1.09)    18.8 (1.32)
      SPLS 5     19.71 (1.63)    19.25 (1.25)    16.55 (1.18)    36.54 (31.88)   19.1 (1.63)     17.24 (1.4)     34.25 (24.99)
      SPLS 10    19.25 (1.65)    19.22 (1.23)    16.74 (1.15)    79.35 (110.56)  18.66 (1.36)    17.14 (1.25)    42.46 (23.76)
      SPLS 20    18.41 (1.5)     18.81 (1.18)    17.55 (1.2)     76.05 (104.74)  18.55 (1.2)     17.11 (1.13)    42.38 (23.74)
  • 34. Learning methods: Boxplot for the full Daubechies representation. [Figure: one boxplot per method (Lasso, Ridge, ELN 0.1, ELN 0.25, ELN 0.5, ELN 0.75, PLS, SPLS 5, SPLS 10, SPLS 20, RF), axis from 12 to 20; panel title "Daubechies wavelets − Full".]
  • 35. Learning methods: Full Daubechies representation and ELN: accuracy (on test). [Figure: predicted values against true values, both axes from 0 to 150.] Mean R² on the test sets is 89.0% (minimum 83.1%, maximum 92.8%).
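For reference, a minimal sketch of how an R² value can be computed on one test set. The slides do not say which definition was used; the coefficient of determination 1 − SS_res / SS_tot is assumed here, and the function name and toy values are hypothetical.

```python
import statistics

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy values (made up): perfect predictions, then a constant mean prediction.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
constant = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Under this definition, a model that always predicts the mean of the test values scores 0 and a perfect model scores 1, so 89.0% means most of the variance in the dose is explained.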
  • 37. Identification of relevant metabolites: Identification issue. The full learning process is the following: Spectra → Wavelet preprocessing → Learning → HR dose prediction. Hence, due to the preprocessing step, the coefficients selected by ELN are not directly related to metabolites (or to locations on the spectra).
  • 38. Identification of relevant metabolites: Adaptation of the importance measure.
      for each of the 950 variables, v, of the original data set do
          Randomize the observations of the variable v
          Compute the full Daubechies wavelet representation with the randomized observations for v
          Scale and normalize according to the true values' mean, median or variance
          for each test set, i, do
              Calculate new predictions with the false values of v and the corresponding MSE: mse_{v,i}
              Calculate the decrease in accuracy for the test set: DA_i = 1 − mse_i / mse_{v,i}
          end for
          Average DA_i over i to obtain the importance of v
      end for
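The adapted importance measure can be sketched generically in plain Python. This is an illustration under assumptions, not the code of the talk: `preprocess` stands for the wavelet transform plus scaling (with scaling statistics assumed to come from the non-randomized data), `models` is a hypothetical list of (test indices, prediction function) pairs, one per test set, and the toy model is invented.

```python
import random
import statistics

def mse(y_true, y_pred):
    return statistics.fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])

def permutation_importance(X, y, models, preprocess, seed=0):
    """X: list of raw rows (lists); models: list of (test_indices, predict).
    Returns one importance value per original variable."""
    rng = random.Random(seed)
    true_inputs = preprocess(X)
    importances = []
    for v in range(len(X[0])):
        # Randomize the observations of variable v only.
        column = [row[v] for row in X]
        rng.shuffle(column)
        X_perm = [row[:v] + [column[i]] + row[v + 1:] for i, row in enumerate(X)]
        perm_inputs = preprocess(X_perm)
        decreases = []
        for test_idx, predict in models:
            y_test = [y[i] for i in test_idx]
            mse_i = mse(y_test, [predict(true_inputs[i]) for i in test_idx])
            mse_vi = mse(y_test, [predict(perm_inputs[i]) for i in test_idx])
            # DA_i = 1 - mse_i / mse_{v,i}; defined as 0 when mse_{v,i} is 0.
            decreases.append(1 - mse_i / mse_vi if mse_vi > 0 else 0.0)
        importances.append(statistics.fmean(decreases))
    return importances

# Toy setting (made up): the target depends on variable 0 only.
toy_X = [[float(i), float(i % 3)] for i in range(8)]
toy_y = [2 * row[0] for row in toy_X]
toy_models = [
    ([0, 1, 2, 3], lambda row: 2 * row[0]),
    ([4, 5, 6, 7], lambda row: 2 * row[0]),
]
toy_importance = permutation_importance(toy_X, toy_y, toy_models,
                                        preprocess=lambda rows: rows)
```

Randomizing the original variable before the wavelet transform, rather than a wavelet coefficient after it, is what ties the importance back to a position on the spectrum.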
  • 39. Identification of relevant metabolites: Values of importance. [Figure: importance (axis from 0.0 to 0.8) against the rank of the variables (axis from 0 to 600).]
  • 40. Identification of relevant metabolites: Identification of important metabolites. [Figure: spectrum against ppm (axis from 2 to 10) with the important regions highlighted in colour.] Some have already been identified: the most important is scyllo-inositol; one of the orange ones is probably valine; one of the light yellow ones is probably trimethylamine. The others are new.
  • 41. Identification of relevant metabolites: What next? Identification of the metabolites, and study of the correlation between the ones found and the ones previously emphasized. Questions? Suggestions?