Workshop: “Statistical Methods for Omics Data Integration and Analysis“
Valencia, Spain, 14-16, 2015
INTEGRATION OF METABOLOMICS, LIPIDOMICS AND CLINICAL DATA BY
RANDOM FOREST
Animesh Acharjee¹, Zsuzsanna Ament¹, James A West¹, Elizabeth Stanley¹, Benjamin J Jenkins¹,
Albert Koulman¹ & Julian L Griffin¹,²
¹ Medical Research Council, Elsie Widdowson Laboratory, 120 Fulbourn Road, Cambridge, CB1 9NL, UK
² The Department of Biochemistry, 80 Tennis Court Road, University of Cambridge, Cambridge, CB2 1GA, UK
Introduction: Peroxisome proliferator-activated
receptors, PPAR-α, PPAR-γ, and PPAR-δ are known to
regulate systemic metabolism (Ament et al., 2012).
Beneficial effects of their activation in the treatment of a
wide array of metabolic diseases are well established.
However, they can also cause side effects and adverse
pathological changes through unknown mechanisms. In
the current study, a PPAR-pan agonist (a triple agonist of
PPAR-α, -γ, and -δ) was investigated after dietary
treatment of male Sprague–Dawley (SD) rats. In addition
to the classical toxicological tests (urinalysis and clinical
chemistry), various mass spectrometry (MS) approaches
were employed to detect liver metabolomic and lipidomic
changes, in order to define the systemic changes and
better understand the underlying toxicity.
Here we present an approach that integrates multiple
data types and combines classical clinical chemistry and
toxicology test results with MS data. First, Random Forest
(RF) (Breiman, 2001) classification was used to select
subsets of metabolites, showing that RF can build
associations and predict different dose responses. Next,
we used an RF regression approach to link liver
metabolites with clinical phenotypes from plasma and
urinalysis. Finally, an integrated network analysis was
performed, yielding a relatively small set of interrelated
metabolites that can predict the different dose levels with
high accuracy. We validated this approach by comparing
the selected metabolites to pathways known to be involved
in PPAR metabolism.
Methods: Five groups of 12 animals were
administered a PPAR-pan activator by daily oral gavage
at 30, 100, 300 and 1000 mg/kg/day for 13 weeks. Separate
satellite groups of animals (6 per group) were kept for a
4-week treatment-free period in the control, 300 and 1000
mg/kg/day dose groups. Blood and urine samples of all
animals were collected at weeks 13 and 18. At necropsy,
tissue samples were collected following an overdose of
anaesthetic (halothane Ph. Eur. Vapour). Samples were
snap-frozen in liquid nitrogen and maintained at
-80 °C until further analysis. Gas chromatography mass
spectrometry (GC-MS), direct infusion mass
spectrometry (DI-MS) and liquid chromatography tandem
mass spectrometry (LC-MS/MS) methods were set up,
optimized, and used to measure hepatic (i) total fatty acids
(GC-MS), (ii-iii) intact lipids by DI-MS (pos. and neg.
ionization modes), (iv-v) intact lipids by LC-MS/MS (pos.
and neg. ionization modes), (vi) acyl-carnitines (targeted
LC-MS/MS), (vii) eicosanoids (SPE followed by
LC-MS/MS), and (viii-x) aqueous metabolites (open profiling
(pos. and neg.) and targeted LC-MS/MS), generating a
total of 9 datasets comprising over 1500 variables, in
addition to the clinical-chemical parameters (CCPs)
of plasma (33 variables), urinalysis (12 variables) and
relative liver weight (ratio of liver weight to body weight).
Random Forest (RF) was used in both classification
and regression modes for the different data types, including
liver metabolites and CCPs from urine and plasma. Using the
metabolites selected by the classification approach, RFs
were iteratively fitted so that they yielded the smallest
out-of-bag (OOB) error rates (Díaz-Uriarte and De Andres,
2006). Further, we included permutation tests to calculate
the significance of the associations of the metabolites with
CCPs. For integrated network analysis, we used partial
correlation, because it can distinguish between
direct and indirect associations.
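The abstract does not state which statistic the permutation tests used, so the following is only a minimal sketch of one common choice: the response is shuffled repeatedly and the OOB R² of an RF regression on the real response is compared with the values obtained under shuffling. The function names (`oob_r2`, `permutation_p_value`), the synthetic data, the number of permutations and the use of scikit-learn are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def oob_r2(X, y, seed=0):
    """Fit an RF regression and return its out-of-bag R^2."""
    rf = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=seed)
    rf.fit(X, y)
    return rf.oob_score_


def permutation_p_value(X, y, n_perm=30, seed=0):
    """Compare the observed OOB R^2 with its distribution under a shuffled response."""
    rng = np.random.default_rng(seed)
    observed = oob_r2(X, y, seed)
    # Shuffling y breaks any real link between predictors and response.
    null = np.array([oob_r2(X, rng.permutation(y), seed) for _ in range(n_perm)])
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


# Synthetic stand-in for "metabolites vs. one clinical parameter":
# only column 0 carries signal (hypothetical data, not from the study).
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 8))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=50)

observed_r2, p_value = permutation_p_value(X, y)
print(f"OOB R^2 = {observed_r2:.2f}, permutation p = {p_value:.3f}")
```

With only 30 permutations the smallest attainable p-value is 1/31; a real analysis would typically use many more permutations.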
Results: RF classification differentiated dose-response
effects across all metabolomic and lipidomic datasets, and
the regression approach was successfully applied to link
CCPs with metabolomic and lipidomic data.
Classification approach: The administered doses
were treated as the multiclass response, whilst
metabolomic and lipidomic data were treated as predictor
sets. RF was applied in classification mode and the OOB
misclassification error rate was calculated for the
individual datasets. Positive ion mode DI-MS intact lipids
and eicosanoid method variables had the lowest OOB
errors, at 36% each. Metabolites were selected using a
backward elimination approach (Díaz-Uriarte and De
Andres, 2006). From each of the 9 datasets, important
variables were selected, narrowing 1538 variables down to 57.
Using the selected variables only, the RF OOB error for dose
prediction was 22%. Applying the backward elimination
process again, we were able to select the most
discriminatory variables, further reducing the total number
of variables to 15 (Figure 1). These were selected across 5
datasets, further reducing the OOB error to 21%.
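The selection loop described above can be sketched as follows. This is a simplified variant in the spirit of Díaz-Uriarte and De Andres (2006), not a reproduction of their procedure or of the authors' code: it recomputes impurity-based importances at every step and keeps the feature set with the lowest OOB error (preferring the smaller set on ties). The helper names, the `drop_fraction` parameter, the synthetic three-class data and the use of scikit-learn are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def oob_error(X, y, seed=0, n_trees=200):
    """Return the OOB misclassification rate and the feature importances."""
    rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True, random_state=seed)
    rf.fit(X, y)
    return 1.0 - rf.oob_score_, rf.feature_importances_


def backward_eliminate(X, y, drop_fraction=0.2, min_features=2, seed=0):
    """Iteratively drop the least important features, tracking the best OOB error."""
    kept = np.arange(X.shape[1])
    best_err, best_kept = None, kept
    while True:
        err, importances = oob_error(X[:, kept], y, seed)
        if best_err is None or err <= best_err:
            best_err, best_kept = err, kept.copy()
        if len(kept) <= min_features:
            break
        n_drop = max(1, int(drop_fraction * len(kept)))
        n_keep = max(min_features, len(kept) - n_drop)
        order = np.argsort(importances)[::-1]  # most important first
        kept = np.sort(kept[order[:n_keep]])
    return best_kept, best_err


# Hypothetical data: 3 "dose" classes, 30 variables, of which only 0-2 are informative.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 20)
X = rng.normal(size=(60, 30))
X[:, 0] += 2.0 * y
X[:, 1] -= 2.0 * y
X[:, 2] += 3.0 * (y == 1)

full_err, _ = oob_error(X, y)
selected, best_err = backward_eliminate(X, y)
print(f"all variables: OOB error {full_err:.2f}")
print(f"selected {list(selected)}: OOB error {best_err:.2f}")
```

Because the OOB error is also used to choose the final subset, it is an optimistic estimate of how the selected variables would perform on new animals.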
Regression approach: We linked the metabolomic and
lipidomic data to clinical phenotypes: relative liver
weight, urine CCPs and plasma CCPs.
Relative liver weight: Intact lipids measured by
positive mode DI-MS explained the most variation (84%),
whereas the least variation was explained by intact lipids
in negative mode
(32%). In total, 42 variables were selected (out of 1538),
together explaining 82% of the variation.
Urinalysis: Urine colour and turbidity were best
explained by selected intact lipid DI-MS (pos. and neg.),
eicosanoid, and intact lipid LC-MS/MS (pos.) data
variables. Of the variation (R²) in urine colour, 54% was
explained by 24 variables, and 60% of the variation in
turbidity was explained by 23 variables.
Plasma Clinical Chemistry: For aspartate
aminotransferase (AST, IU/L), albumin (g/L) and glucose
(mmol/L), 52, 37 and 44% of the variation was explained
by 24, 31 and 27 variables, respectively, drawn from the
intact lipid DI-MS (pos. and neg.), total fatty acid GC-MS
and eicosanoid data.
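The percentages of explained variation reported above can be read as R² values from RF regression. A minimal sketch of how such a figure is obtained is given below, assuming it is computed on out-of-bag predictions (the abstract does not say so explicitly). The synthetic predictors and response stand in for metabolite data and a clinical parameter, and the use of scikit-learn is likewise an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical data: 60 animals, 40 metabolite variables, and one clinical
# parameter that depends on the first two variables plus noise.
rng = np.random.default_rng(1)
n_samples, n_features = 60, 40
X = rng.normal(size=(n_samples, n_features))
y = 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_samples)

# oob_score=True makes the forest score each sample using only the trees
# that did not see it during training.
rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=1)
rf.fit(X, y)

oob_r2 = rf.oob_score_  # fraction of variance explained on OOB predictions
top = np.argsort(rf.feature_importances_)[::-1][:5]  # five most important variables

print(f"variation explained (OOB R^2): {100 * oob_r2:.0f}%")
print("most important variables:", top.tolist())
```

Unlike the training-set R², the OOB value can be negative when the predictors carry no information about the response.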
Network analysis: An integrated network was built
using a partial correlation approach (Figure 1).
Figure 1: Partial correlation network of the most discriminatory
variables (15) differentiating between dose levels. Metabolites
from different data matrices are in different colours: total fatty
acids GC-MS (yellow); eicosanoid open profiling (red); intact
lipids from DI-MS neg. (purple) and pos. (blue) mode; and
acyl-carnitines (green). Dotted lines represent negative and solid
lines positive partial correlation coefficients. Eico_X
represents unknown small molecules measured in the
eicosanoid assay by LC-MS/MS.
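The reason partial correlation separates direct from indirect associations can be illustrated with a small worked example. In the sketch below, three synthetic variables form a chain a → b → c, so a and c are strongly correlated only through b. Partial correlations are derived from the inverse covariance (precision) matrix, which is one standard estimator and not necessarily the one used for Figure 1; the function name, the data and the edge threshold of 0.2 are assumptions.

```python
import numpy as np


def partial_correlation(data):
    """Partial correlation of each pair of columns, given all other columns."""
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.pinv(cov)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Chain a -> b -> c: a influences c only indirectly, through b.
rng = np.random.default_rng(3)
n = 2000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
data = np.column_stack([a, b, c])

corr = np.corrcoef(data, rowvar=False)
pcor = partial_correlation(data)

# Keep an edge only where the partial correlation is clearly non-zero.
edges = [(i, j) for i in range(3) for j in range(i + 1, 3) if abs(pcor[i, j]) > 0.2]

print("marginal corr(a, c):", round(corr[0, 2], 2))
print("partial  corr(a, c | b):", round(pcor[0, 2], 2))
print("network edges:", edges)
```

The marginal correlation between a and c is high, yet the a-c edge disappears from the network once b is conditioned on. This estimator needs more samples than variables to be stable, which is one practical reason to build the network on a small selected subset.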
Discussion: We analysed, processed and explored
multiple liver metabolomics and lipidomics datasets along
with CCPs measured from plasma and urine. RF
classification was successfully employed in the metabolite
selection process. It allowed us not only to combine 9
different types of metabolite data from multiple platforms
but also to focus on the 15 most discriminatory
metabolites for data interpretation and biological
understanding, while increasing predictive ability at the
same time. Furthermore, RF regression proved useful
as an interdisciplinary approach for joining classical
toxicology with modern metabolomics and lipidomics
data. Four broad themes emerged from the analysis. Firstly,
the 15 selected metabolites include only lipids and no
aqueous compounds, reflecting the intimate role of PPARs
in lipidomic remodelling. Secondly, changes in
acyl-carnitines (C4-DC and C5:1) are suggestive of aciduria,
more specifically 2-methyl-3-hydroxybutyric aciduria.
PPARs are known to regulate mitochondrial lipid
metabolism, and aciduria is commonly reported in
mitochondrial disorders, which could suggest a
common pathophysiological mechanism of damage.
Thirdly, although the discriminatory free fatty acids
C20:3, C22:5 and C20:5 all have the potential to feed into
the eicosanoid cascade, no associations were found with
compounds detected by the open profiling eicosanoid
method. This could explain the inability to identify these
eicosanoid method metabolites, and highlights the
importance of a targeted approach when these molecules
are measured. Finally, the odd-chain saturated fatty acid
C17:0, commonly considered a simple marker of ruminant
fat intake, was also found to be important and highly
discriminatory, leading us to speculate further on
suggestions linking this fatty acid to fatty acid
α-oxidation (Jenkins et al., 2015).
In addition, CCPs were successfully combined with
metabolomic and lipidomic datasets, highlighting
unexpected connections, such as that between liver lipid
status and urine turbidity. The liver-related biochemical
parameters AST (hepatic leakage enzyme) and albumin
(indicative of altered liver synthetic function) were also
linked with several decreasing phospholipids, a class of
compounds with well-established hepatoprotective effects
(Küllenberg et al., 2012).
Conclusion: In this study, we demonstrate a
strategy for integrating multiple omics datasets using
RF and selecting discriminatory metabolites for partial
correlation network analysis. Previously, Acharjee and
co-workers (Acharjee et al., 2011) integrated plant gene
expression and metabolomics data using RF regression.
However, to the best of our knowledge, no such integrative
approach has been used to link classical hepatic
parameters with metabolomic and/or lipidomic datasets.
RF has proved to be a reliable and useful method for
integrative data interpretation and can assist hypothesis
generation. We also hope that linking classical
toxicology parameters with metabolite markers will
facilitate earlier and more accurate detection of toxicity.
References:
Acharjee, A., Kloosterman, B., de Vos, R.C., Werij, J.S.,
Bachem, C.W., Visser, R.G., Maliepaard, C., 2011.
Data integration and network reconstruction with
~omics data using Random Forest regression in potato.
Analytica Chimica Acta 705, 56-63.
Ament, Z., Masoodi, M., Griffin, J.L., 2012. Applications of
metabolomics for understanding the action of
peroxisome proliferator-activated receptors (PPARs)
in diabetes, obesity and cancer. Genome Med 4, 32.
Breiman, L., 2001. Random forests. Machine Learning 45, 5-32.
Díaz-Uriarte, R., De Andres, S.A., 2006. Gene selection and
classification of microarray data using random forest.
BMC Bioinformatics 7, 3.
Jenkins, B., West, J.A., Koulman, A., 2015. A Review of
Odd-Chain Fatty Acid Metabolism and the Role of
Pentadecanoic Acid (C15:0) and Heptadecanoic Acid
(C17:0) in Health and Disease. Molecules 20, 2425-2444.
Küllenberg, D., Taylor, L.A., Schneider, M., Massing, U., 2012.
Health effects of dietary phospholipids. Lipids Health
Dis 11, 1-16.

More Related Content

What's hot

TCA for Aminoglycoside Compounds
TCA for  Aminoglycoside CompoundsTCA for  Aminoglycoside Compounds
TCA for Aminoglycoside CompoundsDeqing Xiao
 
acs.jmedchem.5b00495
acs.jmedchem.5b00495acs.jmedchem.5b00495
acs.jmedchem.5b00495Justin Murray
 
Antihypertensive Peptides; Synthesis, Properties and Application in Foods
Antihypertensive Peptides; Synthesis, Properties and Application in FoodsAntihypertensive Peptides; Synthesis, Properties and Application in Foods
Antihypertensive Peptides; Synthesis, Properties and Application in FoodsAkshay Ramani
 
Aubrey Perrine Research Poster 2016
Aubrey Perrine Research Poster 2016Aubrey Perrine Research Poster 2016
Aubrey Perrine Research Poster 2016Aubrey Perrine
 
A study on the antioxidant defense system in breast cancer patients.
A study on the antioxidant defense system in breast cancer patients.A study on the antioxidant defense system in breast cancer patients.
A study on the antioxidant defense system in breast cancer patients.Alexander Decker
 
Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...
Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...
Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...Van-Tinh Nguyen
 
Effect of feed at different times prior to exercise and chelated chromium sup...
Effect of feed at different times prior to exercise and chelated chromium sup...Effect of feed at different times prior to exercise and chelated chromium sup...
Effect of feed at different times prior to exercise and chelated chromium sup...Lilian De Rezende Jordão
 
Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...
Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...
Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...degarden
 
Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)
Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)
Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)Tony Ng
 
¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...
¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...
¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...ShreyaMandal4
 
special project cyp2e1 report
special project cyp2e1 reportspecial project cyp2e1 report
special project cyp2e1 reportKemal Asik
 
Biochemistry tom murray (poster)
Biochemistry tom murray (poster)Biochemistry tom murray (poster)
Biochemistry tom murray (poster)Tommurray111
 

What's hot (20)

TCA for Aminoglycoside Compounds
TCA for  Aminoglycoside CompoundsTCA for  Aminoglycoside Compounds
TCA for Aminoglycoside Compounds
 
Milton 1996
Milton 1996Milton 1996
Milton 1996
 
I0342048053
I0342048053I0342048053
I0342048053
 
acs.jmedchem.5b00495
acs.jmedchem.5b00495acs.jmedchem.5b00495
acs.jmedchem.5b00495
 
Antihypertensive Peptides; Synthesis, Properties and Application in Foods
Antihypertensive Peptides; Synthesis, Properties and Application in FoodsAntihypertensive Peptides; Synthesis, Properties and Application in Foods
Antihypertensive Peptides; Synthesis, Properties and Application in Foods
 
2000 j ethnoph 69 207
2000 j ethnoph 69 2072000 j ethnoph 69 207
2000 j ethnoph 69 207
 
Aubrey Perrine Research Poster 2016
Aubrey Perrine Research Poster 2016Aubrey Perrine Research Poster 2016
Aubrey Perrine Research Poster 2016
 
1 Lactaptin
1 Lactaptin1 Lactaptin
1 Lactaptin
 
A study on the antioxidant defense system in breast cancer patients.
A study on the antioxidant defense system in breast cancer patients.A study on the antioxidant defense system in breast cancer patients.
A study on the antioxidant defense system in breast cancer patients.
 
Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...
Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...
Tetrameric peptide purified from hydrolysates of biodiesel byproducts of nann...
 
Effect of feed at different times prior to exercise and chelated chromium sup...
Effect of feed at different times prior to exercise and chelated chromium sup...Effect of feed at different times prior to exercise and chelated chromium sup...
Effect of feed at different times prior to exercise and chelated chromium sup...
 
Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...
Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...
Mitochondrial dysfunction and the pathophysiology of Myalgic Encephalomyeliti...
 
Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)
Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)
Investigating Novel Methods to Reduce Cholesterol Levels (Research Report)
 
Matsunami et al., 2010 4th
Matsunami et al., 2010 4thMatsunami et al., 2010 4th
Matsunami et al., 2010 4th
 
pone.0143384
pone.0143384pone.0143384
pone.0143384
 
¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...
¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...
¹³C Isotope Tracing to Delineate Altered Choline Metabolism in Proliferating ...
 
special project cyp2e1 report
special project cyp2e1 reportspecial project cyp2e1 report
special project cyp2e1 report
 
Biochemistry tom murray (poster)
Biochemistry tom murray (poster)Biochemistry tom murray (poster)
Biochemistry tom murray (poster)
 
ASCB poster
ASCB posterASCB poster
ASCB poster
 
1st pre sentation journal... 15feb.2015
1st pre sentation journal... 15feb.20151st pre sentation journal... 15feb.2015
1st pre sentation journal... 15feb.2015
 

Similar to Abstract_SMODIA2015_Acharjee_et_al

Central Lechera Asturiana, estudio de intervención Naturlinea
Central Lechera Asturiana, estudio de intervención Naturlinea Central Lechera Asturiana, estudio de intervención Naturlinea
Central Lechera Asturiana, estudio de intervención Naturlinea Central_Lechera_Asturiana
 
Hepatic and serum lipid signatures specific to NASH in mouse models
Hepatic and serum lipid signatures specific to NASH in mouse modelsHepatic and serum lipid signatures specific to NASH in mouse models
Hepatic and serum lipid signatures specific to NASH in mouse modelsFranck Chiappini
 
Systems Pharmacology as a tool for future therapy development: a feasibility ...
Systems Pharmacology as a tool for future therapy development: a feasibility ...Systems Pharmacology as a tool for future therapy development: a feasibility ...
Systems Pharmacology as a tool for future therapy development: a feasibility ...Guide to PHARMACOLOGY
 
COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...
COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...
COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...EDITOR IJCRCPS
 
Exetimibe melhora dm diminuindo nafld.x
Exetimibe melhora dm diminuindo nafld.xExetimibe melhora dm diminuindo nafld.x
Exetimibe melhora dm diminuindo nafld.xRuy Pantoja
 
Metformin presentation sigma xi
Metformin presentation sigma xiMetformin presentation sigma xi
Metformin presentation sigma xiLaceyg92
 
Core Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver Disease
Core Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver DiseaseCore Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver Disease
Core Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver DiseaseIOSR Journals
 
Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...
Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...
Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...CrimsonPublishersIOD
 
Proteome-wide covalent ligand discovery in native biological systems
Proteome-wide covalent ligand discovery in native biological systemsProteome-wide covalent ligand discovery in native biological systems
Proteome-wide covalent ligand discovery in native biological systemsMegha Majumder
 
Sr Creatinine estimation journal dr.prathy.pptx
Sr Creatinine estimation journal dr.prathy.pptxSr Creatinine estimation journal dr.prathy.pptx
Sr Creatinine estimation journal dr.prathy.pptxprathyushameravala
 
Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...
Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...
Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...iosrjce
 
Poster on systems pharmacology of the cholesterol biosynthesis pathway
Poster on systems pharmacology of the cholesterol biosynthesis pathwayPoster on systems pharmacology of the cholesterol biosynthesis pathway
Poster on systems pharmacology of the cholesterol biosynthesis pathwayGuide to PHARMACOLOGY
 
Accelrys UGM slides 2011
Accelrys UGM slides 2011Accelrys UGM slides 2011
Accelrys UGM slides 2011Sean Ekins
 
Metabolism and Weight Loss effect
Metabolism and Weight Loss effectMetabolism and Weight Loss effect
Metabolism and Weight Loss effectsilver1111
 
dan.crawford.project.final
dan.crawford.project.finaldan.crawford.project.final
dan.crawford.project.finalDan Crawford
 

Similar to Abstract_SMODIA2015_Acharjee_et_al (20)

Central Lechera Asturiana, estudio de intervención Naturlinea
Central Lechera Asturiana, estudio de intervención Naturlinea Central Lechera Asturiana, estudio de intervención Naturlinea
Central Lechera Asturiana, estudio de intervención Naturlinea
 
Hepatic and serum lipid signatures specific to NASH in mouse models
Hepatic and serum lipid signatures specific to NASH in mouse modelsHepatic and serum lipid signatures specific to NASH in mouse models
Hepatic and serum lipid signatures specific to NASH in mouse models
 
Systems Pharmacology as a tool for future therapy development: a feasibility ...
Systems Pharmacology as a tool for future therapy development: a feasibility ...Systems Pharmacology as a tool for future therapy development: a feasibility ...
Systems Pharmacology as a tool for future therapy development: a feasibility ...
 
COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...
COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...
COMPARISON OF SERUM LEVELS OF ZINC AND LEPTIN IN FEMALE ENDURANCE AND SPRINTI...
 
Exetimibe melhora dm diminuindo nafld.x
Exetimibe melhora dm diminuindo nafld.xExetimibe melhora dm diminuindo nafld.x
Exetimibe melhora dm diminuindo nafld.x
 
Metformin presentation sigma xi
Metformin presentation sigma xiMetformin presentation sigma xi
Metformin presentation sigma xi
 
1479 5876-12-153
1479 5876-12-1531479 5876-12-153
1479 5876-12-153
 
Core Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver Disease
Core Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver DiseaseCore Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver Disease
Core Components of the Metabolic Syndrome in Nonalcohlic Fatty Liver Disease
 
Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...
Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...
Alterations of Mitochondrial Functions and DNA in Diabetic Cardiomyopathy of ...
 
ASMS Poster final
ASMS Poster finalASMS Poster final
ASMS Poster final
 
Senior Thesis- Poster
Senior Thesis- PosterSenior Thesis- Poster
Senior Thesis- Poster
 
Proteome-wide covalent ligand discovery in native biological systems
Proteome-wide covalent ligand discovery in native biological systemsProteome-wide covalent ligand discovery in native biological systems
Proteome-wide covalent ligand discovery in native biological systems
 
Sr Creatinine estimation journal dr.prathy.pptx
Sr Creatinine estimation journal dr.prathy.pptxSr Creatinine estimation journal dr.prathy.pptx
Sr Creatinine estimation journal dr.prathy.pptx
 
Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...
Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...
Biochemical Study on Endothelial Nitric Oxide Gene Polymorphism in Fatty Live...
 
Poster on systems pharmacology of the cholesterol biosynthesis pathway
Poster on systems pharmacology of the cholesterol biosynthesis pathwayPoster on systems pharmacology of the cholesterol biosynthesis pathway
Poster on systems pharmacology of the cholesterol biosynthesis pathway
 
Nutrigenomics
NutrigenomicsNutrigenomics
Nutrigenomics
 
Accelrys UGM slides 2011
Accelrys UGM slides 2011Accelrys UGM slides 2011
Accelrys UGM slides 2011
 
Resaerch paper 2
Resaerch paper 2Resaerch paper 2
Resaerch paper 2
 
Metabolism and Weight Loss effect
Metabolism and Weight Loss effectMetabolism and Weight Loss effect
Metabolism and Weight Loss effect
 
dan.crawford.project.final
dan.crawford.project.finaldan.crawford.project.final
dan.crawford.project.final
 

Abstract_SMODIA2015_Acharjee_et_al

  • 1. Workshop: “Statistical Methods for Omics Data Integration and Analysis“ Valencia, Spain, 14-16, 2015 1 INTEGRATION OF METABOLOMICS, LIPIDOMICS AND CLINICAL DATA BY RANDOM FOREST Animesh Acharjee1 , Zsuzsanna Ament1 , James A West1 , Elizabeth Stanley1 , Benjamin J Jenkins1 , Albert Koulman1 & Julian L Griffin1,2 1 Medical Research Council, Elsie Widdowson Laboratory, 120 Fulbourn Road, Cambridge, CB1 9NL, UK, 2 The Department of Biochemistry; 80 Tennis Court Road, University of Cambridge, Cambridge, CB2 1GA, UK. Introduction: Peroxisome proliferator-activated receptors, PPAR-α, PPAR-γ, and PPAR-δ are known to regulate systemic metabolism (Ament et al., 2012). Beneficial effects of their activation in the treatment of a wide array of metabolic diseases are well established. However, they can also cause side effects and adverse pathological changes through unknown mechanisms. In the current study, a PPAR-pan agonist (a triple agonist of PPAR-α, -γ, and -δ) was investigated after dietary treatment of male Sprague–Dawley (SD) rats. In addition to the classical toxicological tests (urinalysis and clinical chemistry) various mass spectrometry (MS) approaches for the detection of liver metabolomic and lipidomic changes were employed, in order to define the systemic changes and better understand the underlying toxicity. Here we present an approach, which is able to integrate multiple data types and successfully combine classical clinical chemistry and toxicology test results with MS data. First, Random Forest (RF) (Breiman, 2001) classification was used to select subsets of metabolites showing that RF is successful in building associations and predicting different dose responses. Next, we used RF regression approach to link liver metabolites with clinical phenotypes from plasma and urinalysis. 
Finally, an integrated network analysis was performed providing a relatively small sets of interrelated metabolites which can predict the different dose levels with high accuracy. We validate this approach by comparing the selected metabolites to pathways known to be involved in PPAR metabolism. Methods: Five groups of 12 animals were administered a PPAR-pan activator by daily oral gavage at 30, 100, 300, 1000 mg/kg/day for 13 weeks. A separate satellite group of animals (6 per group) were kept for a 4 week treatment free period in the control, 300 and 1000 mg/kg/day dose groups. Blood and urine samples of all animals were collected at week 13 and 18. At necropsy, tissue samples were collected following an overdose of anaesthetic (halothane Ph. Eur. Vapour). Samples were snap-frozen in liquid nitrogen and were maintained at -80 °C until further analysis. Gas chromatography mass spectrometry, (GC-MS), direct infusion mass spectrometry (DI-MS) and liquid chromatography tandem mass spectrometry (LC-MS/MS) methods were set up, optimized, and used to measure hepatic (i) total fatty acids (GC-MS), (ii-iii) intact lipids by DI-MS (pos. and neg. ionization mode) (iv-v) intact lipids by LC-MS/MS (pos. and neg. ionization modes) (vi) acyl-carnitines (targeted LC-MS/MS), (vii) eicosanoids (SPE followed by LC- MS/MS), and (viii-x) aqueous metabolites (open profiling (pos. and neg.) and targeted LC-MS/MS) generating a total of 9 datasets comprising over 1500 variables in addition to those of clinical-chemical parameters (CCPs) of plasma (33 variables), urinalysis (12 variables) and relative liver weight (body and liver weight ratio) Random Forest (RF) was used for both classification and regression mode for different data types including liver metabolites and CCPs from urine and plasma. Using the select metabolites from the classification approach, RFs were iteratively fitted, so that they yielded the smallest out of bag (OOB) error rates (Díaz-Uriarte and De Andres, 2006). 
Further, we included permutation tests calculating the significance of the associations of the metabolites with CCPs. For integrated network analysis, we used partial correlation, because it has the ability to distinguish between direct and indirect associations. Result: RF classification differentiated dose response effects across all metabolomic and lipidomic datasets and the regression approach was successfully applied to link CCPs with metabolomic and lipidomic data. Classification approach: The different doses administered were treated as multiclass parameters whilst metabolomic and lipidomic data were treated as predictor sets. RF was applied in classification mode and OOB misclassification error rate was calculated for the individual data sets. Positive ion mode DI-MS intact lipids and eicosanoid method variables were found to have the lowest OOB errors of 36% each. Metabolites were selected using backward elimination approach (Díaz-Uriarte and De Andres, 2006). From each of the 9 dataset, important variables were selected focusing down to 57 out of 1538. Using the selected variables only, RF OOB error for dose prediction was 22%. Again, applying the backward elimination process, we were able to select the most discriminatory variables, further reducing the total number of variables to 15 (Figure 1). These were selected across 5 data sets, further reducing the OOB error to 21%. Regression approach: We linked different clinical phenotypes such as relative liver weight, urine and plasma CCPs. Relative liver weight: Intact lipid pos. DI-MS were found to explain the highest variation (84%), the lowest variation was explained by intact lipid in negative mode
  • 2. Workshop: “Statistical Methods for Omics Data Integration and Analysis“ Valencia, Spain, 14-16, 2015 2 (32%). In total, 42 variables were selected (out of 1538) explaining a striking 82%. Urinalysis: Urine colour and turbidity was best explained by selected intact lipid DI-MS (pos. and neg.), eicosanoid, and intact lipid LC-MS/MS (pos.) data variables. Variation (R2 ) in urine colour was explained 54% by 24 variables and 60% of the variation in turbidity was explained by 23 variables. Plasma Clinical Chemistry: Aspartate aminotransferase (AST, IU/L), albumin (g/L) and glucose (mmol/L) variations were explained by 52, 37 and 44% using 24, 31, 27 variables respectively using intact lipid DI- MS (pos. and neg.), total fatty acid GC-MS and eicosanoid data. Network analysis: An integrated network was built using partial correlation approach shown in figure 1. Figure 1: Partial correlation network of the most discriminatory variables (15) differentiating between dose levels. Metabolites form different data matrices are in different colours: total fatty acids GC-MS (yellow); eicosanoid open profiling (red); intact lipids from DI-MS neg. (purple); and pos. (blue) mode; and acyl- carnitines (green). The dotted lines represent negative, the solid lines positive partial correlation coefficients. Eico_X is representative of unknown small molecules as measured in the eicosanoid assay by LC-MS/MS. Discussions: We analysed, processed and explored multiple liver metabolomics and lipidomics datasets along with CCPs measured from plasma and urine. Firstly, RF classification was successfully employed in the metabolite selection process and allowed us to not only combine 9 different types of metabolite data from multiple platforms but also to focus our attention to the most discriminatory 15 metabolites for data interpretation and biological understanding, while increasing the predictive ability at the same time. 
Furthermore, RF regression proved useful as an interdisciplinary approach, joining classical toxicology with modern metabolomics and lipidomics data. Four broad themes emerged from the analysis. Firstly, the selected 15 metabolites include only lipids and no aqueous compounds, reflecting the intimate role of PPARs in lipidomic remodelling. Secondly, changes in acyl-carnitines (C4-DC and C5:1) are suggestive of aciduria, more specifically 2-methyl-3-hydroxybutyric aciduria. PPARs are known to regulate mitochondrial lipid metabolism, and aciduria is commonly reported in mitochondrial disorders, which could suggest a common pathophysiological mechanism of damage. Thirdly, although the discriminatory free fatty acids C20:3, C22:5 and C20:5 all have the potential to feed into the eicosanoid cascade, no associations were found with compounds detected by the open profiling eicosanoid method. This could explain the inability to identify these eicosanoid method metabolites, and highlights the importance of a targeted approach when these molecules are measured. Finally, the odd-chain saturated fatty acid C17:0, commonly considered a simple marker of ruminant fat intake, was also found to be important and highly discriminatory, leading us to speculate further on suggestions linking this fatty acid to fatty acid α-oxidation (Jenkins et al., 2015).
In addition, CCPs were successfully combined with the metabolomic and lipidomic datasets, highlighting unexpected connections such as that between liver lipid status and urine turbidity. The liver-related biochemical parameters AST (a hepatic leakage enzyme) and albumin (indicative of altered liver synthetic function) were also linked with several decreasing phospholipids, a class of compounds with well-established hepatoprotective effects (Küllenberg et al., 2012).
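How a permutation test can attach a significance level to a metabolite-CCP association is sketched below in Python. This is an illustration under stated assumptions: it uses the absolute Pearson correlation as the test statistic rather than an RF-based measure, and the function names, the number of permutations and the toy values are all hypothetical and are not data from the study.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the association between x and y.

    The y values are shuffled repeatedly to break any real link with x;
    the p-value is the share of shuffles whose |r| is at least as large
    as the observed |r| (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy values: one metabolite intensity and one CCP per animal.
metabolite = [0.8, 1.1, 1.9, 2.4, 3.1, 3.3, 4.2, 4.8, 5.5, 6.1]
ccp = [10.2, 11.0, 12.1, 12.9, 14.2, 14.0, 15.8, 16.1, 17.5, 18.3]

p_value = permutation_p_value(metabolite, ccp)
print(f"permutation p-value: {p_value:.4f}")
```

The same scheme carries over to an RF setting by permuting the response, refitting the model and comparing the observed variance explained or variable importance with its permutation distribution, at a correspondingly higher computational cost.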
Conclusion: In this study, we demonstrate a powerful strategy for integrating multiple ~omics datasets using RF and for selecting discriminatory metabolites for partial correlation network analysis. Previously, Acharjee and co-workers (Acharjee et al., 2011) integrated plant gene expression and metabolomics data using RF regression; however, to the best of our knowledge, no such integrative approach has been used to link classical hepatic parameters with metabolomic and/or lipidomic datasets. RF has proved to be a reliable and useful method for integrative data interpretation, which can assist hypothesis generation. We also hope that linking classical toxicology parameters with metabolite markers will facilitate more accurate and earlier detection of toxicity.
References:
Acharjee, A., Kloosterman, B., de Vos, R.C., Werij, J.S., Bachem, C.W., Visser, R.G., Maliepaard, C., 2011. Data integration and network reconstruction with ~omics data using Random Forest regression in potato. Analytica Chimica Acta 705, 56-63.
Ament, Z., Masoodi, M., Griffin, J.L., 2012. Applications of metabolomics for understanding the action of peroxisome proliferator-activated receptors (PPARs) in diabetes, obesity and cancer. Genome Med 4, 32.
Breiman, L., 2001. Random forests. Machine Learning 45, 5-32.
Díaz-Uriarte, R., De Andres, S.A., 2006. Gene selection and classification of microarray data using random forest. BMC Bioinformatics 7, 3.
Jenkins, B., West, J.A., Koulman, A., 2015. A Review of Odd-Chain Fatty Acid Metabolism and the Role of Pentadecanoic Acid (C15:0) and Heptadecanoic Acid (C17:0) in Health and Disease. Molecules 20, 2425-2444.
Küllenberg, D., Taylor, L.A., Schneider, M., Massing, U., 2012. Health effects of dietary phospholipids. Lipids Health Dis 11, 1-16.