We present a novel approach that integrates biochemical pathway and chemical relationships to map all detected metabolites in network graphs (MetaMapp), using the KEGG reactant pair database, Tanimoto chemical similarity scores and NIST mass spectral similarity scores. In fetal and maternal lungs, and in maternal blood plasma from pregnant rats exposed to environmental tobacco smoke (ETS), 459 unique metabolites comprising 179 structurally identified compounds were detected by gas chromatography time-of-flight mass spectrometry (GC-TOF MS) and BinBase data processing. MetaMapp graphs in Cytoscape showed much clearer metabolic modularity and complete content visualization compared to conventional biochemical mapping approaches. Cytoscape visualization of differential statistics results using these graphs showed that, overall, fetal lung metabolism was more impaired than lung and blood metabolism in dams. Fetuses from ETS-exposed dams expressed lower lipid and nucleotide levels and higher amounts of energy metabolism intermediates than control animals, indicating lower biosynthetic rates of metabolites for cell division, structural proteins and lipids that are critical for lung development.
MetaMapp efficiently visualizes mass spectrometry-based metabolomics datasets as network graphs in Cytoscape and highlights metabolic alterations that can be associated with a higher rate of pulmonary diseases and infections in children prenatally exposed to ETS. The MetaMapp scripts can be accessed at http://metamapp.fiehnlab.ucdavis.edu.
2. [Figure: workflow diagram; WCMC, UC Davis]
• STUDY DESIGN
• SAMPLING
• EXTRACTION
• DATA ACQUISITION: Separation, Detection
• DATA PROCESSING: File Conversion, Baseline Correction, Peak Detection, Deconvolution, Adduct Annotation, Alignment, Gap Filling
• COMPOUND IDENTIFICATION: Molecular Formula ID, Structure ID, MS Library Search, Database Search, In silico Fragmentation
• STATISTICS: Normalization, Univariate Analysis (Parametric, Nonparametric), Multivariate Analysis (Unsupervised, Supervised)
• BIOLOGICAL INTERPRETATION: Pathway Mapping, Network Enrichment
• VALIDATION
3. Questions:
• Why a network graph?
• How to create a biochemical network map of identified metabolites?
• How to include all the identified metabolites in a network?
• How to visualize and make publication-ready network graphs?
• How to use the MetaMapp and Cytoscape software?
4. What is a network graph ?
A network graph represents entities as nodes (dots) and various
relationships among them as edges (links).
[Figure: An example network graph with nodes A, B, C, D and E; edges are labeled "relationship X"]
Nodes can be: genes, proteins, reactions, metabolites.
Edges can be: correlation, reactions, reaction pairs, pathways, chemical similarity, mass spectral similarity.
Edges can have direction, like A → B or B → A.
Notable examples:
Air transportation network
Citation/ co-author network
Social network
Metabolic network
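The node and edge idea can be sketched in a few lines of Python. This is only an illustration: the node names A to E and the edge label come from the example figure, while the particular connections and the helper `build_adjacency` are made up here.

```python
from collections import defaultdict

# Each edge is (source node, relationship, target node).
# The connections below are invented for illustration.
edges = [
    ("A", "relationship X", "B"),
    ("A", "relationship X", "C"),
    ("C", "relationship X", "D"),
    ("D", "relationship X", "E"),
]

def build_adjacency(edge_list, directed=False):
    """Return {node: set of neighbours}; undirected edges are stored both ways."""
    adjacency = defaultdict(set)
    for source, _relationship, target in edge_list:
        adjacency[source].add(target)
        adjacency[target]  # touching the key registers nodes that have no outgoing edge
        if not directed:
            adjacency[target].add(source)
    return dict(adjacency)

undirected = build_adjacency(edges)
directed = build_adjacency(edges, directed=True)
print(sorted(undirected["C"]))  # neighbours of C when direction is ignored
print(sorted(directed["C"]))    # nodes reached from C in one directed step
```

With direction ignored, C is linked to both A and D; with direction kept, only the C → D step remains.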
5. What is a metabolic network?
http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-13-334
Tools for making this type of network are:
• MetScape (http://metscape.ncibi.org/)
• MetaBox (www.metabox.fiehnlab.ucdavis.edu)
• KEGG spider (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2008-9-12-r179)
• CPDB (http://consensuspathdb.org/)
• MetExplore (http://metexplore.toulouse.inra.fr)
[Figure: network graph with metabolite nodes A, B, C, D and E; edges are labeled "reaction X"]
Not every detected metabolite will be included in this network.
6. Two representations of the EC 2.3.1.35 reaction.
Two ways to convert a reaction to a graph
The KEGG RPAIR database is a manually curated
collection of reactant pairs (substrate-product pairs)
and chemical structure transformation patterns in
enzymatic reactions.
Masanori Arita PNAS 2004;101:1543-1547
Connect only the actual substrate-product pairs and ignore the side products and co-factors.
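A rough sketch of the two conversions, assuming the EC 2.3.1.35 reaction is written as N-acetyl-L-ornithine + L-glutamate <=> L-ornithine + N-acetyl-L-glutamate. The `structural_pairs` list is hand-written for illustration and is not taken from KEGG RPAIR; the function names are hypothetical.

```python
# EC 2.3.1.35, with compounds as plain strings.
reaction = {
    "id": "EC 2.3.1.35",
    "substrates": ["N-acetyl-L-ornithine", "L-glutamate"],
    "products": ["L-ornithine", "N-acetyl-L-glutamate"],
}

def reaction_node_edges(rxn):
    """Way 1: link every substrate and product equally to a node for the reaction."""
    compounds = rxn["substrates"] + rxn["products"]
    return [(compound, rxn["id"]) for compound in compounds]

def reactant_pair_edges(rxn, structural_pairs):
    """Way 2: keep only compound pairs that share a conserved substructure."""
    members = set(rxn["substrates"]) | set(rxn["products"])
    return [(a, b) for a, b in structural_pairs if a in members and b in members]

# ASSUMPTION: pairs annotated by hand for this sketch (shared backbone or acetyl group).
structural_pairs = [
    ("N-acetyl-L-ornithine", "L-ornithine"),           # ornithine backbone
    ("L-glutamate", "N-acetyl-L-glutamate"),           # glutamate backbone
    ("N-acetyl-L-ornithine", "N-acetyl-L-glutamate"),  # transferred acetyl moiety
]

way_1 = reaction_node_edges(reaction)
way_2 = reactant_pair_edges(reaction, structural_pairs)
print(way_1)
print(way_2)
```

In the second representation L-glutamate and L-ornithine stay unconnected, because no atoms move between them in this hand-annotated list.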
7. Biochemical databases provide lists of metabolites and the reactions among them.
Major databases that provide curated lists of biochemical reactions:
http://www.brenda-enzymes.org/ http://www.genome.jp/kegg/ https://metacyc.org/ http://www.reactome.org/
An example of a metabolic reaction:
D-Glucose 6-phosphate + H2O <=> D-Glucose + Orthophosphate
http://www.genome.jp/dbget-bin/www_bget?rn:R00303
[Figure: nodes Glucose-6P and D-Glucose]
~25,000 metabolic reactions are known for various organisms.
Metabolic network in a text format:
Node A    Edge    Node B
Cpd 1     KEGG    Cpd 2
Cpd 3     KEGG    Cpd 4
Cpd 4     KEGG    Cpd 5
Cpd 6     KEGG    Cpd 7
…         KEGG    …
How to make a metabolic network?
[Figure: network graph with metabolite nodes A, B, C, D and E; edges are labeled "reaction X"]
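A minimal sketch of turning a list of compound pairs into the three-column text format (Node A, Edge, Node B). The pairs are placeholders named after the "Cpd" rows of the table, and the helpers `to_edge_rows` and `write_edge_table` are hypothetical names, not part of any tool mentioned in these slides.

```python
import io

# Placeholder pairs; the last one repeats the first in reverse order.
pairs = [
    ("Cpd 1", "Cpd 2"),
    ("Cpd 3", "Cpd 4"),
    ("Cpd 4", "Cpd 5"),
    ("Cpd 6", "Cpd 7"),
    ("Cpd 2", "Cpd 1"),
]

def to_edge_rows(compound_pairs, edge_label="KEGG"):
    """Drop self-pairs and reversed duplicates, then label every remaining edge."""
    seen = set()
    rows = []
    for a, b in compound_pairs:
        key = frozenset((a, b))
        if a == b or key in seen:
            continue
        seen.add(key)
        rows.append((a, edge_label, b))
    return rows

def write_edge_table(rows, handle):
    """Write a tab-separated table with a header line."""
    handle.write("Node A\tEdge\tNode B\n")
    for row in rows:
        handle.write("\t".join(row) + "\n")

rows = to_edge_rows(pairs)
buffer = io.StringIO()
write_edge_table(rows, buffer)
print(buffer.getvalue())
```

Treating A-B and B-A as the same edge is a design choice for an undirected graph; keep both if direction matters.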
8. But not all metabolites have reaction annotations
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-13-99
https://www.nature.com/articles/s41598-017-15231-w
Many significant (p<0.05) compounds are not present in pathway databases.
9. What is a chemical similarity coefficient ?
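[Figure: network graph with metabolite nodes A, B, C, D and E; edges are labeled "Chemical similarity"]
Example: xanthine vs. hypoxanthine, Tanimoto chemical similarity score = 0.917
Tanimoto = AB / (A + B - AB), where A and B are the numbers of substructure features present in each of the two compounds and AB is the number of features they share.
Substructure decomposition for calculations of chemical similarity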
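The formula can be tried on toy data. In this sketch each compound is a set of substructure feature indices; the two sets are invented and are not real fingerprints of any compound.

```python
def tanimoto(features_a, features_b):
    """Tanimoto = AB / (A + B - AB) for two sets of substructure features."""
    a = len(features_a)
    b = len(features_b)
    ab = len(features_a & features_b)
    union = a + b - ab
    return ab / union if union else 0.0

# Invented feature sets: 12 features, 11 features, 11 of them shared.
compound_1 = set(range(12))
compound_2 = set(range(11))

score = tanimoto(compound_1, compound_2)
print(round(score, 3))  # 11 / (12 + 11 - 11) = 11/12
```

Returning 0.0 for two empty feature sets is a convention chosen here to avoid dividing by zero.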
11. What is MetaMapp ?
MetaMapp: mapping and visualizing metabolomic data by integrating information from biochemical pathways and chemical and mass spectral similarity
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-13-99
MetaMapp is an approach to map all the detected metabolites into a network graph that resembles known biochemistry.
Available at metamapp.fiehnlab.ucdavis.edu
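The idea of combining reaction edges with chemical similarity edges can be sketched as below. This is not the MetaMapp code: the 0.7 cutoff, the edge labels "krp" and "chemsim", the toy feature sets and the function `build_combined_edges` are all assumptions made for illustration.

```python
from itertools import combinations

def tanimoto(features_a, features_b):
    """Tanimoto = AB / (A + B - AB) on sets of substructure features."""
    shared = len(features_a & features_b)
    union = len(features_a) + len(features_b) - shared
    return shared / union if union else 0.0

def build_combined_edges(fingerprints, reactant_pairs, threshold=0.7):
    """Reaction edges first, then similarity edges for pairs not already linked."""
    edges = {}
    for a, b in reactant_pairs:
        if a != b and a in fingerprints and b in fingerprints:
            edges[frozenset((a, b))] = "krp"  # hypothetical label for reactant pairs
    for a, b in combinations(sorted(fingerprints), 2):
        key = frozenset((a, b))
        if key not in edges and tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
            edges[key] = "chemsim"  # hypothetical label for similarity edges
    rows = []
    for key, label in edges.items():
        a, b = sorted(key)
        rows.append((a, label, b))
    return sorted(rows)

# Invented feature sets, not real fingerprints.
fingerprints = {
    "glucose": {1, 2, 3, 4},
    "glucose-6-phosphate": {1, 2, 3, 4, 9},
    "xanthine": {5, 6, 7, 8},
    "hypoxanthine": {5, 6, 7},
    "unrelated compound": {20, 21},
}
reactant_pairs = [("glucose-6-phosphate", "glucose")]

combined = build_combined_edges(fingerprints, reactant_pairs)
for row in combined:
    print("\t".join(row))
```

A compound with no reaction annotation can still join the graph through a similarity edge, while one that is dissimilar to everything (the last toy entry) stays unconnected.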
12. How to use MetaMapp?
Input file → MetaMapp R package (OpenCPU version) → Network file + Node attribute file
14. How to use MetaMapp ?
Prepare the input file.
• This is the minimum input
• No duplicate CIDs are allowed
Example: spring_2018_metabolomics_course_metamapp_example.xlsx
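Before uploading, the input table can be checked for duplicate CIDs. A minimal sketch, assuming the table has been exported as tab-separated text and that the identifier column is called PubChem_ID; the column names and the ID values below are placeholders.

```python
import csv
import io
from collections import Counter

def find_duplicate_cids(rows, cid_column="PubChem_ID"):
    """Return the CIDs that appear more than once, ignoring empty cells."""
    cids = [row[cid_column].strip() for row in rows]
    counts = Counter(cid for cid in cids if cid)
    return sorted(cid for cid, n in counts.items() if n > 1)

# Placeholder table; a real input would come from the exported spreadsheet.
example = io.StringIO(
    "PubChem_ID\tCompound_Name\n"
    "1001\tcompound one\n"
    "1002\tcompound two\n"
    "1001\tcompound one (second annotation)\n"
)
rows = list(csv.DictReader(example, delimiter="\t"))
duplicates = find_duplicate_cids(rows)
print(duplicates)
```

Any CID printed here should be merged or removed before the table is submitted.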
15. About the example dataset
• Comparison of the plasma metabolome in non-alcoholic fatty liver disease (NAFLD) subjects and controls.
• HILIC + CSH assays
• ~650 identified metabolites.
• Unpublished (you can write a paper out of it).
17. How to use MetaMapp ?
Obtaining the MetaMapp files.
• Go to http://metamapp.fiehnlab.ucdavis.edu
• Copy and paste your data into the input box, then click the button marked in the screenshot.
• Now click the two download buttons to get chemsim_krp_07.sif and node_attributes_chemsim_krp_07.tsv.
Both files are provided in the example folder for the case study.
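The .sif file is plain text in which each line reads: source node, interaction type, one or more target nodes. A small reader might look like the sketch below; the node IDs and the interaction labels in the sample text are placeholders, not the contents of chemsim_krp_07.sif.

```python
def parse_sif(text):
    """Return a list of (source, interaction, target) edges from SIF text."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Use tabs as the delimiter when present, otherwise any whitespace.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            continue  # a line with only a node name describes an isolated node
        source, interaction, *targets = fields
        edges.extend((source, interaction, target) for target in targets)
    return edges

# Placeholder content with hypothetical IDs and interaction labels.
sample = "1001\tkrp\t1002\n1002\tchemsim\t1003\t1004\n1005\n"
edges = parse_sif(sample)
for edge in edges:
    print(edge)
```

A line that lists several targets expands into one edge per target, which is why the second sample line yields two edges.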
19. What is Cytoscape ?
The most used network visualization and analysis tool.
http://journals.plos.org/plosbiology/article?id=10.1371%2Fjournal.pbio.1001843
http://www.cytoscape.org/
Backed by strong institutions
20. Cytoscape basic features
• Visualize a range of experimental data on a network graph
• Useful graph layout algorithms
• Graph theory calculations
• Easy organization of multiple networks for comparisons
• Faster navigation of large networks
• Filter and query the network
• Contributed plugins
• Works on PC, Mac and Linux systems
The USB drive contains a copy of the Cytoscape software.
21. How to use Cytoscape?
Start the Cytoscape software and click where indicated in the screenshot.
Locate the chemsim_krp_07.sif file and click import.
The file should be in your download folder, or you can use the file in the example folder.
22. Import a new network
To import a new network file into an already running Cytoscape session, select the .sif file.
29. Import your Node Attributes file
The “Key” symbol should be set to PubChem_ID.
30. Import your Node Attributes file
Table panel after importing the node attributes.
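Why the key column matters can be shown with a small sketch: attributes attach to a node only when the key value matches the node name. This assumes the node names in the network are PubChem IDs; the IDs, the extra column names and the helper `attach_node_attributes` are placeholders.

```python
def attach_node_attributes(node_ids, attribute_rows, key_column="PubChem_ID"):
    """Match each network node to its attribute row through the key column."""
    by_key = {str(row[key_column]): row for row in attribute_rows}
    matched = {node: by_key[node] for node in node_ids if node in by_key}
    unmatched = sorted(node for node in node_ids if node not in by_key)
    return matched, unmatched

# Placeholder nodes and attribute rows.
node_ids = ["1001", "1002", "1003"]
attribute_rows = [
    {"PubChem_ID": "1001", "Compound_Name": "compound one", "fold_change": "2.4"},
    {"PubChem_ID": "1002", "Compound_Name": "compound two", "fold_change": "0.6"},
]

matched, unmatched = attach_node_attributes(node_ids, attribute_rows)
print(matched["1001"]["Compound_Name"])
print(unmatched)  # nodes that would show no attributes in the table panel
```

If a different column were chosen as the key, no node would match and the table panel would stay empty for the imported columns.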
31. Data visualization
All visual properties can be accessed in the Style tab.
Node color
Node size
Node label
Node label position
Node label font size
Edge color
Edge width
Network background color
34. Node coloring: “Node Fill color”
Red = higher
Blue = lower
Yellow = no change
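The color rule can be written as a small function that fills a color column in the node attribute file. A sketch under stated assumptions: fold change is a case/control ratio, "no change" means not significant at p < 0.05, and the function name is hypothetical.

```python
def node_fill_color(fold_change, p_value, alpha=0.05):
    """Red = higher, blue = lower, yellow = no change (assumed to mean p >= alpha)."""
    if p_value >= alpha or fold_change == 1.0:
        return "yellow"
    return "red" if fold_change > 1.0 else "blue"

# Placeholder values: (fold change, p-value).
examples = [(2.4, 0.001), (0.5, 0.01), (1.8, 0.30)]
colors = [node_fill_color(fc, p) for fc, p in examples]
print(colors)
```

The resulting column can then be mapped to the fill color property as a discrete mapping.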
35. Change node label
You can choose any label from the node attribute file.
You need to zoom in to see the labels.
Use the scroll wheel to zoom in and out.
38. Change label font size
Show labels for only the significant compounds.
It is a bit clearer.
39. Change node size
Select the values with a left click, then right-click.
Node size rules
FC = 1.0: size = 20
FC > 1 & < 2: size = 60
FC > 2 & < 3: size = 100
FC > 3 & < 5: size = 150
FC > 5: size = 200
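The size rules can be turned into a lookup function. Two things are assumptions here, because the rules do not state them: values exactly on a boundary (2, 3, 5) fall into the lower bin, and a fold change below 1 is sized by its reciprocal so that decreases and increases of the same magnitude look alike.

```python
def node_size(fold_change):
    """Map a fold change to a node size following the binned rules."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    # ASSUMPTION: size by magnitude, so 0.25 is treated like 4.0.
    magnitude = max(fold_change, 1.0 / fold_change)
    if magnitude == 1.0:
        return 20
    if magnitude <= 2:   # ASSUMPTION: boundary values go to the lower bin
        return 60
    if magnitude <= 3:
        return 100
    if magnitude <= 5:
        return 150
    return 200

sizes = [node_size(fc) for fc in (1.0, 1.5, 2.5, 4.0, 8.0, 0.25)]
print(sizes)
```

Precomputing a size column this way avoids setting each value by hand in the mapping dialog.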
40. Intermediate network graph
A bit clearer, and it now shows experimental results, highlighting the clusters that changed.
Not yet publication ready.
41. Node moving
Hold the “Control” key, press the left mouse button and drag to select the area.
Now you can move these nodes.
45. Observations
• Several fatty acids and DAGs are increased in NAFLD plasma.
• Metformin was higher in NAFLD subjects, along with increases in MTA and hypoxanthine.
• PE, CER, PC and CEs were decreased in the NAFLD subjects.
• Stachydrine, an orange juice-related compound, was 5-fold lower in the NAFLD subjects.
• One-carbon and lipid metabolism were altered in the NAFLD subjects.
46. Cluster detection
Check this box
A large network can be divided into smaller modules for better visualization and interpretation.
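The slide does not name the clustering method behind the checkbox, so the sketch below uses connected components as the simplest stand-in for splitting a network into modules. It is not the algorithm Cytoscape applies; the edges are placeholders.

```python
from collections import defaultdict, deque

def connected_modules(edges):
    """Group nodes into modules, where a module is one connected component."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    modules = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        module = []
        while queue:  # breadth-first walk over everything reachable from start
            node = queue.popleft()
            module.append(node)
            for nxt in sorted(neighbours[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        modules.append(sorted(module))
    return modules

# Placeholder edges forming two separate groups.
edges = [("A", "B"), ("B", "C"), ("D", "E")]
modules = connected_modules(edges)
print(modules)
```

Real community detection methods also split densely connected regions inside one component, which this stand-in does not do.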
47. Useful buttons in the menu-bar
Zoom in, Zoom out, Zoom-all, Zoom in selected
You will use these often.
49. Create a sub-network
Select a cluster by holding “Ctrl”, then pressing the left mouse button and dragging a square box around it.
Once the nodes are selected, press Ctrl+N to make a new network.
Show graphics details
50. Node label position
Click on the label position.
This makes the label position property available in the Style tab.
51. Node label position
Select the right-side nodes.
Click on this box to change the property for the selected nodes.
Drag the object box to the left and click OK.
52. Scaling
Increase the scaling factor to remove the overlaps of labels.
Focused view of the fatty acids and DG cluster.
53. Network navigation
Click on a network you want to visualize
Create subnetworks for:
• Sphingolipids
• Cholesteryl esters
• TGs
• Phospholipids
• One-carbon metabolism
56. One-carbon metabolism and NAFLD
The commonly used anti-diabetic agent metformin targets mitochondrial complex I and thus decreases the NAD+/NADH ratio. Metformin also inhibits cancer cell growth, in part through inhibition of biosynthetic metabolism (Griss et al., 2015). Metformin-induced growth inhibition can be partially rescued by supplementation with hypoxanthine and thymidine, products of 1C metabolism (Corominas-Faja et al., 2012). It remains unclear, however, if the impact of metformin on 1C metabolism is clinically significant at normal therapeutic doses.
Increased homocysteine also occurs in liver disease, including non-alcoholic fatty liver disease (NAFLD) (Dai et al., 2016). In animals fed high-fat diets to induce NAFLD, liver 1C metabolism is dysregulated, as evidenced by intrahepatic increases in SAH and free homocysteine and decreases in methionine and the GNMT enzyme (Pacana et al., 2015). At the same time, deletions of 1C enzymes lead to development of liver disease.
59. Conclusions
• Biochemical networks created using KEGG or any other biochemical database did not cover all the identified metabolites.
• MetaMapp combined KEGG reactions and chemical similarity mapping to put all the known metabolites into biochemical modules.
• Cytoscape provided rich functionality to visualize and cluster network graphs.
• Overlaying statistical results on these graphs can highlight the modules that were affected in cases in comparison to controls.
Editor's Notes
Two representations of the EC 2.3.1.35 reaction. In this reaction, the acetyl moiety of N-acetyl l-ornithine is transferred to l-glutamate to form N-acetyl l-glutamate. (Lower Left) In the scheme of Jeong et al. (7), its two substrates and two products are equally linked to the object representing the EC number, irrespective of their structural changes. (Lower Right) In our scheme, conserved substructural moieties, coded by color, are computationally detected, and each link is associated with the information of which atom goes where.