Mapping metabolites against pathway databases

Metabolic pathway mapping
Dinesh Barupal
dinkumar@ucdavis.edu
5.1 & 5.4

[Workflow diagram: metabolomics workflow, WCMC, UC Davis]
STUDY DESIGN
SAMPLING
EXTRACTION
DATA ACQUISITION: Separation, Detection
DATA PROCESSING: File Conversion, Baseline Correction, Peak Detection, Deconvolution, Adduct Annotation, Alignment, Gap Filling
COMPOUND IDENTIFICATION: Molecular Formula ID, Structure ID, MS Library Search, Database Search, In silico Fragmentation
STATISTICS: Normalization, Univariate Analysis (Parametric, Nonparametric), Multivariate Analysis (Unsupervised, Supervised)
BIOLOGICAL INTERPRETATION: Pathway Mapping, Network Enrichment
VALIDATION

Question: Which metabolites from my list can be mapped to metabolic pathways?
Approach: Database mapping
Results:
A. Table of pathways with statistics
B. Maps overlaid with detected metabolites
C. Metabolites not mapped to pathways

Data preparation
C00003
C00016
C00019
C00020
C00025
C00047
C00064
C00082
C00086
C00099
C00105
Select the KEGG database identifiers for compounds with a p-value < 0.05 in your experiment.
To get the KEGG IDs for your compound list, you can use the PubChem Identifier Exchange Service:
https://pubchem.ncbi.nlm.nih.gov/idexchange/idexchange.cgi

Prepare the compound list
Sort the statistical results by p-value
Select the KEGG IDs of compounds with p-value < 0.05 (use the filter in Excel)
Example study: “spring_2018_metabolomics_course_pathway_example.xlsx” in the PathwayAnalysis_example folder
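The same sort-and-filter step can be scripted instead of done in Excel. The following is a minimal sketch that assumes the statistical results are already loaded as a list of records; the field names `kegg_id` and `p_value`, the function name, and the p-values are hypothetical and used only for illustration.

```python
def select_significant_kegg_ids(rows, cutoff=0.05):
    """Return KEGG IDs of compounds with p-value below cutoff, sorted by p-value."""
    # Keep only rows that have a KEGG ID and pass the p-value cutoff.
    kept = [r for r in rows if r["kegg_id"] and r["p_value"] < cutoff]
    kept.sort(key=lambda r: r["p_value"])
    # Drop duplicate IDs while preserving the sorted order.
    seen = set()
    ids = []
    for r in kept:
        if r["kegg_id"] not in seen:
            seen.add(r["kegg_id"])
            ids.append(r["kegg_id"])
    return ids


# Hypothetical statistical results; the p-values are made up for illustration.
example_rows = [
    {"name": "compound 1", "kegg_id": "C00025", "p_value": 0.003},
    {"name": "compound 2", "kegg_id": "C00064", "p_value": 0.210},
    {"name": "compound 3", "kegg_id": "", "p_value": 0.010},
    {"name": "compound 4", "kegg_id": "C00003", "p_value": 0.041},
]
significant_ids = select_significant_kegg_ids(example_rows)
print("\n".join(significant_ids))
```

Compounds without a KEGG ID are dropped here even when they are significant, so it is worth keeping a separate record of them.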

Four major databases for pathway mapping:
• KEGG
• Reactome
• HMDB
• ConsensusPathDB
For a comprehensive list of pathway databases, go to http://www.pathguide.org/

KEGG: Kyoto Encyclopedia of Genes and Genomes
Go to http://www.genome.jp/kegg/
KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances.
KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics,
metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug
development.

http://www.genome.jp/kegg/
Click on KEGG Mapper (at the bottom of the page)

Copy and paste the KEGG IDs in this box
Click on execute
Change organism if needed
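The KEGG Mapper steps above are done in the browser. As a scripted alternative, KEGG also offers a REST interface whose `link` operation returns compound-to-pathway pairs as tab-separated text. The sketch below only builds the request URL and parses a response; it makes no network call. The sample response text is illustrative rather than the output of a live query, `C99999` is a made-up identifier used to show an unmapped compound, and the function names are hypothetical.

```python
from urllib.parse import quote

KEGG_REST = "https://rest.kegg.jp"


def build_link_url(kegg_ids):
    """Build a KEGG REST 'link' URL asking for pathways linked to compounds."""
    entries = "+".join(f"cpd:{cid}" for cid in kegg_ids)
    return f"{KEGG_REST}/link/pathway/{quote(entries, safe='+:')}"


def parse_link_response(text, kegg_ids):
    """Parse lines of the form 'cpd:Cxxxxx<TAB>path:mapNNNNN'.

    Returns (mapping, unmapped): pathways per compound, and the input
    compounds for which no pathway was returned.
    """
    mapping = {cid: [] for cid in kegg_ids}
    for line in text.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        compound = parts[0].replace("cpd:", "")
        pathway = parts[1].replace("path:", "")
        mapping.setdefault(compound, []).append(pathway)
    unmapped = [cid for cid in kegg_ids if not mapping[cid]]
    return mapping, unmapped


query_ids = ["C00025", "C99999"]  # C99999 is a made-up ID
url = build_link_url(query_ids)
# Illustrative response text, not the result of a live query.
sample_response = "cpd:C00025\tpath:map00250\ncpd:C00025\tpath:map00220\n"
mapping, unmapped = parse_link_response(sample_response, query_ids)
print(url)
print(mapping)
print("Not mapped:", unmapped)
```

The `unmapped` list corresponds to result C of the workflow: metabolites that could not be mapped to any pathway.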

KEGG results
Click on a pathway to get the map with compounds overlaid on it.

KEGG global metabolic map
Red dots are the compounds present in the metabolite list.

Green boxes are enzymes mapped to the human genome.
White boxes are enzymes not mapped to the human genome.
A red circle shows the presence of this compound in the metabolite list.

Reactome pathway mapping
www.reactome.org
“REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database.
Our goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of
pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology
and education. Founded in 2001, the Reactome project is led by Lincoln Stein of OICR, Peter D’Eustachio
of NYULMC, Henning Hermjakob of EMBL-EBI, and Guanming Wu of OHSU.”

Copy and paste the KEGG IDs in this box.
The first line should be “#Small_molecules_KEGG”
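A small sketch of preparing that input text, assuming the KEGG IDs are held in a Python list. The header line is the one named on the slide; the function name is hypothetical.

```python
def build_reactome_input(kegg_ids):
    """Return text for the Reactome analysis box: header line, then one ID per line."""
    header = "#Small_molecules_KEGG"
    return "\n".join([header] + list(kegg_ids)) + "\n"


reactome_text = build_reactome_input(["C00003", "C00016", "C00019"])
print(reactome_text, end="")
```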

Zooming in and double-clicking will show the map of a pathway.
Metabolite in the list

MetaboAnalyst
Pathway Mapping
“To provide a user-friendly, web-based analytical pipeline for high-throughput metabolomics
studies. In particular, MetaboAnalyst aims to offer a variety of commonly used procedures for
metabolomic data processing, normalization, multivariate statistical analysis, as well as data
annotation. The current implementation focuses on exploratory statistical analysis, functional
interpretation, and advanced statistics for translational metabolomics studies.”
http://www.metaboanalyst.ca/

Go to www.metaboanalyst.ca
Click here

Paste the KEGG IDs and select the type of identifiers
Click: Submit
Click: Submit
Select a pathway library
Click: Submit

Results
Pathway topology : centrality of compounds

Output table
Pathways with FDR < 0.20 are selected
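For readers who want to reproduce an FDR cutoff on their own pathway p-values, here is a minimal sketch. It assumes the Benjamini-Hochberg procedure, a common choice that the slide does not name, and uses hypothetical pathway names and p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical pathway names and raw p-values, for illustration only.
pathways = ["pathway A", "pathway B", "pathway C", "pathway D"]
raw_p = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(raw_p)
selected = [name for name, q in zip(pathways, fdr) if q < 0.20]
print(selected)
```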

Compound-pathway mapping is not provided

ConsensusPathDB
Metabolic Network Mapping
“ConsensusPathDB-human integrates interaction networks in Homo sapiens including binary and complex
protein-protein, genetic, metabolic, signaling, gene regulatory and drug-target interactions, as well as
biochemical pathways. Data originate from currently 32 public resources for interactions (listed below) and
interactions that we have curated from the literature. The interaction data are integrated in a complementary
manner (avoiding redundancies), resulting in a seamless interaction network containing different types of
interactions.”
http://cpdb.molgen.mpg.de/

Click here
http://cpdb.molgen.mpg.de/
Copy and paste the KEGG ID list here.

Results
Click on “all” to select
Pathway similarity network

Input KEGG IDs are mapped to internal IDs and to various databases.
Click on “all” to select them.
Then click on the show interactions button.

CPDB Interaction Results
Click on “all” to select them and then click on map and visualize interactions.

Key points:
• Not all detected metabolites have KEGG identifiers.
• Not all compounds with KEGG IDs can be mapped to metabolic pathways.
• Pathway maps may not show all the reactions known for a metabolite.
• Several compounds appear in many pathways.
• Online tools can map a metabolite list to pathway maps and perform pathway over-representation analysis for metabolites.
• Online network visualization of a large number of metabolites is not efficient.
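Pathway over-representation analysis, mentioned in the key points, is often computed with a hypergeometric test. The sketch below assumes that test; the slides do not say which statistic each online tool uses, and the counts in the example are hypothetical.

```python
from math import comb


def overrepresentation_p(hits, list_size, pathway_size, background_size):
    """Upper-tail hypergeometric p-value: P(X >= hits).

    hits            compounds from the input list found in the pathway
    list_size       number of compounds in the input list
    pathway_size    number of background compounds in the pathway
    background_size number of compounds in the whole background
    """
    total = comb(background_size, list_size)
    upper = min(list_size, pathway_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 5 input compounds fall in a 10-compound pathway,
# against a background of 100 compounds.
p_example = overrepresentation_p(hits=2, list_size=5, pathway_size=10, background_size=100)
print(f"p = {p_example:.4f}")
```

The choice of background (all compounds in the database versus all compounds the assay could detect) changes the p-value, so it should be reported with the result.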

Major pathway databases
How many pathways do we know so far?
• KEGG: 495
• MetaCyc: 2453
• Reactome: 2000
• HMDB: 613
• WikiPathways: 789
Note: Pathway definitions are manual and vary across different databases.

http://www.ncbi.nlm.nih.gov/pubmed/22591066
Not all metabolites have pathways

http://onlinelibrary.wiley.com/doi/10.1002/9781444339956.ch12/summary
Caution!
Can we trust a genome-derived pathway map ?

Glucose belongs to ?
KEGG DB
Reactome DB
Pathway boundaries are arbitrary

Chemical-similarity maps of small molecules and metals in human blood
Rappaport, Stephen M., et al. "The blood exposome and its role in discovering causes of disease." Environmental Health Perspectives (Online) 122.8 (2014): 769.
Node size = number of pathways
Node size = number of epidemiological studies
• Epidemiologists and systems biologists / biochemists don’t study the same compounds.
• Epidemiologists seem to study compounds that have fewer known pathways.
Does everyone care about pathways?

Which TCA definition?
KEGG
Reactome
SMPDB
MetaCyc

Which p-value cutoff: < 0.05 or < 0.10?
P < 0.10 vs. P < 0.05
Decide the p-value cutoff to get the list of input metabolites.

Metabolite pathway analysis: how about pathway topology?
Should they have the same score?
Topology scoring
If affected compounds are in close biochemical proximity, the pathway should be given a higher weighted score.
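One way to make the proximity idea concrete is to score a pathway by how few reaction steps separate the affected compounds. The sketch below is purely illustrative: the scoring formula (inverse of the mean pairwise shortest-path length), the function names, and the toy pathway with nodes A to E are assumptions, not the method of any particular tool.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the number of steps, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def proximity_score(graph, affected):
    """Hypothetical score: 1 / mean pairwise distance between affected compounds.

    Returns 0.0 when fewer than two compounds are affected or when any
    pair is disconnected.
    """
    pairs = list(combinations(sorted(affected), 2))
    if not pairs:
        return 0.0
    distances = [shortest_path_length(graph, a, b) for a, b in pairs]
    if any(d is None for d in distances):
        return 0.0
    return len(distances) / sum(distances)


# Toy linear pathway A - B - C - D - E, stored as an undirected adjacency list.
toy_pathway = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
close_score = proximity_score(toy_pathway, {"A", "B"})  # neighbors
far_score = proximity_score(toy_pathway, {"A", "E"})    # opposite ends
print(close_score, far_score)
```

Under this scoring, two pathways with the same number of hits no longer receive the same score: the one whose hits are adjacent ranks higher.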

Use pathway maps for focused biochemical visualization
Gene Ontology TCA map (Fahrmann et al. 2017)
- Include gene names or symbols in the map

Use pathway maps for focused biochemical visualization
Pathway maps provided by databases are generally not used in publications. Authors draw their own maps and show their data on them.
https://www.nature.com/articles/cddis2011123

How to create these customized maps?
(Re)read biochemistry books and draw them
OR

How to create these customized maps?
Use biochemical databases
Biochemical databases: KEGG, HMDB, MetaCyc, WikiPathways, Reactome
Metabolites, Reactions, Genes, Proteins
Metabolomics dataset
Pathway layout
Cytoscape + PowerPoint
**No automated way exists to make pretty-looking customized pathway maps.
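For the Cytoscape step, reactions collected from a biochemical database can be written as a simple interaction format (SIF) file, which Cytoscape imports as a network. The sketch below assumes the reactions are available as (substrate, product) pairs; the interaction label `reacts_to` and the function name are arbitrary choices, and the three TCA cycle steps are only an example. The layout and styling still have to be done by hand.

```python
def to_sif(edges, interaction="reacts_to"):
    """Return SIF text: one 'source<TAB>interaction<TAB>target' line per edge."""
    lines = [f"{source}\t{interaction}\t{target}" for source, target in edges]
    return "\n".join(lines) + "\n"


# A few consecutive TCA cycle steps as substrate/product pairs.
tca_edges = [
    ("citrate", "isocitrate"),
    ("isocitrate", "2-oxoglutarate"),
    ("2-oxoglutarate", "succinyl-CoA"),
]
sif_text = to_sif(tca_edges)
print(sif_text, end="")

# To save for import into Cytoscape (the file name is arbitrary):
# with open("tca_example.sif", "w") as handle:
#     handle.write(sif_text)
```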

Next questions :
• How to map all the identified metabolites
into a biochemical network ?
• How to show all the known reactions for a
metabolite ?
• How to map publishable network graphs ?
