dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing Mechanistic and Functional Interpretation of Microbial Metabolites 02/23/2024
Presenter: Pieter Dorrestein, PhD, Professor, Skaggs School of Pharmacy and Pharmaceutical Sciences, Department of Pharmacology and Pediatrics, University of California San Diego
Abstract
In the analysis of organs, volatilome, or biofluids, the microbiome influences 15-70% of detectable mass spectrometry molecules. Typically, only 10% of human untargeted metabolomics data can be assigned a molecular structure, with merely 1-2% traceable to microbial origins. Human microbiomes contribute metabolites through the microbial metabolism of host-derived substances, digestion of food and beverage molecules, and de novo assembly using proteins encoded by genetic elements. Despite the significance of microbiome-derived metabolites to human health, there is no centralized knowledge base for community access. To address this, the "Collaborative Microbial Metabolite Center" (CMMC) leverages expertise in mass spectrometry, microbiome innovation, and the GNPS ecosystem to build a knowledge base. It aims to create a user-accessible microbiome resource, enrich bioactivity knowledge, and facilitate data deposition. The CMMC includes a knowledge base, the MicrobeMASST tool, and health phenotype enrichment workflows; their construction and use will be discussed in this presentation. The use of this ecosystem will be exemplified by the discovery of 20,000 bile acids, many of which were shown to be of microbial origin and linked to diet and IBD.
The top 3 key questions that this resource can answer:
1. How can we leverage the thousands of public metabolomics studies to discover microbial metabolites, their organ distributions, and their phenotypic associations, including health?
2. If one has an unknown molecule, how can one assess which microbes make it when its structure is not known?
3. How can one contribute to the expansion of the knowledgebase on microbial metabolites?
Upcoming webinars schedule: https://dknet.org/about/webinar
1. For this presentation, photos, recordings, and tweets are welcome, even for unpublished work
@Pdorrestein1
Pieter C. Dorrestein
4. Microbial metabolites Influence Human Health
Production of
energy metabolites,
neurotransmitters,
and vitamins
Many more
unknown
molecules
with
unknown
function
5. Anti- and Pro-Inflammatory Microbial Metabolites
Indole-3-Lactate (ILA)1-3
Produced by Bifidobacterium species
Modulates CD4+ T cells in the intestine and reduces inflammation
1Ehrlich, A.M., et al. Indole-3-lactic acid associated with Bifidobacterium-dominated microbiota
significantly decreases inflammation in intestinal epithelial cells. BMC Microbiology (2020).
2Meng, D., et al. Indole-3-lactic acid, a metabolite of tryptophan, secreted by Bifidobacterium
longum subspecies infantis is anti-inflammatory in the immature intestine. Pediatric Research (2020)
3Laursen, M.F., et al. Bifidobacterium species associated with breastfeeding produce aromatic lactic
acids in the infant gut. Nature Microbiology (2021)
Colibactin4-6
Produced by E. coli and other Enterobacteriaceae
Involved in Inflammatory Bowel Disease (IBD) and Colorectal Cancer
4Arthur JC, Perez-Chanona E, Mühlbauer M, et al. Intestinal inflammation targets cancer-inducing activity of the microbiota. Science. (2012)
5Cougnoux A, Dalmasso G, Martinez R, et al. Bacterial genotoxin colibactin promotes colon tumour growth by inducing a senescence-associated secretory phenotype. Gut. (2014)
6Dziubańska-Kusibab, P.J., Berger, H., Battistini, F. et al. Colibactin DNA-damage signature indicates mutational impact in colorectal cancer. Nature Medicine. (2020)
12. Collaborative Microbial Metabolite Center (CMMC)
Microbial Metabolite data and
knowledge of NIH PAR-21-253
grantees.
Curation, storage, and
knowledge enrichment
(Dorrestein, Wang, Bandeira)
Enable data reuse and
analysis.
(Dorrestein, Wang, Knight)
CMMC microbial data portal
upload, knowledge capture,
search and analysis
(Wang, Dorrestein)
Organize annual CMMC
meeting
(Dorrestein)
Documentation and training
(Wang, Dorrestein)
(NIDDK, NIAID, NIDCR, NIEHS, NCI, NICHD)
MK
17. Collaborative Microbial Metabolite Center (CMMC)
Other NIH
Awardees
World-wide
Other data around the world: MIME, NPAtlas, MiBIG, Metabolights, Metabolomics WB, metabolic models, and others
18. CMMC Microbial data upload, knowledge capture and data driven enrichment
Indexed
37. Knowledgebase
Helena Russo Shipei Xing
Upload was created with 1-on-1 meetings with individual R01 grantees
and we expect it to continue to evolve as needs arise.
52. Materials: tinyurl.com/CMMCWS
Metabolomics data (LC-MS/MS)
- Rel intensities
- and in 14% annotations
- Relationships of structural similarity
CMMC community curated
Grown to ~5,000 annotations of ~600 compounds
Call to the community to populate at the end of this workshop
Molecular networking enrichment workflow using CMMCkb.
Watrous PNAS 2012, Wang Nat Biotech 2016
59. Building microbeMASST
> 100 MILLION
MS/MS spectra
- Bacteria
- Fungi
- Human cells
- QC/Blanks
In 2 months, 25 different research groups (> 70 scientists) from 15 different countries have deposited and made publicly available their data
1Wang, Mingxun, et al. "Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking." Nature biotechnology (2016)
2Jarmusch, Alan K., et al. "ReDU: a framework to find and reanalyze public mass spectrometry data." Nature methods (2020)
Simone Ming Wang
Anelize
Robin
60. Taxonomic Tree
> 60,000 unique files
- Bacteria
- Fungi
- Human cells
- QC/Blanks
Extraction of unique taxonomic names or NCBI IDs (2,335)
CHALLENGES
• Most samples do not have associated genomic data
• Controlled metadata not always available
• Several strains and species not present or deposited
• Multi-domain tree
Simone Ming Wang
Anelize
Robin
61. Taxonomic Tree
Extraction of unique taxonomic names or NCBI IDs (2,335)
Blasting – 1,858 unique lineages
Taxonomic tree – 2,849 nodes
Domain 3
Phylum 16
Class 41
Order 109
Family 264
Genus 539
Species 1,336
Strains 541
Bacteria: Actinobacteria, Bacteroidetes, Cyanobacteria, Deinococcus-Thermus, Firmicutes, Fusobacteria, Proteobacteria, Spirochaetes, Verrucomicrobia
Archaea: Halobacteria, Methanobacteria
Eukaryota: Ascomycota, Basidiomycota, Mucoromycota, Zoopagomycota, Chordata
Simone Ming Wang
Anelize
Robin
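The node counts on this slide come from collapsing many lineages into one tree, where each distinct path from the root is one node. A minimal Python sketch of that counting step is shown here under stated assumptions: the four toy lineages, the reduced rank list, and the function name `count_nodes_per_rank` are hypothetical and are not the microbeMASST code. The last lines only check that the per-rank counts quoted on the slide add up to the stated 2,849 nodes.

```python
from collections import defaultdict

# Reduced rank list for the toy example (the slide uses eight ranks).
RANKS = ("domain", "phylum", "genus", "species")

# Hypothetical toy lineages, one tuple per sample, ordered root to leaf.
TOY_LINEAGES = [
    ("Bacteria", "Proteobacteria", "Escherichia", "Escherichia coli"),
    ("Bacteria", "Actinobacteria", "Bifidobacterium", "Bifidobacterium longum"),
    ("Bacteria", "Actinobacteria", "Bifidobacterium", "Bifidobacterium breve"),
    ("Eukaryota", "Ascomycota", "Saccharomyces", "Saccharomyces cerevisiae"),
]


def count_nodes_per_rank(lineages, ranks=RANKS):
    """Count unique tree nodes at each rank.

    A node is identified by its full path from the root, so the same
    name under two different parents would count as two nodes.
    """
    nodes = defaultdict(set)
    for lineage in lineages:
        for depth, rank in enumerate(ranks):
            nodes[rank].add(lineage[: depth + 1])
    return {rank: len(nodes[rank]) for rank in ranks}


toy_counts = count_nodes_per_rank(TOY_LINEAGES)
print(toy_counts)

# Per-rank counts as quoted on the slide.
SLIDE_COUNTS = {
    "Domain": 3, "Phylum": 16, "Class": 41, "Order": 109,
    "Family": 264, "Genus": 539, "Species": 1336, "Strains": 541,
}
slide_total = sum(SLIDE_COUNTS.values())
print(slide_total)
```

Keying nodes by the full path rather than by name alone is a design choice for the sketch; it keeps the count correct even when metadata reuses a name at different places in the tree.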
65. Discovery of thousands of new microbial bile acids – an example of the use of the ecosystem and what is becoming possible with CMMC.
Helena Ipsita Yasin
66. Why are bile acids important?
BILE ACIDS
Liver
Conjugation: Glycine/Taurine
Different conjugations: Phe, Tyr, Leu
Quinn R et al. Global chemical effects of the microbiome include new bile-acid conjugations. Nature 579, 123–129 (2020).
67. How many more bile acid modifications can occur? How can we find them?
68. Data filtering using Mass Spectrometry patterns
MASS SPEC QUERY LANGUAGE
Jarmusch AK et al. A Universal Language for Finding Mass Spectrometry Data Patterns. bioRxiv (2022).
Mingxun Wang
73. QUERY scaninfo(MS2DATA) WHERE
MS2PROD=337.25:TOLERANCEMZ=0.01:INTENSITYPERCENT=5 AND
MS2PROD=319.24:TOLERANCEMZ=0.01:INTENSITYPERCENT=5 AND
MS2PREC=X AND
MS2PROD=X-390.277:TOLERANCEMZ=0.01:INTENSITYPERCENT=5
QUERY TRANSLATION
Returning the scan information on MS2. The following conditions are applied to find scans in the mass spec data.
Finding MS2 peak at m/z 337.25 with a 0.01 m/z tolerance and a minimum percent intensity relative to base peak of 5.0%.
Finding MS2 peak at m/z 319.24 with a 0.01 m/z tolerance and a minimum percent intensity relative to base peak of 5.0%.
Finding MS2 spectra with a precursor m/z X.
Finding MS2 peak at m/z X-390.277 with a 0.01 m/z tolerance and a minimum percent intensity relative to base peak of 5.0%.
BILE ACIDS: MASSQL QUERY DESIGN
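The query translation on this slide can be read as a filter applied to each MS/MS spectrum. The following is a minimal Python sketch of that logic, not the MassQL engine: the function names `has_peak` and `matches_bile_acid_query` and both toy spectra are hypothetical, and only the m/z values, the 0.01 m/z tolerance, and the 5% intensity threshold are taken from the slide.

```python
def has_peak(peaks, target_mz, tol=0.01, min_pct=5.0):
    """Return True if a peak lies within tol of target_mz and has at
    least min_pct percent of the base peak (most intense peak) intensity.

    peaks is a list of (mz, intensity) tuples.
    """
    if not peaks:
        return False
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return False
    return any(
        abs(mz - target_mz) <= tol and 100.0 * intensity / base >= min_pct
        for mz, intensity in peaks
    )


def matches_bile_acid_query(precursor_mz, peaks):
    """Apply the three product-ion conditions from the slide.

    X in the query is the precursor m/z, so the last condition looks
    for a product ion at (precursor m/z - 390.277).
    """
    return (
        has_peak(peaks, 337.25)
        and has_peak(peaks, 319.24)
        and has_peak(peaks, precursor_mz - 390.277)
    )


# Hypothetical toy spectra for illustration only (not real data).
toy_peaks = [(75.04, 20.0), (319.24, 40.0), (337.25, 100.0)]
weak_peaks = [(75.04, 3.0), (319.24, 40.0), (337.25, 100.0)]

print(matches_bile_acid_query(465.32, toy_peaks))
print(matches_bile_acid_query(465.32, weak_peaks))
```

In the second toy spectrum the peak near X-390.277 falls below 5% of the base peak, so the spectrum is rejected even though the two fixed product ions are present.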
75. ~1.2 billion MS/MS spectra in GNPS/MassIVE
594,365 total MS/MS spectra from MassQL
21,483 unique MS/MS spectra from MassQL
5,576 delta masses (candidate modifications)
BILE ACIDS: MASSQL QUERY RESULTS
Precursor ion: m/z 465.32
Delta mass: m/z 57.02 (Gly)
Free bile acid: m/z 408.29
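The delta mass idea on this slide is simple arithmetic: subtract the free bile acid value from the precursor value and compare the difference with known residue masses. A minimal Python sketch follows, assuming a two-entry lookup table; the names `RESIDUE_MASSES`, `delta_mass`, and `candidate_modifications` and the 0.02 tolerance are hypothetical choices for illustration. With the two-decimal values printed on the slide, the difference comes out as 57.03, which is within rounding of the 57.02 shown for glycine.

```python
# Monoisotopic residue masses (conjugate minus H2O); illustrative subset only.
RESIDUE_MASSES = {
    "Gly": 57.02146,   # glycine, C2H3NO
    "Tau": 107.00410,  # taurine, C2H5NO2S
}


def delta_mass(precursor_mz, core_mz):
    """Difference between the conjugate and the free bile acid, to 2 decimals."""
    return round(precursor_mz - core_mz, 2)


def candidate_modifications(delta, tolerance=0.02):
    """Names of residues whose mass lies within tolerance of delta."""
    return [
        name
        for name, mass in RESIDUE_MASSES.items()
        if abs(delta - mass) <= tolerance
    ]


delta = delta_mass(465.32, 408.29)
candidates = candidate_modifications(delta)
print(delta, candidates)
```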
76. BILE ACIDS REACH DISTANT ORGANS: RODENTS
Mass Spectrometry Search Tool + controlled vocabulary metadata
Wang M et al. Mass spectrometry searches using MASST. Nat Biotechnol 38, 23–26 (2020).
Jarmusch AK et al. ReDU: a framework to find and reanalyze public mass spectrometry data. Nat Methods 17, 901–904 (2020).
77. BILE ACIDS REACH DISTANT ORGANS: HUMANS
Mass Spectrometry Search Tool + controlled vocabulary metadata
Wang M et al. Mass spectrometry searches using MASST. Nat Biotechnol 38, 23–26 (2020).
Jarmusch AK et al. ReDU: a framework to find and reanalyze public mass spectrometry data. Nat Methods 17, 901–904 (2020).
81. Reverse Metabolomics Reveals Disease Associations
Mining the Knowledge Graph of Public data
Gentry Nature 2024
[Figure: amino acid conjugation (Ala, Arg, Asn, Asp, Cit, Cys, Gln, Glu, Gly, His, Leu, Lys, Met, Orn, Phe, Ser, Tau, Thr, Trp, Tyr, Val) of dihydroxy BA and trihydroxy BA by health condition (no disease reported, Chagas disease, circadian disorders, sleep deprivation, CD, UC, IBD, obesity, disease NOS). Axes: % of samples per condition (0-100); proportion of samples (0.0-0.8).]
Emily Gentry
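The quantity plotted on this slide, "% of samples per condition", can be sketched as a simple tally over public samples labelled with controlled vocabulary metadata. The Python below is a minimal illustration under stated assumptions: the sample list is made-up toy data, not results from the study, and the function name `percent_detected` is hypothetical.

```python
from collections import Counter

# Hypothetical toy data: (health condition, was the bile acid spectrum matched?)
# These values are invented for illustration and are not from the study.
TOY_SAMPLES = [
    ("CD", True), ("CD", True), ("CD", True), ("CD", False),
    ("no disease reported", True), ("no disease reported", False),
    ("no disease reported", False), ("no disease reported", False),
]


def percent_detected(samples):
    """Percentage of samples per condition in which a match was found."""
    totals = Counter()
    hits = Counter()
    for condition, detected in samples:
        totals[condition] += 1
        if detected:
            hits[condition] += 1
    return {cond: 100.0 * hits[cond] / totals[cond] for cond in totals}


percentages = percent_detected(TOY_SAMPLES)
print(percentages)
```

Normalizing by the number of samples in each condition matters because public repositories hold very different numbers of samples per condition; raw match counts would mostly reflect study size.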
84. Emily Gentry
Reverse Metabolomics Reveals Disease Associations
Mining the Knowledge Graph of Public data.
Pediatric IBD cohort
Gentry Nature 2024
85. IBD Association is Validated in Independent Cohorts
(and now have retention time match as well)
iHMP2 cohort: CD (n=265), UC (n=146), non-IBD (n=135)
Gentry Nature 2024
89. The community is beginning to show activities for these bile acids.
Accepted to Nature
[Figure labels: FXR, TGR5, PXR, AHR, PPARα, PPARγ, PPARδ, VDR, CAR, germination, stem cell, intestinal barrier, brain injury]
91. Summary and some additional take home messages.
-We have a community curation system in place.
-We encourage data deposition. The more data and knowledge there is, the easier it will become to mechanistically interpret the microbiome. If you want to contribute or be a beta-tester for tools, let us know.
-The CMMC website has all the information: https://cmmc.gnps2.org/
-All the resources are open access and FAIR compliant.
-We give hands-on CMMC data science workshops (the past three are recorded, and documentation is available on the CMMC website).
-We need to rewrite textbooks associated with microbial bile acids and their roles in biology.