Building a Model Organism Metabolome Database
Christoph Steinbeck
European Bioinformatics Institute, Cambridge, UK
Friedrich-Schiller-University, Jena, Germany
The European Bioinformatics Institute (EBI)
Friedrich-Schiller-University Jena, Germany
Founded 1558
European Bioinformatics Institute (EBI)
• Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes, European Genome-phenome Archive, Metagenomics portal
• Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology
• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
• Gene, protein & metabolite expression
• Protein sequences, families & motifs
• Chemical biology
• Reactions, interactions & pathways
• Systems
• 8.7 million eukaryotic species on Earth (± 1.3 million)
• 1.2 million species identified and classified
• 3,000 to 4,000 complete species genomes sequenced
What about completed metabolomes?
Species Metabolomes and How Little We Know (2008 vs 2016)
[Figure: bar chart on a logarithmic axis from 1 to 100,000,000. Categories: Metabolites in Human; Metabolites in Microbes; Compounds in ChEBI (V154); Metabolites in HMDB (V3.6); Metabolites in Plants; Compounds in ChEMBL (V23); Compounds in PubChem (V8-2017). Annotated values: 80,000; 200,000; 2,000,000.]
Metabolomics has taken off worldwide
The way things go
(for those fields I observed in Molecular Biology)
• Field emerges
• Scattered ecosystem of academic specialist databases appears
• Field matures
• Efforts on open data formats and minimum information (MI) standards happen
• Long-term maintained databases are founded by large data providers (at NCBI, EBI, DDBJ, …)
• Global data exchange networks emerge
Experimental Repository
Reference Layer
Chemistry Spectroscopy Biology
Analysis Tools
Primary Literature
Primary data and Meta-Data, Spectra, Protocols, Synopses, ...
MetaboLights Database at the EBI
A collaborative enterprise
Labs around the world send us their data and we…
• Archive it
• Classify it
• Share it with other data providers
• Analyse it
• …provide tools to help researchers use it
Data at the EBI can be used freely by anyone for any purpose
EBI databases are supported over decades
Data growth in EBI data repositories
3-month doubling time for metabolomics
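To put the 3-month doubling time into perspective, here is a minimal arithmetic sketch. It assumes the doubling time stays constant, which the slide does not claim, and the function name growth_factor is purely illustrative.

```python
def growth_factor(elapsed_months: float, doubling_months: float = 3.0) -> float:
    """Multiplicative growth after `elapsed_months`, assuming a constant
    doubling time (an assumption made only for this illustration)."""
    return 2.0 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    # Print the growth factor after a quarter, half a year, one and two years.
    for months in (3, 6, 12, 24):
        print(f"{months:>2} months: x{growth_factor(months):g}")
```

Under that assumption, a repository doubling every three months grows 16-fold in one year and 256-fold in two.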
MetaboLights is now the recommended repository for the Nature journals, EMBO Journal, PLOS journals, the Metabolomics journal and others
Funded through European Commission COSMOS Grant EC312941
Sansone,… Steinbeck et al. (2012) Toward interoperable bioscience data. Nature Genetics, 44, 121–126.
Controlled Vocabularies
Ontologies
Minimum Information Standards
Repository Entry
Reference Layer
1000 species up from Spring 2016
30 most annotated metabolomes in MetaboLights
Metabolome sizes in MetaboLights on a log scale
Number of studies in MetaboLights per species
Lessons learned from this excursion into bioinformatics
• You can get people to deposit data by convincing publishers to require it.
• This is easier once it has become a community meme.
• Publishers, learned societies and funders actually want this to happen, but are afraid.
• If one starts, others follow.
• It takes about six years to get there, once others have done it.
Slides at
https://www.slideshare.net/csteinbeck
Funding
Thanks for your attention
