Building a Model Organism Metabolome Database

Christoph Steinbeck
European Bioinformatics Institute, Cambridge, UK
Friedrich-Schiller-University, Jena, Germany

The European Bioinformatics Institute (EBI)

Friedrich-Schiller-University Jena, Germany
Founded 1558

European Bioinformatics Institute (EBI)
• Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes, European Genome-phenome Archive, Metagenomics portal
• Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology
• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
• Gene, protein & metabolite expression
• Protein sequences, families & motifs
• Chemical biology
• Reactions, interactions & pathways
• Systems

• 8.7 million (± 1.3 million) eukaryotic species on Earth
• 1.2 million species identified and classified
• 3,000 to 4,000 complete species genomes sequenced
What about completed metabolomes?

Species Metabolomes and How Little We Know (2008 vs 2016)
[Figure: chart on a log scale (1 to 100,000,000) comparing Metabolites in Human, Metabolites in Microbes, Compounds in ChEBI V154, Metabolites in HMDB V3.6, Metabolites in Plants, Compounds in ChEMBL V23 and Compounds in PubChem V8-2017. Values called out on successive builds of the slide: 80,000; 200,000; 2,000,000.]

Metabolomics has taken off worldwide

The way things go
(for those fields I observed in Molecular Biology)
• Field emerges
• Scattered ecosystem of academic specialist databases appears
• Field matures
• Efforts on open data formats and minimum information (MI) standards get under way
• Long-term maintained databases are founded by large data providers (at NCBI, EBI, DDBJ, …)
• Global data exchange networks emerge

MetaboLights Database at the EBI
• Experimental Repository: primary data and metadata, spectra, protocols, synopses, ...
• Reference Layer: chemistry, spectroscopy, biology
• Analysis Tools
• Primary Literature

Labs around the world send us their data and we…
• Archive it
• Classify it
• Share it with other data providers
• Analyse it
• …provide tools to help researchers use it
A collaborative enterprise

Data at the EBI can be used freely by anyone for any purpose

MetaboLights Repository at the EBI
EBI databases are supported over decades

Data growth in EBI data repositories

3-month doubling time for Metabolomics
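
To put the 3-month figure in perspective, here is a minimal sketch of what a constant doubling time implies. It assumes idealised exponential growth, which the slide itself does not claim, and the function name growth_factor is made up for illustration.

```python
def growth_factor(doubling_months: float, elapsed_months: float) -> float:
    """Factor by which a quantity grows, assuming a constant doubling time."""
    if doubling_months <= 0:
        raise ValueError("doubling time must be positive")
    # One doubling per `doubling_months`, so the exponent is the number of doublings.
    return 2.0 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    # Assumption: the 3-month doubling time holds steady over the whole period.
    for months in (3, 6, 12):
        print(f"after {months} months: x{growth_factor(3, months):g}")
```

Under that assumption, a 3-month doubling time means four doublings per year, that is, a 16-fold increase in twelve months.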

MetaboLights is now the recommended repository for the Nature journals, The EMBO Journal, the PLOS journals, the journal Metabolomics and others

Funded through European Commission COSMOS Grant EC312941

Sansone,… Steinbeck et al. (2012)
Toward interoperable bioscience data.
Nature Genetics, 44, 121–126.

Controlled Vocabularies
Ontologies

Minimum Information Standards

Reference Layer
1000 species up from Spring 2016

30 most annotated metabolomes in MetaboLights

Metabolome sizes in MetaboLights on a log scale

Number of Studies in MetaboLights per Species

Lessons learned from this excursion into bioinformatics
• You can get people to deposit data by convincing publishers to require it.
• This is easier once it has become a community meme.
• Publishers, learned societies and funders actually want this to happen, but are afraid.
• If one starts, others follow.
• It takes about six years to get there, once others have done it.

Slides at https://www.slideshare.net/csteinbeck
Funding
