These slides provide a guide to using the METASPACE platform for annotating metabolites in high resolving power imaging mass spectrometry datasets. They describe
* the science behind molecular annotation
* how to use our web application to upload data and to browse, interpret and export annotations from the platform.
3. Training Overview
Part 1: Introduction
● Learning Outcomes
● Introduction to the METASPACE project
● Metabolite annotation in HR imaging MS (Bioinformatics)
● Overview of the annotation engine
Part 2: METASPACE Annotation Platform Tutorial
● Learning Outcomes
● Data requirements
● Data Submission
● Annotation Browsing
● Interpretation
Part 3: Export to imzML
● FTICR (Bruker)
● Orbitrap (Thermo)
● Other vendors
5. What we hope you will learn today
● Ins and outs of metabolite annotation in HR imaging MS
● Bioinformatics we developed for this problem
○ Metabolite Signal Match (MSM) score
○ False Discovery Rate estimation
○ FDR-controlled annotation
● The online engine we implemented
○ How to prepare data for submission to our service
○ How to submit your data
8. Outline
● Inputs (data and metadata)
● Online Software
● Data Submission
● Annotation Browsing
● Use Cases
a. mouse brain, MALDI-FTICR (U Rennes 1)
b. human colorectal tumor, DESI-Orbitrap (ICL)
14. Example 2
Human colorectal tumor (DESI-Orbitrap)
Data provided by James McKenzie and Zoltan Takats, Imperial College London
Filter different datasets
See data details
View metadata
15. Tutorial on how to use the METASPACE engine and molecular knowledgebase
19. Data Requirements
Data Format
- imzML
Centroided
- vendor preferred
- http://metaspace2020.eu/imzml
http://imzml.org/wp/introduction/
20. Customised Processing
Processing is tailored to your data!
- Technical metadata
- Resolving power
- isotope prediction
- Polarity
- adducts
Figure: predicted isotope pattern of [C41H78NO7P+K]+ at R200=70K and at R200=280K
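To make the link between metadata and processing concrete, here is a minimal sketch of how the adduct and the resolving power enter a calculation. It computes the monoisotopic m/z of the example ion [C41H78NO7P+K]+ and the peak width implied by the two resolving powers. It assumes that R200 denotes the resolving power at m/z 200 and uses the common definition R = (m/z) / FWHM. The function names are illustrative and are not part of the METASPACE engine, and the sketch does not model how resolving power changes with m/z for a given analyzer.

```python
# Monoisotopic masses in unified atomic mass units (standard reference values).
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "K": 38.96370649,
}
ELECTRON_MASS = 0.00054858


def ion_mz(composition, charge=1):
    """m/z of a positive ion with the given elemental composition."""
    neutral = sum(MONOISOTOPIC_MASS[el] * n for el, n in composition.items())
    # A positive ion has lost `charge` electrons.
    return (neutral - charge * ELECTRON_MASS) / charge


def fwhm(mz, resolving_power):
    """Peak width (full width at half maximum) implied by R = (m/z) / FWHM."""
    return mz / resolving_power


# [C41H78NO7P+K]+ : the molecule C41H78NO7P plus a potassium adduct.
example_ion = {"C": 41, "H": 78, "N": 1, "O": 7, "P": 1, "K": 1}
print(f"m/z = {ion_mz(example_ion):.4f}")
for r200 in (70_000, 280_000):
    print(f"R200 = {r200}: FWHM at m/z 200 = {fwhm(200.0, r200):.5f}")
```

A fourfold higher resolving power gives a fourfold narrower peak, which is why the predicted isotope pattern depends on the resolving power reported in the metadata.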
21. Data Requirements
Your responsibility:
- Data is processed ‘as is’
- Check metadata is correct
- Report resolving power accurately (check within data-set)
- Low numbers of annotations often correspond to poor quality mass spectra
- Calibration inaccuracy
- Lock-mass errors
23. Data upload
1. Follow conversion instructions for your instrument
2. Select the centroided files, .imzML and .ibd
3. The dataset will be copied to the cloud storage (accessible only to our team)
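Since an upload consists of two files, a quick local check before submission can save a failed attempt. The sketch below assumes that the .imzML and .ibd files sit in the same folder and share a base name; the helper name find_upload_pair is hypothetical and is not part of any METASPACE tool.

```python
from pathlib import Path
import tempfile


def find_upload_pair(imzml_path):
    """Return (imzML path, ibd path) if both files exist, otherwise raise."""
    imzml = Path(imzml_path)
    if imzml.suffix.lower() != ".imzml":
        raise ValueError(f"expected an .imzML file, got {imzml.name}")
    # Assumption: the binary companion has the same base name.
    ibd = imzml.with_suffix(".ibd")
    missing = [p.name for p in (imzml, ibd) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing file(s): " + ", ".join(missing))
    return imzml, ibd


# Demonstration with two empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    imzml_file = Path(tmp) / "sample.imzML"
    imzml_file.touch()
    imzml_file.with_suffix(".ibd").touch()
    pair = find_upload_pair(imzml_file)
    print([p.name for p in pair])
```

The check only confirms that both files are present; it does not verify that the spectra are centroided.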
24. Metadata form
● Fields are auto-completed
● Please fill in the form truthfully
○ Don’t want to disclose? Just put ‘N/A’
● Click (top right)
○ Enabled once the files have finished uploading
28. Sorting/filtering annotations
Click on column headers to sort
Add as many filters as you need
Quickly add a filter by hovering over a cell and clicking the icon
29. Molecule search
Search by name (partial name search)
Or by molecular formula (exact match only)
Click to edit
30. Details for highlighted annotation
molecule distribution
(principal isotope)
Putative metabolite entries
from the database
31. Visual insight into MSM score assignment
Exact m/z of each
ion image
Click and drag
to zoom
Ion images for each
isotope peak
Isotopic patterns
Blue: theoretical abundance
(at instrument resolving power)
Red: measured image intensity
35. Step-by-step search
Export to CSV will save the current annotations table.
Changing the filters will change which annotations
are exported
Always export annotations for comparison together
(so they are at the same FDR)
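Once two annotation tables have been exported with the same filters and FDR, they can be compared outside the web application. The sketch below is only an illustration: the column names formula, adduct, msm and fdr, the handling of comment lines starting with #, and the rows themselves are assumptions made up for the example, so check them against the header of your own export.

```python
import csv


def read_annotations(csv_text):
    """Return the set of (formula, adduct) pairs in an exported table."""
    # Assumption: any lines starting with '#' are comments, not data.
    lines = [ln for ln in csv_text.splitlines() if ln and not ln.startswith("#")]
    return {(row["formula"], row["adduct"]) for row in csv.DictReader(lines)}


# Toy tables standing in for two exports made at the same FDR level.
dataset_a = """formula,adduct,msm,fdr
C41H78NO7P,+K,0.91,0.1
C42H82NO8P,+Na,0.85,0.1
"""
dataset_b = """formula,adduct,msm,fdr
C41H78NO7P,+K,0.88,0.1
C44H84NO8P,+H,0.80,0.1
"""

a = read_annotations(dataset_a)
b = read_annotations(dataset_b)
print("shared:", sorted(a & b))
print("only in A:", sorted(a - b))
print("only in B:", sorted(b - a))
```

For real files, read the text with Path(...).read_text() and pass it to read_annotations.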
36. Results Browsing Summary
1. Choose database
2. Choose data-set
3. Add molecule filter and type ‘PC’
a. molecular class filter
4. Type ‘PC(16:0/18:0)’
a. single metabolite filter
5. Select row of table
a. single ion filter
6. Simple comparison of spatial distributions
between adducts
7. Export of annotations to CSV
Also possible
● Filter by m/z
● Formula search
● Comparison across datasets
38. FDR Controlled Annotation
False Discovery Rate - the fraction of incorrect annotations
Control - request a set of annotations at a fixed estimated
FDR
Setting the level:
- Adjust the number of molecules for follow-up analysis
- When only limited numbers of molecules can be reviewed, adjust the FDR so that fewer/greater numbers of molecules are annotated
- Compare annotations between datasets
- A principled way of selecting molecules to compare between datasets
Figure: true annotations and false discoveries ranked by MSM score, with the example level FDR = 0.2
FDR = nFalse / (nFalse + nTrue)
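The idea of requesting annotations at a fixed FDR can be sketched on toy data. The example below follows the definition above (FDR is the fraction of incorrect annotations) and returns the largest top-scoring set whose FDR stays within the chosen level. The scores and true/false labels are invented for illustration; on real data the labels are unknown, which is why the platform works with an estimated FDR. This is not the engine's implementation.

```python
def fdr(n_true, n_false):
    """Fraction of incorrect annotations among all annotations."""
    total = n_true + n_false
    return n_false / total if total else 0.0


def annotations_at_fdr(scored, level):
    """Largest top-scoring set whose FDR does not exceed `level`.

    `scored` is a list of (msm_score, is_true) pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    best = n_true = n_false = 0
    for k, (_, is_true) in enumerate(ranked, start=1):
        if is_true:
            n_true += 1
        else:
            n_false += 1
        # Remember the longest prefix that still satisfies the level.
        if fdr(n_true, n_false) <= level:
            best = k
    return ranked[:best]


# Toy MSM scores with made-up truth labels.
toy = [(0.95, True), (0.90, True), (0.80, True), (0.70, False),
       (0.60, True), (0.50, False), (0.40, False)]

for level in (0.1, 0.2):
    print(f"FDR <= {level}: {len(annotations_at_fdr(toy, level))} annotations")
```

Raising the level admits more annotations at the cost of a larger expected share of false discoveries, which is the trade-off described under "Setting the level".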
39. Choice of metabolite database
● synthesized/recorded: 88M (CAS registry)
● biologically occurring/active: 50M (PubChem compounds)
● single biological system: 40K (HMDB)
● sample specific: 1K (LC-MS)
40. Choice of metabolite database
Impacts search and False-Discovery-Rate estimation
● Use one that’s relevant
● Larger database
○ more false hits --> fewer annotations at a fixed FDR
● Different databases give different annotations
○ even for molecules in both databases due to FDR control
○ for data-set comparison, use the same database
41. Annotating at level of molecular formula
● Possibility of multiple metabolites per sum formula
○ webapp shows all hits from the database search (learn the ambiguity!)
○ other databases can be searched (e.g. PubChem)
○ use enrichment analysis to get biological leads
● Use an orthogonal technique for reporting individual metabolites
○ not directly integrated (yet)
○ use web-app results to help target MS/MS studies (e.g. purchase of standards)
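The ambiguity of formula-level annotation is easy to see by grouping database entries by sum formula. The sketch below uses a tiny hand-made list of well-known isomers rather than a real database, and the function name is illustrative only.

```python
from collections import defaultdict

# A tiny hand-made "database" of (name, sum formula) entries.
candidates = [
    ("glucose", "C6H12O6"),
    ("fructose", "C6H12O6"),
    ("galactose", "C6H12O6"),
    ("leucine", "C6H13NO2"),
    ("isoleucine", "C6H13NO2"),
    ("citric acid", "C6H8O7"),
]


def group_by_formula(entries):
    """Map each sum formula to the names of all entries that share it."""
    groups = defaultdict(list)
    for name, formula in entries:
        groups[formula].append(name)
    return dict(groups)


groups = group_by_formula(candidates)
# Formulas with more than one candidate cannot be resolved by formula alone.
ambiguous = sorted(f for f, names in groups.items() if len(names) > 1)
for formula in ambiguous:
    print(formula, "->", ", ".join(groups[formula]))
```

An annotation of C6H12O6 therefore points to several candidate metabolites, and an orthogonal technique is needed to report a single one.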
42. Reporting Results
● The METASPACE platform putatively annotates* molecular formulas along with several candidate metabolites
● A set of annotations should be reported along with the FDR threshold selected.
○ e.g. “Molecular annotation was performed using the METASPACE annotation engine (Palmer et al, Nature Methods 2016). 150 molecules were annotated against the LipidMAPS database at 10% FDR. Results are publicly available at annotate.metaspace2020.eu”
● The export function of the website delivers a spreadsheet that can be included as supporting information for any publication.
* Metabolomics Standards Initiative identification levels, Sumner et al, 2007, Metabolomics
43. Learning Summary
● Preparing data for submission
○ imzML export
○ metadata
● Submitting data
○ web-app, upload
● Browsing knowledgebase
○ web-app, annotations
How to get help?
● METASPACE team:
○ web: metaspace2020.eu
○ email: contact@metaspace2020.eu
○ twitter: @metaspace2020
○ source code: https://github.com/METASPACE2020/
● FTICR data conversion
○ SCiLS: support@scils.de
● Orbitrap data conversion
46. Export to METASPACE
● Export your centroided high-resolution spectra in the imzML format
● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b
● Best results in METASPACE are obtained when a peak list is used for centroiding
● Two different Bruker data formats
○ SQLite peak list data: Peak list provided during import
○ FT-ICR profile data: Generate a peak list after import
47. Create imzML file for METASPACE
● In the objects tab, click the export symbol
of the region to be exported and select
“Export to METASPACE”
● The Export Spectra dialog opens
● Set your normalization of choice
● Select your peak list of choice
for example “Imported Peaks” in case of
SQLite
● Provide your scan polarity
● Click OK to save imzML file
48. SQLite peak list data
● Data must have been acquired with on-the-fly centroid detection
i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set
● In SCiLS Lab a peak list “Imported peaks” is available, selecting the most frequent peaks
By default, all peaks that appear in more than 1% of spectra
49. FT-ICR profile data
● Older Solarix Files do not directly contain a peak
list to perform centroiding
● Create peak list with Data Analysis
SCiLS Lab Help Section 7.4
● Use METASPACE tool for peak finding
https://spatialmetabolomics.github.io/centroidize/
● Use other external tools (mMass, …)
● Import the external peak list into SCiLS Lab
File > Import > m/z intervals from CSV or Clipboard
50. Use METASPACE tool for peak finding
● Select the overview spectrum CSV exported from SCiLS
● Upload CSV file to METASPACE tool
● Copy values to clipboard
● Use File > Import > m/z intervals from CSV
51. Upload imzML files to METASPACE
● Go to http://annotate.metaspace2020.eu/#/upload
52. Export into imzML: Orbitrap data (.raw)
Instructions: metaspace2020.eu/imzML
Software tools:
imageQuest / raw-converter
- Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-)
imzmlConverter
- Recommended for: DESI/flowProbe with separate files per row
Recommended for bioinformaticians: pyimzML (Python parser)
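For bioinformaticians, a short sketch of reading an imzML file with pyimzML may be a useful starting point. It relies on the ImzMLParser class with its coordinates attribute and getspectrum method; the helper names load_parser and spectrum_summary are illustrative, and the stand-in parser below exists only so that the example runs without the package or a data file.

```python
def load_parser(imzml_path):
    """Open an imzML file with pyimzML (needs `pip install pyimzML`)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from pyimzml.ImzMLParser import ImzMLParser
    return ImzMLParser(imzml_path)


def spectrum_summary(parser):
    """Yield (x, y, number of peaks, m/z of the most intense peak) per pixel."""
    for idx, coords in enumerate(parser.coordinates):
        x, y = coords[0], coords[1]
        mzs, intensities = parser.getspectrum(idx)
        if len(mzs) == 0:
            yield (x, y, 0, None)
            continue
        top = max(range(len(mzs)), key=lambda j: intensities[j])
        yield (x, y, len(mzs), float(mzs[top]))


class FakeParser:
    """Stand-in with the same two members used above, holding made-up spectra."""
    coordinates = [(1, 1, 1), (2, 1, 1)]
    _spectra = [([100.0, 200.5], [5.0, 9.0]), ([150.0], [3.0])]

    def getspectrum(self, idx):
        return self._spectra[idx]


for row in spectrum_summary(FakeParser()):
    print(row)
```

With real data, replace FakeParser() by load_parser("your_file.imzML"), where the file name is a placeholder. A very large number of peaks per pixel may indicate that the export was not centroided.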
56. Acknowledgments
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement № 634402.
Example data provided by:
University of Rennes 1
Regis Lavigne
Charles Pineau
EMBL
Ksenija Radic
Alexandra Koumoutsi
Andrew Palmer
Imperial College London
James McKenzie
Zoltan Takats
METASPACE R&D team at EMBL
Theodore Alexandrov
Vitaly Kovalev
Artem Tarasov
Andrew Palmer
Dominik Fay
SCiLS imzML export
Dennis Trede
Jan Hendrik Kobarg