These are the slides from the METASPACE Training Course given at OurCon'16.
METASPACE is a European Horizon2020 project on Bioinformatics for Spatial Metabolomics. Specifically, it aims at developing an engine for metabolite annotation of HR imaging mass spectrometry data. The project was funded in the Personalizing Health and Care program for 3 years (2015-2018) and is coordinated by the European Molecular Biology Laboratory.
The slides are organized into two parts.
Part 1: Introduction to the project, the bioinformatics behind it, and the online engine.
Part 2: Step-by-step tutorial on how to use the online engine for annotating metabolites from HR imaging mass spectrometry data.
For more information on METASPACE, please visit the project website http://metaspace2020.eu, twitter @metaspace2020, or email us at contact@metaspace2020.eu.
ARCHIVED: new version available - METASPACE Step by Step guide (METASPACE)
These slides provide a guide to using the METASPACE platform for annotating metabolites in high-resolving-power imaging mass spectrometry datasets. They describe:
* the science behind molecular annotation
* how to use our web application to upload, browse, interpret and export annotations from the platform.
These slides present the bioinformatics for metabolite annotation of HR imaging MS. The bioinformatics was developed in the framework of the METASPACE project.
METASPACE is a European Horizon2020 project on Bioinformatics for Spatial Metabolomics. Specifically, it aims at developing an engine for metabolite annotation of HR imaging mass spectrometry data. The project was funded in the Personalizing Health and Care program for 3 years (2015-2018) and is coordinated by the European Molecular Biology Laboratory.
The presentation was given at the METASPACE Training Course at OurCon'17 on 25.10.2017.
For more information on METASPACE, please visit the project website http://metaspace2020.eu, twitter @metaspace2020, or email us at contact@metaspace2020.eu.
The document discusses the ArrayExpress team's experience with MAGE-TAB, a tab-delimited format for representing microarray data. Key points include:
- ArrayExpress has integrated MAGE-TAB into its data acquisition and plans to convert all existing data to MAGE-TAB.
- MAGE-TAB allows for more efficient curation and user updates compared to previous formats.
- ArrayExpress is working to extend MAGE-TAB to represent additional data types beyond microarrays and developing ontologies to support MAGE-TAB.
- Tools and validation procedures for working with MAGE-TAB are being made publicly available.
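Since MAGE-TAB's sample-and-data-relationship (SDRF) component is a plain tab-delimited table, it can be read with ordinary TSV tooling. A minimal Python sketch, where the column names and rows are illustrative rather than an official MAGE-TAB header set:

```python
import csv
import io

# A toy tab-delimited SDRF-style table. Column names are illustrative,
# not an exact MAGE-TAB header set.
sdrf_text = (
    "Source Name\tCharacteristics[organism]\tAssay Name\n"
    "sample1\tHomo sapiens\tassay1\n"
    "sample2\tMus musculus\tassay2\n"
)

def read_sdrf(text):
    """Parse a tab-delimited table into a list of per-row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

rows = read_sdrf(sdrf_text)
print(rows[0]["Source Name"])   # sample1
print(len(rows))                # 2
```

Because each row becomes a dict keyed by the header, curation scripts can check or update individual annotation columns without any format-specific parser.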
1. Materials Informatics uses Python tools like RDKit for analyzing molecular structures and properties.
2. ORGAN and MolGAN are two generative models that use GANs to generate novel molecular structures based on SMILES strings, with ORGAN incorporating reinforcement learning to optimize for desired properties.
3. Tools like RDKit enable analyzing molecular fingerprints and descriptors that can be used for machine learning applications in materials informatics.
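Fingerprints of the kind RDKit produces are typically compared with Tanimoto similarity before being fed to ML models. As a library-free illustration, here is Tanimoto similarity over fingerprints represented as sets of "on" bit positions; the bit positions below are invented, not real RDKit output:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints,
    each represented as a set of 'on' bit positions."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Made-up fingerprints for two hypothetical molecules.
mol1 = {1, 4, 9, 16, 25}
mol2 = {1, 4, 9, 36, 49}
print(tanimoto(mol1, mol2))  # 3 shared bits / 7 total bits = 0.428...
```

The same pairwise similarity matrix is what clustering or nearest-neighbour models in materials informatics typically consume.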
The US-EPA Chemicals Dashboard is an integrated data hub providing centralized access to environmental chemistry data to support EPA and partner decision making. It contains nearly 900,000 chemical substances with experimental and predicted physical/chemical properties, hazard, exposure, and toxicity data. Users can perform various search types including basic searches by chemical name/identifier, structure-based searches, and batch searches for lists of chemicals. Detailed chemical pages display curated data from sources like ToxCast/Tox21 along with linked analogous chemicals and real-time predictive models. The goal is to improve efficiency in chemical risk assessment through easy access to this centralized chemical data resource.
OpenML is a platform that aims to organize machine learning data, experiments, and models. It provides APIs and tools to allow users to easily find and reuse datasets, run algorithms on tasks to generate model evaluations, and share results. All experiments are automatically logged and linked, enabling comparisons to other results and improving reproducibility. The goal is for OpenML to enhance ML research by removing friction and facilitating collaboration through its organized resources and network effects.
The document discusses how the EPA's CompTox Chemicals Dashboard can be used to support mass spectrometry analyses for structure identification. The Dashboard contains data on over 800,000 chemicals including properties, lists, and links to other resources. It allows searching by formula, structure, and mass to find related chemicals. Candidate structures can be ranked using metadata. Predicted mass spectra from over 800,000 structures may also be accessible. The Dashboard integrates data to help identify unknown chemicals detected by mass spectrometry.
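Mass-based candidate searching of the kind the Dashboard offers boils down to filtering a formula list by a parts-per-million mass window. A minimal sketch with a made-up three-compound candidate list (the masses are rounded textbook monoisotopic values, not Dashboard data):

```python
# Monoisotopic masses of a few example formulas (rounded; illustrative only).
candidates = {
    "C8H10N4O2 (caffeine)": 194.0804,
    "C9H8O4 (aspirin)":     180.0423,
    "C6H12O6 (glucose)":    180.0634,
}

def match_mass(measured, candidates, ppm=10.0):
    """Return candidates whose monoisotopic mass lies within
    `ppm` parts-per-million of the measured mass."""
    hits = []
    for name, mass in candidates.items():
        error_ppm = abs(measured - mass) / mass * 1e6
        if error_ppm <= ppm:
            hits.append((name, round(error_ppm, 2)))
    return hits

print(match_mass(180.0633, candidates))  # glucose matches; aspirin is ~117 ppm off
```

Note how a tight ppm window separates the two isobaric candidates near 180 Da, which is exactly why high mass accuracy narrows candidate lists so effectively.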
The phrase “Big Data” is generally used to describe a large volume of structured and/or unstructured data that cannot be processed using traditional database and software techniques. In the domain of chemistry, the Royal Society of Chemistry certainly hosts large structured databases of chemistry data, for example ChemSpider, as well as unstructured content in the form of our collection of scientific articles. Our research literature provides value to its readership and, at present, as an example of one of our databases, ChemSpider is accessed by many tens of thousands of scientists every day. But do these collections constitute “Big Data”, or is it the potential that lies within the collections that can contribute to the Big Data movement? This presentation will discuss our activities to contribute both data, and service-based access to our data sets, to support grant-based projects such as the Innovative Medicines Initiative Open PHACTS project (to support drug discovery) and the PharmaSea initiative (to identify novel natural products from the ocean). We will also provide an overview of our activities to perform data mining of public patent collections and examine what can be done with the data. We are presently extracting physicochemical properties and textual forms of NMR spectra and, with the resulting data, are building predictive models (for melting points at present) and assembling a large NMR spectral database containing many hundreds of thousands of spectrum-structure pairs. Our experiences to date have demonstrated that we are working at the edge of current algorithmic and computing capabilities for predictive model building, with over a quarter of a million melting points producing a matrix of over 200 billion descriptors. Our work to produce the NMR spectral database will necessitate batch processing of the data to examine consistency between the spectrum-structure pairs and other forms of data validation.
The intention is to take our experiences in this work applied to a public patents corpus and apply it to the RSC back file of publications to mine data and enable new paths to the discoverability of both data and the associated publications.
The document provides an overview of a METASPACE training guide on metabolite annotation in high-resolution imaging mass spectrometry data. It covers three parts: introduction to the METASPACE platform and annotation process, a tutorial on using the annotation engine and knowledgebase, and exporting data to the required imzML format from different mass spectrometers. The tutorial teaches participants how to prepare and submit data, browse results, and interpret annotations from the METASPACE bioinformatics tools in order to annotate metabolites in imaging MS data.
ProFET - Protein Feature Engineering Toolkit (Dan Ofer)
Summary of the ProFET project.
This is a newly developed toolkit for end-to-end machine learning and feature extraction from proteins.
The Code can be freely downloaded here:
https://github.com/ddofer/ProFET
Integrative information management for systems biology (Neil Swainston)
The MCISB develops experimental and computational technologies in systems biology. It employs 9.5 multidisciplinary people to develop kinetic models of yeast metabolism using genome-scale SBML models annotated with MIRIAM standards. The modeling process involves identifying pathways to model, associating models with functions and parameter values, and analyzing/simulating resulting models.
The document discusses various research projects involving the automated design and optimization of complex physical, chemical, and biological systems using evolutionary algorithms and machine learning techniques. It describes current and planned usage of computer clusters to run simulations and experiments for protein structure prediction, software self-assembly, and modeling physico-chemical systems through evolutionary optimization of parameters. The research requires significant computational resources to process large datasets and evaluate models in parallel.
Informatics In The Manchester Centre For Integrative Systems Biology (Neil Swainston)
The MCISB employs 9.5 multidisciplinary people who share an office and lab. They follow an iterative and integrative approach to develop an annotated, kinetic model of yeast metabolism. To integrate experimental data with models, they utilize unique, public identifiers for molecules from databases like ChEBI and UniProt. They have developed tools like KineticsWizard to help experimentalists capture identifier data. Their annotated yeast model follows MIRIAM standards to unambiguously identify over 2000 molecules with database references.
This document summarizes a presentation about OpenML, an online platform for sharing machine learning data and experiments. OpenML allows users to search datasets, build machine learning models using various tools/APIs, run experiments on tasks, and automatically upload results. This facilitates reproducibility, benchmarking, and reuse of prior work. OpenML also aims to advance automated machine learning through meta-learning techniques that leverage the large amount of shared data and experiments.
The document discusses metagenomics analysis tools and challenges. It summarizes several metagenome analysis portals that provide computational analysis and public sample databases. It also discusses the rapid growth of metagenomic data being produced, challenges around quality control, feature identification, characterization and presentation of metagenomic data, and the need for standardized metadata and data formats. The future directions highlighted include studying strain variation, expanding metadata capture and standards, and developing improved assembly, binning and analysis methods.
The document describes a fully automated system called AutoChrom for chromatographic method development using LC/MS/DAD detection. AutoChrom aims to streamline the method development workflow by automating instrument control, interpreting data, managing information, and contributing to workflow. It utilizes several techniques including mutual automated peak matching of UV and MS data, composite chromatograms of multiple samples, and concise reporting to organize and communicate results from complex method development projects involving large amounts of hyphenated data.
This document discusses genomic meta-analysis and summarization techniques. It introduces MetaQC for quality control, MetaDE for detecting differentially expressed genes through meta-analysis, and MetaPCA for integrative visualization of multiple genomic studies. MetaQC uses quality measures to determine inclusion/exclusion of studies in meta-analysis. MetaDE detects biomarkers statistically significant across studies using Fisher's and adaptive weighting methods. MetaPCA integrates multiple genomic datasets by finding a common principal component space.
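Fisher's method, one of the combination rules mentioned for MetaDE, pools per-study p-values into a single statistic, X^2 = -2 * sum(ln p_i), which under the null follows a chi-square distribution with 2k degrees of freedom for k studies. A minimal sketch computing the statistic (the tail probability, which needs a chi-square CDF, is omitted; the p-values are invented):

```python
import math

def fisher_statistic(p_values):
    """Fisher's combined test statistic: X^2 = -2 * sum(ln p_i).
    Under the null it follows a chi-square distribution with
    2 * len(p_values) degrees of freedom."""
    return -2.0 * sum(math.log(p) for p in p_values)

# Three hypothetical studies reporting p-values for the same gene.
p_values = [0.01, 0.04, 0.20]
stat = fisher_statistic(p_values)
df = 2 * len(p_values)
print(round(stat, 2), df)  # 18.87 6
```

A gene is then flagged as a cross-study biomarker when this combined statistic exceeds the chosen chi-square critical value.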
Standarization in Proteomics: From raw data to metadata files (Yasset Perez-Riverol)
The document discusses various mass spectrometry file formats used in proteomics workflows, including the advantages of XML-based formats like mzML and mzIdentML that support metadata and can be read by different software. It also describes challenges with proprietary binary formats and efforts to develop common data standards and APIs through projects like ProteoWizard, PRIDE, and the ms-core-api library. Standard file formats are important for sharing and reusing proteomics data over time as instrumentation and software evolve.
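mzML records spectrum metadata as controlled-vocabulary cvParam elements inside XML, which is why generic XML tooling can read it. A toy sketch extracting "ms level" values from an abridged mzML-style fragment (real files add a namespace, an index wrapper, and base64-encoded peak arrays, all omitted here):

```python
import xml.etree.ElementTree as ET

# An abridged, mzML-style fragment -- not a complete or valid mzML document.
xml_text = """
<spectrumList count="2">
  <spectrum id="scan=1">
    <cvParam accession="MS:1000511" name="ms level" value="1"/>
  </spectrum>
  <spectrum id="scan=2">
    <cvParam accession="MS:1000511" name="ms level" value="2"/>
  </spectrum>
</spectrumList>
"""

root = ET.fromstring(xml_text)
levels = {
    spec.get("id"): param.get("value")
    for spec in root.iter("spectrum")
    for param in spec.iter("cvParam")
    if param.get("name") == "ms level"
}
print(levels)  # {'scan=1': '1', 'scan=2': '2'}
```

The controlled-vocabulary accession numbers are what lets different software agree on the meaning of each field, which is the core argument for these open formats over proprietary binaries.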
Use of spark for proteomic scoring seattle presentation (lordjoe)
This document discusses using Apache Spark to parallelize proteomic scoring, which involves matching tandem mass spectra against a large database of peptides. The author developed a version of the Comet scoring algorithm and implemented it on a Spark cluster. This outperformed single machines by over 10x, allowing searches that took 8 hours to be done in under 30 minutes. Key considerations for running large jobs in parallel on Spark are discussed, such as input formatting, accumulator functions for debugging, and smart partitioning of data. The performance improvements allow searching larger databases and considering more modifications.
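The "smart partitioning" point can be sketched without Spark: grouping spectra by precursor-mass bin means each partition only needs to be scored against the matching slice of the peptide database. The bin width and masses below are arbitrary illustration values, not from the talk:

```python
from collections import defaultdict

def partition_by_mass(spectra, bin_width=50.0):
    """Group spectra into partitions keyed by precursor-mass bin, so each
    partition can be scored against only the matching database slice."""
    partitions = defaultdict(list)
    for spec_id, precursor_mass in spectra:
        partitions[int(precursor_mass // bin_width)].append(spec_id)
    return dict(partitions)

# (spectrum id, precursor mass) pairs -- invented values.
spectra = [("s1", 820.4), ("s2", 845.1), ("s3", 901.7), ("s4", 1204.9)]
print(partition_by_mass(spectra))
# {16: ['s1', 's2'], 18: ['s3'], 24: ['s4']}
```

In a real Spark job the same key would drive a `partitionBy`-style shuffle, keeping each executor's working set small enough to hold its database slice in memory.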
The document describes a software system being developed to visually monitor the workload of cores in a high-performance manycore computer architecture. The system receives data about the state of cores in a computing system, analyzes the data, and displays it visually with remote web access. Compared to other software for visually monitoring multiprocessor systems, this system provides a visual display of processed data on the state of cores based on analysis of inter-core messages and characteristics of individual cores. The system is being developed using Microsoft Visual Studio 2008 on a 16-core Windows cluster at Polytechnic University and will aid in analyzing and monitoring complex systems and their components during different workload modes.
1) Data analytics involves treating available digital data as a "gold mine" to obtain tangible outputs that can improve business efficiency when applied. Machine learning uses algorithms to correlate parameters in data and improve relationships.
2) The document provides an overview of getting started in data science, covering business objectives, statistical analysis, programming tools like R and Python, and problem-solving approaches like supervised and unsupervised learning.
3) It describes the iterative "rule of seven" process for data science projects, including collecting/preparing data, exploring/analyzing it, transforming features, applying models, evaluating performance, and visualizing results.
Metabolomic Data Analysis Workshop and Tutorials (2014) (Dmitry Grapov)
This document provides an introduction and overview of tutorials for metabolomic data analysis. It discusses downloading required files and software. The goals of the analysis include using statistical and multivariate analyses to identify differences between sample groups and impacted biochemical domains. It also discusses various data analysis techniques including data quality assessment, univariate and multivariate statistical analyses, clustering, principal component analysis, partial least squares modeling, functional enrichment analysis, and network mapping.
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ... (Spark Summit)
Spark 2.0 provided strong performance enhancements to the Spark core while advancing Spark ML usability through data frames. But what happens when you run Spark 2.0 machine learning algorithms on a large cluster with a very large data set? Do you even get any benefit from using a very large data set? It depends. How do new hardware advances affect the topology of high-performance Spark clusters? In this talk we will explore Spark 2.0 Machine Learning at scale and share our findings with the community.
As our test platform we will be using a new cluster design, different from typical Hadoop clusters, with more cores, more RAM, latest-generation NVMe SSDs and a 100GbE network, with the goal of more performance in a more space- and energy-efficient footprint.
Automating Machine Learning - Is it feasible? (Manuel Martín)
Facing a machine learning problem for the first time can be overwhelming. Hundreds of methods exist for tackling problems such as classification, regression or clustering. Selecting the appropriate method is challenging, especially if little prior knowledge is available. In addition, most models require optimising a number of hyperparameters to perform well. Preparing the data for the learning algorithm is also a labour-intensive process that includes cleaning outliers and imperfections, feature selection, and data transformations such as PCA. A workflow connecting preprocessing methods and predictive models is called a multicomponent predictive system (MCPS). This talk introduces the problem of automating the composition and optimisation of MCPSs, and also how they can be adapted in changing environments.
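The MCPS idea can be sketched by composing preprocessing steps and a final transform into one callable chain. The particular steps below, outlier clipping and min-max scaling, are illustrative choices, not components from the talk:

```python
def make_pipeline(*steps):
    """Compose steps into a single callable MCPS.
    Each step maps a list of numbers to a list of numbers."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run

# Illustrative components: clip outliers, then scale to [0, 1].
def clip_outliers(xs, lo=0.0, hi=100.0):
    return [min(max(x, lo), hi) for x in xs]

def min_max_scale(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

pipeline = make_pipeline(clip_outliers, min_max_scale)
print(pipeline([-5.0, 20.0, 60.0, 250.0]))  # [0.0, 0.2, 0.6, 1.0]
```

Automating MCPS composition then amounts to searching over which steps to include, in what order, and with what hyperparameters (here, the clipping bounds).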
A good foundation has been established for both data mining research and genuine application-based data mining. The current functionality of EMADS is limited to classification and Meta-ARM. The research team is at present working towards increasing the diversity of mining tasks that EMADS can address. There are many directions in which the work can be (and is being) taken forward. One interesting direction is to build on the wealth of distributed data mining research that is currently available and progress this in an MAS context. The research team is also enhancing the system's robustness so as to make it publicly available. It is hoped that once the system is live, other interested data mining practitioners will be prepared to contribute algorithms and data.
Fostering Serendipity through Big Linked Data (Muhammad Saleem)
This document discusses fostering serendipity through linking large biomedical datasets. It linked over 30 billion triples from The Cancer Genome Atlas (TCGA) and over 23 million publications from PubMed. It developed an architecture called TopFed to continuously integrate new data through parallel querying. TopFed was evaluated against the FedX system and shown to have significantly better performance, with query runtimes over 75 times faster for some queries. A visualization interface was also created to explore the linked data.
The document discusses various methods and challenges for identifying compounds based on limited information such as mass, name, fingerprint, or spectral data. It describes searching public databases, calculating elemental compositions, comparing spectra, and predicting fragmentation patterns to identify molecules or narrow down candidates. Even with this information, unique identification can be difficult, and integration of additional data types may be needed.
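Calculating candidate elemental compositions from an accurate mass can be brute-forced for small molecules. A toy CHNO enumerator, where the element ranges and the 0.01 Da tolerance are arbitrary choices for illustration:

```python
from itertools import product

# Monoisotopic masses of common elements (values to 4 decimal places).
MASS = {"C": 12.0000, "H": 1.0078, "N": 14.0031, "O": 15.9949}

def compositions(target, tol=0.01, max_counts=(12, 24, 6, 6)):
    """Enumerate CHNO formulas whose monoisotopic mass falls within
    `tol` Da of `target`. Element ranges kept tiny to stay fast."""
    hits = []
    for c, h, n, o in product(*(range(m + 1) for m in max_counts)):
        mass = c * MASS["C"] + h * MASS["H"] + n * MASS["N"] + o * MASS["O"]
        if abs(mass - target) <= tol:
            hits.append(f"C{c}H{h}N{n}O{o}")
    return hits

print(compositions(46.0419))
# ['C0H4N3O0', 'C2H6N0O1'] -- the latter is ethanol
```

Even this tiny example yields two candidates for one mass, which illustrates why composition alone rarely gives a unique identification and why spectral or fragmentation evidence is usually needed as well.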
This document describes the design and implementation of an integrated system called MPDB for the storage and analysis of metabolomics data. MPDB was created as a free open-source laboratory information system tailored for the metabolomics workflow. It includes tools for raw data cleanup, compound identification, peak alignment across samples, data normalization, and statistical analysis. The system pipeline allows users to efficiently store large amounts of analytical results and associated biological metadata, perform multi-sample analysis and data mining, and gain new biological insights from metabolomics experiments. As an example application, the document outlines a study analyzing the effects of nitrogen stress on the leaf metabolism of Populus trees using MPDB.
Microbial interaction
Microorganisms interact with each other and can be physically associated with other organisms in a variety of ways.
One organism can be located on the surface of another organism as an ectobiont, or within another organism as an endobiont.
Microbial interactions may be positive, such as mutualism, proto-cooperation and commensalism, or negative, such as parasitism, predation or competition.
Types of microbial interaction
Positive interaction: mutualism, proto-cooperation, commensalism
Negative interaction: Ammensalism (antagonism), parasitism, predation, competition
I. Mutualism:
It is defined as a relationship in which each organism in the interaction benefits from the association. It is an obligatory relationship in which mutualist and host are metabolically dependent on each other.
A mutualistic relationship is very specific: one member of the association cannot be replaced by another species.
Mutualism requires close physical contact between the interacting organisms.
A mutualistic relationship allows organisms to exist in habitats that could not be occupied by either species alone.
A mutualistic relationship also allows the partners to act as a single organism.
Examples of mutualism:
i. Lichens:
Lichens are excellent example of mutualism.
They are an association of specific fungi with certain genera of algae. In a lichen, the fungal partner is called the mycobiont and the algal partner is called the phycobiont.
II. Syntrophism:
It is an association in which the growth of one organism either depends on, or is improved by, a substrate provided by another organism.
In syntrophism, both organisms in the association benefit.
Compound A → (utilized by population 1) → Compound B → (utilized by population 2) → Compound C → (utilized by both populations 1+2) → Products
In this theoretical example of syntrophism, population 1 is able to utilize and metabolize compound A, forming compound B, but cannot metabolize beyond compound B without the co-operation of population 2. Population 2 is unable to utilize compound A, but it can metabolize compound B, forming compound C. Together, populations 1 and 2 are able to carry out a metabolic reaction leading to an end product that neither population could produce alone.
Examples of syntrophism:
i. Methanogenic ecosystem in sludge digester
Methane production by methanogenic bacteria depends upon interspecies hydrogen transfer from other, fermentative bacteria.
Anaerobic fermentative bacteria generate CO2 and H2 from carbohydrates; these are then utilized by methanogenic bacteria (e.g. Methanobacter) to produce methane.
ii. Lactobacillus arabinosus and Enterococcus faecalis:
In minimal medium, Lactobacillus arabinosus and Enterococcus faecalis are able to grow together but not alone.
A synergistic relationship occurs in which E. faecalis requires folic acid, which is produced by L. arabinosus, while L. arabinosus in turn requires phenylalanine, which is produced by E. faecalis.
The document describes a software system being developed to visually monitor the workload of cores in a high-performance manycore computer architecture. The system receives data about the state of cores in a computing system, analyzes the data, and displays it visually with remote web access. Compared to other software for visually monitoring multiprocessor systems, this system provides a visual display of processed data on the state of cores based on analysis of inter-core messages and characteristics of individual cores. The system is being developed using Microsoft Visual Studio 2008 on a 16-core Windows cluster at Polytechnic University and will aid in analyzing and monitoring complex systems and their components during different workload modes.
1) Data analytics involves treating available digital data as a "gold mine" to obtain tangible outputs that can improve business efficiency when applied. Machine learning uses algorithms to correlate parameters in data and improve relationships.
2) The document provides an overview of getting started in data science, covering business objectives, statistical analysis, programming tools like R and Python, and problem-solving approaches like supervised and unsupervised learning.
3) It describes the iterative "rule of seven" process for data science projects, including collecting/preparing data, exploring/analyzing it, transforming features, applying models, evaluating performance, and visualizing results.
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
This document provides an introduction and overview of tutorials for metabolomic data analysis. It discusses downloading required files and software. The goals of the analysis include using statistical and multivariate analyses to identify differences between sample groups and impacted biochemical domains. It also discusses various data analysis techniques including data quality assessment, univariate and multivariate statistical analyses, clustering, principal component analysis, partial least squares modeling, functional enrichment analysis, and network mapping.
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...Spark Summit
Spark 2.0 provided strong performance enhancements to the Spark core while advancing Spark ML usability to use data frames. But what happens when you run Spark 2.0 machine learning algorithms on a large cluster with a very large data set? Do you even get any benefit from using a very large data set? It depends. How do new hardware advances affect the topology of high performance Spark clusters. In this talk we will explore Spark 2.0 Machine Learning at scale and share our findings with the community.
As our test platform we will be using a new cluster design, different from typical Hadoop clusters, with more cores, more RAM and latest generation NVMe SSD’s and a 100GbE network with a goal of more performance, in a more space and energy efficient footprint.
Automating Machine Learning - Is it feasible?Manuel Martín
Facing a machine learning problem for the first time can be overwhelming. Hundreds of methods exist for tackling problems such as classification, regression or clustering. Selecting the appropriate method is challenging, specially if no much prior knowledge is known. In addition, most models require to optimise a number of hyperparameters to perform well. Preparing the data for the learning algorithm is also a labour-intensive process that includes cleaning outliers and imperfections, feature selection, data transformation like PCA and more. A workflow connecting preprocessing methods and predictive models is called a multicomponent predictive system (MCPS). This talk introduces the problem of automating the composition and optimisation of MCPSs and also how they can be adapted in changing environments.
A good foundation has been established for both data mining research and genuine
application based data mining. The current functionality of EMADS is limited
to classification and Meta-ARM. The research team is at present working towards
increasing the diversity of mining tasks that EMADS can address. There are many
directions in which the work can (and is being) taken forward. One interesting direction
is to build on the wealth of distributed data mining research that is currently
available and progress this in an MAS context. The research team are also enhancing
the system’s robustness so as to make it publicly available. It is hoped that once
the system is live other interested data mining practitioners will be prepared to contribute
algorithms and data.
Fostering Serendipity through Big Linked DataMuhammad Saleem
This document discusses fostering serendipity through linking large biomedical datasets. It linked over 30 billion triples from The Cancer Genome Atlas (TCGA) and over 23 million publications from PubMed. It developed an architecture called TopFed to continuously integrate new data through parallel querying. TopFed was evaluated against the FedX system and shown to have significantly better performance, with query runtimes over 75 times faster for some queries. A visualization interface was also created to explore the linked data.
The document discusses various methods and challenges for identifying compounds based on limited information such as mass, name, fingerprint, or spectral data. It describes searching public databases, calculating elemental compositions, comparing spectra, and predicting fragmentation patterns to identify molecules or narrow down candidates. Even with this information, unique identification can be difficult, and integration of additional data types may be needed.
This document describes the design and implementation of an integrated system called MPDB for the storage and analysis of metabolomics data. MPDB was created as a free open-source laboratory information system tailored for the metabolomics workflow. It includes tools for raw data cleanup, compound identification, peak alignment across samples, data normalization, and statistical analysis. The system pipeline allows users to efficiently store large amounts of analytical results and associated biological metadata, perform multi-sample analysis and data mining, and gain new biological insights from metabolomics experiments. As an example application, the document outlines a study analyzing the effects of nitrogen stress on the leaf metabolism of Populus trees using MPDB.
Similar to ARCHIVED: new version available. 2016 - METASPACE Training Course (20)
Microbial interaction
Microorganisms interacts with each other and can be physically associated with another organisms in a variety of ways.
One organism can be located on the surface of another organism as an ectobiont or located within another organism as endobiont.
Microbial interaction may be positive such as mutualism, proto-cooperation, commensalism or may be negative such as parasitism, predation or competition
Types of microbial interaction
Positive interaction: mutualism, proto-cooperation, commensalism
Negative interaction: Ammensalism (antagonism), parasitism, predation, competition
I. Mutualism:
It is defined as the relationship in which each organism in interaction gets benefits from association. It is an obligatory relationship in which mutualist and host are metabolically dependent on each other.
Mutualistic relationship is very specific where one member of association cannot be replaced by another species.
Mutualism require close physical contact between interacting organisms.
Relationship of mutualism allows organisms to exist in habitat that could not occupied by either species alone.
Mutualistic relationship between organisms allows them to act as a single organism.
Examples of mutualism:
i. Lichens:
Lichens are excellent example of mutualism.
They are the association of specific fungi and certain genus of algae. In lichen, fungal partner is called mycobiont and algal partner is called
II. Syntrophism:
It is an association in which the growth of one organism either depends on or improved by the substrate provided by another organism.
In syntrophism both organism in association gets benefits.
Compound A
Utilized by population 1
Compound B
Utilized by population 2
Compound C
utilized by both Population 1+2
Products
In this theoretical example of syntrophism, population 1 is able to utilize and metabolize compound A, forming compound B but cannot metabolize beyond compound B without co-operation of population 2. Population 2is unable to utilize compound A but it can metabolize compound B forming compound C. Then both population 1 and 2 are able to carry out metabolic reaction which leads to formation of end product that neither population could produce alone.
Examples of syntrophism:
i. Methanogenic ecosystem in sludge digester
Methane produced by methanogenic bacteria depends upon interspecies hydrogen transfer by other fermentative bacteria.
Anaerobic fermentative bacteria generate CO2 and H2 utilizing carbohydrates which is then utilized by methanogenic bacteria (Methanobacter) to produce methane.
ii. Lactobacillus arobinosus and Enterococcus faecalis:
In the minimal media, Lactobacillus arobinosus and Enterococcus faecalis are able to grow together but not alone.
The synergistic relationship between E. faecalis and L. arobinosus occurs in which E. faecalis require folic acid
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills MN
By harnessing the power of High Flux Vacuum Membrane Distillation, Travis Hills from MN envisions a future where clean and safe drinking water is accessible to all, regardless of geographical location or economic status.
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...Sérgio Sacani
Wereport the study of a huge optical intraday flare on 2021 November 12 at 2 a.m. UT in the blazar OJ287. In the binary black hole model, it is associated with an impact of the secondary black hole on the accretion disk of the primary. Our multifrequency observing campaign was set up to search for such a signature of the impact based on a prediction made 8 yr earlier. The first I-band results of the flare have already been reported by Kishore et al. (2024). Here we combine these data with our monitoring in the R-band. There is a big change in the R–I spectral index by 1.0 ±0.1 between the normal background and the flare, suggesting a new component of radiation. The polarization variation during the rise of the flare suggests the same. The limits on the source size place it most reasonably in the jet of the secondary BH. We then ask why we have not seen this phenomenon before. We show that OJ287 was never before observed with sufficient sensitivity on the night when the flare should have happened according to the binary model. We also study the probability that this flare is just an oversized example of intraday variability using the Krakow data set of intense monitoring between 2015 and 2023. We find that the occurrence of a flare of this size and rapidity is unlikely. In machine-readable Tables 1 and 2, we give the full orbit-linked historical light curve of OJ287 as well as the dense monitoring sample of Krakow.
PPT on Direct Seeded Rice presented at the three-day 'Training and Validation Workshop on Modules of Climate Smart Agriculture (CSA) Technologies in South Asia' workshop on April 22, 2024.
Immersive Learning That Works: Research Grounding and Paths ForwardLeonel Morgado
We will metaverse into the essence of immersive learning, into its three dimensions and conceptual models. This approach encompasses elements from teaching methodologies to social involvement, through organizational concerns and technologies. Challenging the perception of learning as knowledge transfer, we introduce a 'Uses, Practices & Strategies' model operationalized by the 'Immersive Learning Brain' and ‘Immersion Cube’ frameworks. This approach offers a comprehensive guide through the intricacies of immersive educational experiences and spotlighting research frontiers, along the immersion dimensions of system, narrative, and agency. Our discourse extends to stakeholders beyond the academic sphere, addressing the interests of technologists, instructional designers, and policymakers. We span various contexts, from formal education to organizational transformation to the new horizon of an AI-pervasive society. This keynote aims to unite the iLRN community in a collaborative journey towards a future where immersive learning research and practice coalesce, paving the way for innovative educational research and practice landscapes.
Anti-Universe And Emergent Gravity and the Dark UniverseSérgio Sacani
Recent theoretical progress indicates that spacetime and gravity emerge together from the entanglement structure of an underlying microscopic theory. These ideas are best understood in Anti-de Sitter space, where they rely on the area law for entanglement entropy. The extension to de Sitter space requires taking into account the entropy and temperature associated with the cosmological horizon. Using insights from string theory, black hole physics and quantum information theory we argue that the positive dark energy leads to a thermal volume law contribution to the entropy that overtakes the area law precisely at the cosmological horizon. Due to the competition between area and volume law entanglement the microscopic de Sitter states do not thermalise at sub-Hubble scales: they exhibit memory effects in the form of an entropy displacement caused by matter. The emergent laws of gravity contain an additional ‘dark’ gravitational force describing the ‘elastic’ response due to the entropy displacement. We derive an estimate of the strength of this extra force in terms of the baryonic mass, Newton’s constant and the Hubble acceleration scale a0 = cH0, and provide evidence for the fact that this additional ‘dark gravity force’ explains the observed phenomena in galaxies and clusters currently attributed to dark matter.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...Scintica Instrumentation
Targeting Hsp90 and its pathogen Orthologs with Tethered Inhibitors as a Diagnostic and Therapeutic Strategy for cancer and infectious diseases with Dr. Timothy Haystead.
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
3. Part 1: Theory
14:00-14:10 Welcome
14:10-14:15 Introduction into the METASPACE project
14:15-14:45 Metabolite annotation in HR imaging MS
14:45-15:00 Overview of the annotation engine
Coffee Break 15:00-15:30
Part 2: Tutorial
15:30-16:30 Step-by-step analysis of datasets provided in advance, questions
- Data requirements: 10 min
- Upload UI: 5 min
- Webapp UI: 15 min
- Interpretation: 15 min
- Split into 2 groups: 5 min
- Export to imzML, ideally parallel sessions: 15 min
- SCiLS, FTICR
- EMBL, Orbitrap
Coffee Break 16:35-17:00
Part 3: Hands-on training
17:00-18:00 questions, data analysis
Agenda
5. What we hope you will learn today
● Ins and outs of metabolite annotation in HR imaging MS
● Bioinformatics we developed for this problem
○ Metabolite Signal Match (MSM) score
○ False Discovery Rate estimation
○ FDR-controlled annotation
● The online engine we implemented
○ How to prepare data for submission to our service
○ How to submit your data
○ How to view molecular annotations in our webapp
6. Project overview: slides on slideshare
Bioinformatics: slides on slideshare
Theodore Alexandrov (EMBL, UCSD, SCiLS)
8. Outline
● Inputs (data and metadata)
● Online Software
● Data Submission
● Annotation Browsing
● Use Cases
a. mouse brain, MALDI-FTICR (UoR1)
b. human colorectal tumor, DESI-Orbitrap (ICL)
18. Agenda (recap of slide 3)
22. Data Requirements
Data format
- imzML, centroided
- vendor centroiding preferred
- conversion instructions: http://metasp.eu/imzml
- background on imzML: http://imzml.org/wp/introduction/
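"Centroided" means each profile-mode peak has been reduced to a single (m/z, intensity) pair. Vendor centroiding is strongly preferred; purely as an illustration of the idea (not the engine's or any vendor's algorithm), a minimal local-maximum picker could look like:

```python
def centroid(mzs, intensities, min_intensity=0.0):
    """Reduce a profile-mode spectrum to centroids at local maxima.

    A peak is kept where the intensity is a local maximum above the
    noise threshold. Toy sketch only; vendor centroiding is preferred.
    """
    peaks = []
    for i in range(1, len(mzs) - 1):
        y = intensities[i]
        if y > intensities[i - 1] and y >= intensities[i + 1] and y > min_intensity:
            peaks.append((mzs[i], y))
    return peaks

profile_mz = [100.00, 100.01, 100.02, 100.03, 100.04]
profile_int = [0.0, 5.0, 20.0, 4.0, 0.0]
print(centroid(profile_mz, profile_int))  # → [(100.02, 20.0)]
```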
23. Customised Processing
Processing is tailored to your data!
- Technical metadata
- Resolving power → isotope prediction
- Polarity → adducts
Example: predicted isotope pattern of [C41H78NO7P+K]+ at R200 = 70K vs R200 = 280K
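Why the reported resolving power matters: peak width scales inversely with R (from the definition R = m/Δm), so isotope peaks that are resolved at 280K may merge at 70K. A toy calculation, assuming for simplicity that the quoted R applies at the peak's own m/z (real instruments' resolving power varies with m/z):

```python
def peak_fwhm(mz, resolving_power):
    """Peak full width at half maximum, from the definition R = m / FWHM."""
    return mz / resolving_power

# a lipid-range ion near m/z 750 (illustrative value)
mz = 750.0
for r in (70_000, 280_000):
    print(f"R = {r}: FWHM ≈ {peak_fwhm(mz, r) * 1000:.1f} mDa")
```

Reporting the wrong resolving power therefore makes the predicted (theoretical) isotope pattern mismatch the measured one, lowering annotation scores.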
24. Data Requirements
Your responsibility:
- Data is processed ‘as is’
- Check metadata is correct
- Report resolving power accurately (check within data-set)
- Low numbers of annotations often correspond to poor quality mass spectra
- Calibration inaccuracy
- Lock-mass errors
26. Data upload
1. Follow the conversion instructions for your instrument
2. Select the centroided files, imzML and ibd
3. Click the Upload button.
The dataset will be copied to the cloud storage (accessible only to our team)
27. Metadata form
● Appears once the upload is started
● Please fill it in truthfully
○ Most fields have an ‘Other…’ option
○ If you don’t want to disclose a field, put ‘-‘
● Click (at the very bottom)
30. Annotation table
Sign in with a Google ID to provide feedback
Callouts: currently selected molecule (click to select); MSM score; principal peak m/z
31. Sorting/filtering annotations
- Click on column headers to sort
- Start typing a formula or a metabolite name
- Filter by database or dataset
- Select an adduct
- Set a minimum MSM score
- Enter an m/z of interest
33. Details for highlighted annotation
Molecule distribution (sum of isotope images)
Putative metabolite IDs from the database
Feedback!
Thumbs up: reasonable
Thumbs down: dubious
- tell us why it could be wrong!
Feedback is not public
34. Visual insight into MSM score assignment
Callouts: adduct; exact m/z of each ion image; zoom plot; ion images for each isotope peak
Isotopic patterns:
- Blue: theoretical abundance (at instrument resolving power)
- Red: measured image intensity
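The MSM score summarised here is, as published, the product of three components: a measure of spatial chaos of the principal ion image, the spatial correlation between isotope ion images, and the spectral correlation between theoretical (blue) and measured (red) isotope intensities. A sketch of the spectral component and the final product; the variable names and example numbers are illustrative, not the engine's code:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def msm_score(rho_chaos, rho_spatial, rho_spectral):
    """MSM = product of the three component measures, each in [0, 1]."""
    return rho_chaos * rho_spatial * rho_spectral

theoretical = [100.0, 45.0, 12.0, 2.5]  # predicted isotope abundances (blue)
measured    = [ 98.0, 43.0, 14.0, 3.0]  # measured image intensities (red)
rho_spectral = max(pearson(theoretical, measured), 0.0)
print(msm_score(0.95, 0.90, rho_spectral))
```

A poor match in any one component (chaotic image, uncorrelated isotope images, or a wrong isotope pattern) drives the product, and hence the annotation's rank, toward zero.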
41. Results Browsing Summary
1. Choose database
2. Choose data-set
3. Type ‘PC’
a. molecular class filter
4. Type ‘PC(16:0/18:0)’
a. single metabolite filter
5. Select row of table
a. single ion filter
6. Simple comparison of spatial distributions
between adducts
Also possible
● Filter by m/z
● Formula search
● Comparison across datasets
43. FDR-Controlled Annotation
False Discovery Rate: the fraction of incorrect annotations,
FDR = nFalse / (nFalse + nTrue)
Control: request a set of annotations at a fixed estimated FDR
Setting the level:
- Adjust the number of molecules for follow-up analysis
- When only a limited number of molecules can be reviewed, adjust the
FDR so that fewer or greater numbers of molecules are annotated
- Compare annotations between datasets
- A principled way of selecting molecules to compare between
datasets
Figure: MSM score distributions of true annotations vs false discoveries; raising the MSM threshold lowers the estimated FDR (e.g. from 0.2 to 0.1)
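In practice the engine estimates nFalse using decoy (implausible) adducts; purely to build intuition, the definition above can be applied to hypothetical score lists to see how a threshold is chosen for a target FDR:

```python
def fdr(scores_true, scores_false, threshold):
    """Estimated FDR among annotations with MSM score >= threshold:
    FDR = nFalse / (nFalse + nTrue)."""
    n_true = sum(s >= threshold for s in scores_true)
    n_false = sum(s >= threshold for s in scores_false)
    total = n_true + n_false
    return n_false / total if total else 0.0

def threshold_for_fdr(scores_true, scores_false, target_fdr):
    """Smallest MSM threshold whose estimated FDR meets the target."""
    for t in sorted(scores_true + scores_false):
        if fdr(scores_true, scores_false, t) <= target_fdr:
            return t
    return None

# Hypothetical MSM scores: raising the threshold trades annotations for confidence
true_scores = [0.9, 0.8, 0.7, 0.5]
decoy_scores = [0.6, 0.3, 0.2]
print(threshold_for_fdr(true_scores, decoy_scores, 0.10))  # → 0.7
```

Lowering the target FDR raises the threshold and shrinks the annotation list; raising it admits more annotations at the cost of more false discoveries.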
44. Choice of metabolite database
- Synthesized/recorded: 88M (CAS Registry)
- Biologically occurring/active: 50M (PubChem compounds)
- Single biological system: 40K (HMDB)
- Sample specific: 1K (e.g. from LC-MS)
45. Choice of metabolite database
Impacts search and False-Discovery-Rate estimation
● Use one that’s relevant
● Larger database
○ more false hits → fewer annotations at a fixed FDR
● Different databases give different annotations
○ even for molecules in both databases due to FDR control
○ for data-set comparison, use the same database
46. Annotating at level of molecular formula
● Possibility of multiple metabolites per sum formula
○ webapp shows all hits from the database search (learn the ambiguity!)
○ other databases can be searched (e.g. PubChem)
○ use enrichment analysis to get biological leads
● Use an orthogonal technique for reporting individual metabolites
○ not directly integrated (yet)
○ use the web-app results to help target MS/MS studies (e.g. purchase of standards)
47. ● we annotate molecular formulae along with several putative metabolites
■ MSI Levels of classification:
1. identified metabolites
2. putatively annotated compounds
3. putatively characterised compound classes
4. unknown compounds
● In preparation: formal guidelines for reporting imaging mass spectrometry annotations
Guidelines for reporting
The role of reporting standards for metabolite annotation and identification in metabolomic studies,
Salek et al., 2013, GigaScience
48. ● Preparing data for submission
○ imzML export
○ metadata
● Submitting data
○ upload web-app
upload.metasp.eu
● Browsing results
○ results web-app
alpha.metasp.eu
Learning Summary
● METASPACE team:
○ web: metaspace2020.eu
○ email: contact@metaspace2020.eu
○ twitter: @metaspace2020
○ github: github.com/spatialmetabolomics
● FTICR data conversion
○ SCiLS: support@scils.de
● Orbitrap data conversion
○ Thermo Fisher Scientific:
kerstin.strupat@thermofisher.com
How to get help?
50. (Group 1) Export into imzML: FT-ICR data
Using SCiLS Lab’s METASPACE export
51. Export to METASPACE
● Export your centroided high-resolution spectra in the imzML format
● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b
● Best results in METASPACE when a peak list is used for centroiding
● Two different Bruker data formats
○ SQLite peak list data: Peak list provided during import
○ FT-ICR profile data: Generate a peak list after import
52. Create imzML file for METASPACE
● In the objects tab, click the export symbol of
the region to be exported and select
“Export to METASPACE”
● The Export Spectra dialog opens
● Set your normalization of choice
● Select your peak list of choice, for example “Imported Peaks” in the case of SQLite data
● Provide your scan polarity
● Click OK to save imzML file
53. SQLite peak list data
● Data must have been acquired with on-the-fly centroid detection,
i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set
● In SCiLS Lab a peak list “Imported peaks” is available, containing the most frequent peaks
(by default, all peaks appearing in more than 1% of spectra)
54. FT-ICR profile data
● Older Solarix files do not directly contain a peak list to perform centroiding
● Create a peak list with DataAnalysis (see SCiLS Lab Help, Section 7.4)
● Use METASPACE tool for peak finding
https://spatialmetabolomics.github.io/centroidize/
● Use other external tools (mMass, …)
● Import the external peak list into SCiLS Lab
File > Import > m/z intervals from CSV or Clipboard
55. Use METASPACE tool for peak finding
● Select the overview spectrum CSV exported from SCiLS
● Upload CSV file to METASPACE tool
● Copy values to clipboard
● Use File > Import > m/z intervals from CSV
57. SCiLS Cloud: Exchange within the Scientific Community
1. SCiLS Lab: computational analysis
2. SCiLS Cloud: data & results can be
shared and viewed in web browser, e.g.,
○ MALDI imaging data,
○ Discriminative m/z markers,
○ Regions of interest, …
Figures: comparison of mean spectra for ROIs; m/z images of co-localized ions
58. Future Vision: SCiLS Cloud and METASPACE
Diagram: from SCiLS Lab (statistical analysis), export data to imzML and upload to METASPACE; upload data and findings to SCiLS Cloud; prospect: a direct interface between the two
59. (Group 2) Export into imzML: Orbitrap data (.raw)
Instructions: metaspace2020.eu/imzML
Software tools:
imageQuest / raw-converter
- Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-)
imzmlConverter
- Recommended for: DESI/flowProbe with separate files per row
Recommended for bioinformaticians: pyimzML (Python parser)
63. This project has received funding from the European Union’s Horizon 2020 research and
innovation programme under grant agreement № 634402.
Acknowledgments
Example data was provided by:
University of Rennes 1
Regis Lavigne
Charles Pineau
EMBL
Ksenija Radic
Alexandra Koumoutsi
Andrew Palmer
EMBL
Theodore Alexandrov
Vitaly Kovalev
Artem Tarasov
Andrew Palmer
Dominik Fay
SCiLS
Dennis Trede
Jan Hendrik Kobarg