METASPACE training guide
2017
Theodore Alexandrov (EMBL, UCSD)
Andy Palmer (EMBL)
Vitaly Kovalev (EMBL)
Artem Tarasov (EMBL)
@METASPACE2020
Welcome everyone!
Artem Tarasov, “hacker”
Andy Palmer, “scientist”
Vitaly Kovalev, “developer”
Theodore Alexandrov, “leader”
Training Overview
Part 1: Introduction
Learning Outcomes
Introduction to the METASPACE project
Metabolite annotation in HR imaging MS (Bioinformatics)
Overview of the annotation engine
Part 2: METASPACE Annotation Platform
Tutorial
Learning Outcomes
Data requirements
Data Submission
Annotation Browsing
Interpretation
Part 3: Export to imzML
FTICR (Bruker)
Orbitrap (Thermo)
Other vendors
Introduction
What we hope you will learn today
● Ins and outs of metabolite annotation in HR imaging MS
● Bioinformatics we developed for this problem
○ Metabolite Signal Match (MSM) score
○ False Discovery Rate estimation
○ FDR-controlled annotation
● The online engine we implemented
○ How to prepare data for submission to our service
○ How to submit your data
Project overview: slides on slideshare
Bioinformatics: slides on slideshare
Overview of the annotation engine
Outline
● Inputs (data and metadata)
● Online Software
● Data Submission
● Annotation Browsing
● Use Cases
a. mouse brain, MALDI-FTICR (U Rennes 1)
b. human colorectal tumor, DESI-Orbitrap (ICL)
http://annotate.metaspace2020.eu
The METASPACE platform
[Diagram labels: imzML, metadata; upload; annotations database; task scheduler; explore annotations; 10 minutes; engine; Amazon Cloud]
Upload: Data
centroided imzML
http://ms-imaging.org/
Upload: Metadata
sample information
acquisition details
Upload: Interface annotate.metaspace2020.eu/#/upload
metadata collection
data upload
Example 1
Mouse brain
(MALDI-FTICR)
Data provided by
Regis Lavigne, Charles Pineau,
University of Rennes 1
Select an annotation
See the molecular
distribution
Example 2
Human colorectal tumor
(DESI-Orbitrap)
Data provided by
James McKenzie, Zoltan Takats,
Imperial College London
Filter different datasets
See data details
View metadata
Tutorial on how to use
METASPACE engine and molecular knowledgebase
Learning Outcomes
1. Preparing data for submission
2. Submitting data
3. Browsing results
Data Requirements
Imaging mass spectrometry data
- Any ionisation source
- Any spatial resolution
- Any tissue
- One section per dataset
Data Requirements
High resolving power
R_FWHM (@ m/z 400) > 90K
Well-calibrated
ideally < 3 ppm
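To make the calibration target concrete, mass error in ppm can be computed from a measured and a theoretical m/z. This is a minimal sketch; the function name and the example peak values are illustrative assumptions, not part of METASPACE.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical peak measured 0.0015 above a theoretical m/z of 766.5147
error = ppm_error(766.5162, 766.5147)
within_target = abs(error) < 3.0  # the "ideally < 3 ppm" guideline
print(f"{error:.2f} ppm, within target: {within_target}")
```

Checking a few known ions this way across a dataset gives a quick impression of whether calibration is close to the guideline.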
Data Requirements
Data Format
- imzML
Centroided
- vendor preferred
- http://metaspace2020.eu/imzml
http://imzml.org/wp/introduction/
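A quick way to see whether an imzML file declares itself as centroided is to look for the PSI-MS controlled vocabulary terms in its XML. The sketch below assumes the file uses the standard accessions MS:1000127 (centroid spectrum) and MS:1000128 (profile spectrum); it only reads the declared flag and does not inspect the spectra themselves. The function name and the tiny XML fragment are made up for illustration.

```python
import xml.etree.ElementTree as ET

CENTROID = "MS:1000127"  # PSI-MS term "centroid spectrum"
PROFILE = "MS:1000128"   # PSI-MS term "profile spectrum"


def spectrum_mode(imzml_xml: str) -> str:
    """Return 'centroid', 'profile' or 'unknown' from cvParam accessions."""
    root = ET.fromstring(imzml_xml)
    accessions = {
        el.get("accession")
        for el in root.iter()
        if el.tag.endswith("cvParam")  # tag carries the XML namespace prefix
    }
    if CENTROID in accessions:
        return "centroid"
    if PROFILE in accessions:
        return "profile"
    return "unknown"


# Tiny hand-written fragment for illustration; real files are much larger
example = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <fileDescription><fileContent>
    <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum"/>
  </fileContent></fileDescription>
</mzML>"""
print(spectrum_mode(example))
```

For a real dataset, the XML text would be read from the .imzML file; for very large files a streaming parser may be preferable.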
Customised Processing
Processing is tailored to your data!
- Technical metadata
- Resolving power
- isotope prediction
- Polarity
- adducts
R200=70K R200=280K
[C41H78NO7P+K]+
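The effect of adduct and resolving power can be worked through for the ion shown above. This sketch computes the m/z of [C41H78NO7P+K]+ from standard monoisotopic masses and estimates the peak width (FWHM) at that m/z for the two resolving powers. The scaling of resolving power with m/z is an assumption (square-root law for Orbitrap, linear for FT-ICR), and the function names are illustrative, not the engine's.

```python
import math

# Monoisotopic masses in Da (standard values, rounded)
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "P": 30.9737620, "K": 38.9637064}
ELECTRON = 0.00054858


def ion_mz(formula: dict, adduct: str) -> float:
    """m/z of the singly charged ion [M+adduct]+."""
    neutral = sum(MASS[el] * n for el, n in formula.items())
    return neutral + MASS[adduct] - ELECTRON


def fwhm_at(mz: float, r_ref: float, mz_ref: float = 200.0,
            analyzer: str = "orbitrap") -> float:
    """Peak width (FWHM) at mz, given resolving power r_ref at mz_ref."""
    ratio = mz_ref / mz
    # Assumed scaling: R ~ 1/sqrt(m/z) for Orbitrap, R ~ 1/(m/z) for FT-ICR
    r = r_ref * (math.sqrt(ratio) if analyzer == "orbitrap" else ratio)
    return mz / r


mz = ion_mz({"C": 41, "H": 78, "N": 1, "O": 7, "P": 1}, "K")
print(f"m/z = {mz:.4f}")
for r200 in (70_000, 280_000):
    print(f"R200 = {r200}: FWHM = {fwhm_at(mz, r200):.4f}")
```

A fourfold higher resolving power gives a fourfold narrower peak, which is why the predicted isotope pattern depends on the reported instrument settings.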
Data Requirements
Your responsibility:
- Data is processed ‘as is’
- Check metadata is correct
- Report resolving power accurately (check within data-set)
- Low numbers of annotations often correspond to poor quality mass spectra
- Calibration inaccuracy
- Lock-mass errors
Data Submission
1. Follow conversion instructions for your instrument
2. Select the centroided files, .imzML and .ibd
3. The dataset will be copied to the cloud storage
(accessible only to our team)
Data upload
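Since each dataset consists of an .imzML file and a matching .ibd file, it can help to check a folder for unpaired files before uploading. This is a small sketch using only the Python standard library; the function name and the file names in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def missing_ibd(folder: str) -> list:
    """Names of .imzML files in folder that lack a same-named .ibd file."""
    return sorted(
        p.name
        for p in Path(folder).glob("*.imzML")
        if not p.with_suffix(".ibd").exists()
    )


# Demo on a temporary folder with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ("brain.imzML", "brain.ibd", "tumor.imzML"):
        (Path(tmp) / name).touch()
    unpaired = missing_ibd(tmp)
print(unpaired)  # ['tumor.imzML']
```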
Metadata form
● Fields are auto-completed
● Please fill truthfully
○ Don’t want to disclose?
Just put ‘N/A’
● Click (top right)
○ Enabled once the files have
finished uploading
Browsing Results
Annotation
knowledgebase
web app
http://annotate.metaspace2020.eu
Results and metadata are public
Datasets are not
Annotation table
Currently selected
molecule
(click to select)
MSM score
principal peak m/z
Sorting/filtering annotations
Click on column headers to sort
Add as many filters as you need
Quickly add a filter by hovering over a cell and clicking the icon
Molecule search
Search by name (partial name search)
Or by molecular formula (exact match only)
Click to edit
Details for highlighted annotation
molecule distribution
(principal isotope)
Putative metabolite entries
from the database
Visual insight into MSM score assignment
Exact m/z of each
ion image
Click and drag
to zoom
Ion images for each
isotope peak
Isotopic patterns
Blue: theoretical abundance
(at instrument resolving power)
Red: measured image intensity
Step-by-step search
Choose molecular formula database
Step-by-step search
Add dataset filter, then choose dataset(s)
Step-by-step search
Type molecule name or molecular formula
The annotations table will dynamically update
Step-by-step search
Export to CSV will save the current annotations table.
Changing the filters will change which annotations
are exported
Always export annotations for comparison together
(so they are at the same FDR)
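Once annotations for several datasets are exported together, they can be compared at a common FDR level with a few lines of Python. This is a sketch under stated assumptions: the column names (datasetName, formula, adduct, msm, fdr) and the rows are made up for illustration, so check the header of your own export before reusing it.

```python
import csv
import io


def formulas_by_dataset(csv_text: str, max_fdr: float) -> dict:
    """Map dataset name -> set of (formula, adduct) with fdr <= max_fdr."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row["fdr"]) <= max_fdr:
            result.setdefault(row["datasetName"], set()).add(
                (row["formula"], row["adduct"]))
    return result


# Made-up rows; the column names are assumptions, not the real export format
export = """datasetName,formula,adduct,msm,fdr
brain,C41H78NO7P,+K,0.95,0.05
brain,C42H82NO8P,+K,0.80,0.10
tumor,C41H78NO7P,+K,0.90,0.05
tumor,C42H82NO8P,+K,0.40,0.50
"""
sets = formulas_by_dataset(export, max_fdr=0.10)
shared = sets["brain"] & sets["tumor"]
print(shared)
```

Using one export and one FDR cut-off for both datasets keeps the comparison on equal footing.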
Results Browsing Summary
1. Choose database
2. Choose data-set
3. Add molecule filter and type ‘PC’
a. molecular class filter
4. Type ‘PC(16:0/18:0)’
a. single metabolite filter
5. Select row of table
a. single ion filter
6. Simple comparison of spatial distributions
between adducts
7. Export of annotations to csv
Also possible
● Filter by m/z
● Formula search
● Comparison across datasets
Interpretation
FDR Controlled Annotation
False Discovery Rate - the fraction of incorrect annotations
Control - request a set of annotations at a fixed estimated
FDR
Setting the level:
- Adjust the number of molecules for follow-up analysis
- When only limited numbers of molecules can be reviewed,
adjust the FDR so that fewer/greater numbers of molecules are
annotated
- Compare annotations between datasets
- A principled way of selecting molecules to compare between datasets
[Figure: true annotations and false discoveries along the MSM score axis, with an example threshold at FDR = 0.2]
FDR = nFalse / (nFalse + nTrue)
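The idea of requesting annotations at a fixed estimated FDR can be sketched with a generic target-decoy calculation: scores of real candidates (targets) and of implausible candidates (decoys) are ranked together, and the FDR at each cut-off is estimated as the ratio of decoys to targets above it. This is an illustrative sketch, not the engine's actual implementation; the function name and the toy scores are made up.

```python
def annotations_at_fdr(targets, decoys, fdr_level):
    """Target scores accepted at an estimated FDR <= fdr_level."""
    scored = sorted(
        [(s, False) for s in targets] + [(s, True) for s in decoys],
        key=lambda pair: pair[0],
        reverse=True,
    )
    n_target = n_decoy = best = 0
    for _score, is_decoy in scored:
        if is_decoy:
            n_decoy += 1
        else:
            n_target += 1
        # Estimated FDR above this cut-off: decoy hits / target hits
        if n_target and n_decoy / n_target <= fdr_level:
            best = n_target
    return sorted(targets, reverse=True)[:best]


# Toy MSM-like scores, for illustration only
targets = [0.9, 0.8, 0.7, 0.5, 0.3]
decoys = [0.6, 0.4, 0.2]
for level in (0.2, 0.25, 0.5):
    print(level, annotations_at_fdr(targets, decoys, level))
```

Raising the FDR level admits more annotations, which is the trade-off described above for choosing how many molecules to follow up.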
Choice of metabolite database
synthesized/recorded
88M CAS registry
biologically occurring/active
50M PubChem compounds
single biological system
40K HMDB
sample specific
1K LC-MS
Choice of metabolite database
Impacts search and False-Discovery-Rate estimation
● Use one that’s relevant
● Larger database
○ more false-hits --> fewer annotations at a fixed FDR
● Different databases give different annotations
○ even for molecules in both databases due to FDR control
○ for data-set comparison, use the same database
Annotating at level of molecular formula
● Possibility of multiple metabolites per sum formula
○ webapp shows all hits from the database search (learn the ambiguity!)
○ other databases can be searched (e.g. PubChem)
○ use enrichment analysis to get biological leads
● Use an orthogonal technique for reporting individual metabolites
○ not directly integrated (yet)
○ use web-app results to help target MS/MS studies (e.g. purchase of standards)
Reporting Results
● The METASPACE platform putatively annotates* molecular formulas along with several
candidate metabolites
● A set of annotations should be reported along with the FDR threshold selected.
○ e.g. “Molecular annotation was performed using the METASPACE annotation
engine (Palmer et al, Nature Methods 2016). 150 molecules were annotated
against the LipidMAPS database at 10% FDR. Results are publicly available at
annotate.metaspace2020.eu”
● The export function of the website delivers a spreadsheet that can be included as
supporting information for any publication.
* Metabolomics Standards Initiative identification levels, Sumner et al, 2007, Metabolomics
Learning Summary
● Preparing data for submission
○ imzML export
○ metadata
● Submitting data
○ web-app, upload
● Browsing knowledgebase
○ web-app, annotations
How to get help?
● METASPACE team:
○ web: metaspace2020.eu
○ email: contact@metaspace2020.eu
○ twitter: @metaspace2020
○ source code: https://github.com/METASPACE2020/
● FTICR data conversion
○ SCiLS: support@scils.de
● Orbitrap data conversion
imzML Export
Export into imzML: FT-ICR data
Using SCiLS Lab’s METASPACE export
Export to METASPACE
● Export your centroided high-resolution spectra in the imzML format
● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b
● Best results in METASPACE if peak list is required for centroiding
● Two different Bruker data formats
○ SQLite peak list data: Peak list provided during import
○ FT-ICR profile data: Generate a peak list after import
Create imzML file for METASPACE
● In the objects tab, click the export symbol
of the region to be exported and select
“Export to METASPACE”
● The Export Spectra dialog opens
● Set your normalization of choice
● Select your peak list of choice
for example “Imported Peaks” in case of
SQLite
● Provide your scan polarity
● Click OK to save imzML file
SQLite peak list data
● Data must have been acquired with on-the-fly
centroid detection
i.e. there is a file called ‘peaks.sqlite’ within the .d
folder of the data-set
● In SCiLS Lab a peak list “Imported peaks” is
available, selecting most frequent peaks
By default all peaks appearing more frequently
than 1% of spectra
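The frequency criterion described above (keep peaks that appear in more than 1% of spectra) can be illustrated with a short sketch. It is not how SCiLS Lab builds its peak list; rounding m/z values to a fixed number of decimals is a crude stand-in for proper peak binning, and the function name and demo data are made up.

```python
from collections import Counter


def frequent_peaks(spectra, min_fraction=0.01, decimals=3):
    """m/z bins present in more than min_fraction of the spectra."""
    counts = Counter()
    for mzs in spectra:
        # Count each bin at most once per spectrum
        counts.update({round(mz, decimals) for mz in mzs})
    cutoff = min_fraction * len(spectra)
    return sorted(mz for mz, n in counts.items() if n > cutoff)


# 200 toy spectra: m/z 100 in all, m/z 200 in three, m/z 300 in two
spectra = [[100.0]] * 195 + [[100.0, 200.0]] * 3 + [[100.0, 300.0]] * 2
print(frequent_peaks(spectra))  # [100.0, 200.0]
```

The peak at m/z 300 occurs in exactly 1% of the spectra and is therefore dropped, since the criterion is strictly "more than".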
FT-ICR profile data
● Older Solarix Files do not directly contain a peak
list to perform centroiding
● Create peak list with Data Analysis
SCiLS Lab Help Section 7.4
● Use METASPACE tool for peak finding
https://spatialmetabolomics.github.io/centroidize/
● Use other external tools (mMass, …)
● Import the external peak list into SCiLS Lab
File > Import > m/z intervals from CSV or Clipboard
Use METASPACE tool for peak finding
● Select the overview spectrum CSV exported from SCiLS
● Upload CSV file to METASPACE tool
● Copy values to clipboard
● Use File > Import > m/z intervals from CSV
Upload imzML files to METASPACE
● Go to http://annotate.metaspace2020.eu/#/upload
Export into imzML: Orbitrap data (.raw)
Instructions: metaspace2020.eu/imzML
Software tools:
imageQuest / raw-converter
- Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-)
imzMLConverter
- Recommended for: DESI/flowProbe with separate files per row
Recommended for bioinformaticians: pyimzML (Python parser)
.raw -> imzML
● Commercial
○ Thermo Scientific imageQuest
Raw-imzml converter
.raw -> imzML
● Free
● http://ms-imaging.org/wp/raw-to-imzml-converter/
.raw -> mzML -> imzML
● MSConvert
○ free (link)
● imzMLConverter
○ free
○ requires registration
○ http://www.cs.bham.ac.uk/~ibs/imzMLConverter
Export into imzML: Generic
This project has received funding from the European Union’s Horizon 2020 research and
innovation programme under grant agreement № 634402.
Acknowledgments
Example data provided by:
University of Rennes 1
Regis Lavigne
Charles Pineau
EMBL
Ksenija Radic
Alexandra Koumoutsi
Andrew Palmer
Imperial College London
James McKenzie
Zoltan Takats
METASPACE R&D team at EMBL
Theodore Alexandrov
Vitaly Kovalev
Artem Tarasov
Andrew Palmer
Dominik Fay
SCiLS imzML export
Dennis Trede
Jan Hendrik Kobarg

ARCHIVED: new version available - METASPACE Step by Step guide

  • 1. METASPACE training guide 2017 Theodore Alexandrov (EMBL, UCSD) Andy Palmer (EMBL) Vitaly Kovalev (EMBL) Artem Tarasov (EMBL) @METASPACE2020
  • 3. Part 1: Introduction Learning Outcomes Introduction into the METASPACE project Metabolite annotation in HR imaging MS (Bioinformatics) Overview of the annotation engine Part 2: METASPACE Annotation Platform Tutorial Learning Outcomes Data requirements Data Submission Annotation Browsing Interpretation Part 3: Export to imzML FTICR (Bruker) Orbitrap (Thermo) Other vendors Training Overview
  • 5. What we hope you will learn today ● Ins and outs of metabolite annotation in HR imaging MS ● Bioinformatics we developed for this problem ○ Metabolite Signal Match (MSM) score ○ False Discovery Rate estimation ○ FDR-controlled annotation ● The online engine we implemented ○ How to prepare data for submission to our service ○ How to submit your data
  • 6. Project overview: slides on slideshare Bioinformatics: slides on slideshare
  • 7. Overview of the annotation engine
  • 8. Outline ● Inputs (data and metadata) ● Online Software ● Data Submission ● Annotation Browsing ● Use Cases a. mouse brain, MALDI-FTICR (U Rennes 1) b. human colorectal tumor, DESI-Orbitrap (ICL)
  • 13. Example 1 Mouse brain (MALDI-FTICR) Data provided by Regis Lavigne, Charles Pineau, University of Rennes 1 Select an annotation See the molecular distribution
  • 14. Example 2 Human colorectal tumor (DESI-Orbitrap) Data provided by James McKenzie, Zoltan Takats, Imperial College London Filter different datasets See data details View metadata
  • 15. Tutorial on how to use METASPACE engine and molecular knowledgebase
  • 16. Learning Outcomes 1. Preparing data for submission 2. Submitting data 3. Browsing results
  • 17. Data Requirements Imaging mass spectrometry data - Any ionisation source - Any spatial resolution - Any tissue - One section per dataset
  • 18. Data Requirements High resolving power RFWHM(@400) > 90K Well-calibrated ideally < 3 ppm
  • 19. Data Requirements Data Format - imzML Centroided - vendor preferred - http://metaspace2020.eu/imzml http://imzml.org/wp/introduction/
  • 20. Customised Processing Processing is tailored to your data! - Technical metadata - Resolving power - isotope prediction - Polarity - adducts R200=70K R200=280K [C41H78NO7P+K]+
  • 21. Data Requirements Your responsibility: - Data is processed ‘as is’ - Check metadata is correct - Report resolving power accurately (check within data-set) - Low numbers of annotations often correspond to poor quality mass spectra - Calibration inaccuracy - Lock-mass errors
  • 23. 1. Follow conversion instructions for your instrument 2. Select the centroided files, .imzML and .ibd 3. The dataset will be copied to the cloud storage (accessible only to our team) Data upload
  • 24. Metadata form ● Fields are auto-completed ● Please fill truthfully ○ Don’t want to disclose? Just put ‘N/A’ ● Click (top right) ○ Enabled once the files finished uploading
  • 27. Annotation table Currently selected molecule (click to select) MSM score principal peak m/z
  • 28. Sorting/filtering annotations Click on column headers to sort Add as many filters as you need Quickly add a filter by hovering over a cell and clicking the icon
  • 29. Molecule search Search by name (partial name search) Or by molecular formula (exact match only) Click to edit
  • 30. Details for highlighted annotation molecule distribution (principal isotope) Putative metabolite entries from the database
  • 31. Visual insight into MSM score assignment Exact m/z of each ion image Click and drag to zoom Ion images for each isotope peak Isotopic patterns Blue: theoretical abundance (at instrument resolving power) Red: measured image intensity
  • 33. Step-by-step search Add dataset filter, then choose dataset(s)
  • 34. Step-by-step search Type molecule name or molecular formula The annotations table will dynamically update
  • 35. Step-by-step search Export to CSV will save the current annotations table. Changing the filters will change which annotations are exported Always export annotations for comparison together (so they are at the same FDR)
  • 36. Results Browsing Summary 1. Choose database 2. Choose data-set 3. Add molecule filter and type ‘PC’ a. molecular class filter 4. Type ‘PC(16:0/18:0) a. single metabolite filter 5. Select row of table a. single ion filter 6. Simple comparison of spatial distributions between adducts 7. Export of annotations to csv Also possible ● Filter by m/z ● Formula search ● Comparison across datasets
  • 38. FDR Controlled Annotation False Discovery Rate - the fraction of incorrect annotations Control - request a set of annotations at a fixed estimated FDR Setting the level: - Adjust the number of molecules for follow-up analysis - When only limited numbers of molecules can be reviewed, adjust the FDR so that fewer/greater numbers of molecules are annotated - Compare annotations between datasets - A principled way of selecting molecules to compare between [Figure: true annotations and false discoveries ranked by MSM score, with a threshold at FDR = 0.2] FDR = nFalse / (nFalse + nTrue)
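To make the definition concrete: with FDR taken as the fraction of incorrect annotations, FDR = nFalse / (nFalse + nTrue). The toy sketch below computes this for annotations above an MSM score threshold. It assumes the true/false label of each annotation is known, which is never the case for real data, where the FDR has to be estimated; the scores and labels are made up.

```python
def fdr(n_true, n_false):
    """Fraction of incorrect annotations among all annotations."""
    total = n_true + n_false
    if total == 0:
        raise ValueError("no annotations selected")
    return n_false / total


def fdr_above_threshold(scored, threshold):
    """FDR of the annotations whose score is >= threshold.

    scored: list of (msm_score, is_true) pairs with known labels (toy data).
    """
    selected = [is_true for score, is_true in scored if score >= threshold]
    n_true = sum(selected)
    return fdr(n_true, len(selected) - n_true)


# Toy data: four true annotations and one false discovery score 0.5 or higher.
toy = [(0.9, True), (0.8, True), (0.7, True),
       (0.6, False), (0.5, True), (0.2, False)]

print(fdr_above_threshold(toy, 0.5))  # 1 false out of 5 selected -> 0.2
```

Raising the threshold to 0.7 keeps three annotations, all true, so the FDR drops to 0 at the cost of fewer annotations.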
  • 39. Choice of metabolite database synthesized/recorded 88M CAS registry biologically occurring/active 50M PubChem compounds single biological system 40K HMDB sample specific 1K LC-MS
  • 40. Choice of metabolite database Impacts search and False-Discovery-Rate estimation ● Use one that’s relevant ● Larger database ○ more false-hits --> fewer annotations at a fixed FDR ● Different databases give different annotations ○ even for molecules in both databases due to FDR control ○ for data-set comparison, use the same database
  • 41. Annotating at the level of molecular formula ● Possibility of multiple metabolites per sum formula ○ webapp shows all hits from the database search (learn the ambiguity!) ○ other databases can be searched (e.g. PubChem) ○ use enrichment analysis to get biological leads ● Use an orthogonal technique for reporting individual metabolites ○ not directly integrated (yet) ○ use web-app results to help target MS/MS studies (e.g. purchase of standards)
  • 42. ● The METASPACE platform putatively annotates* molecular formulae along with several candidate metabolites ● A set of annotations should be reported along with the FDR threshold selected. ○ e.g. “Molecular annotation was performed using the METASPACE annotation engine (Palmer et al, Nature Methods 2016). 150 molecules were annotated against the LipidMAPS database at 10% FDR. Results are publicly available at annotate.metaspace2020.eu” ● The export function of the website delivers a spreadsheet that can be included as supporting information for any publication. Reporting Results * Metabolomics Standards Initiative identification levels, Sumner et al, 2007, Metabolomics
  • 43. ● Preparing data for submission ○ imzML export ○ metadata ● Submitting data ○ web-app, upload ● Browsing knowledgebase ○ web-app, annotations Learning Summary ● METASPACE team: ○ web: metaspace2020.eu ○ email: contact@metaspace2020.eu ○ twitter: @metaspace2020 ○ source code: https://github.com/METASPACE2020/ ● FTICR data conversion ○ SCiLS: support@scils.de ● Orbitrap data conversion How to get help?
  • 45. Export into imzML: FT-ICR data Using SCiLS Lab’s METASPACE export
  • 46. Export to METASPACE ● Export your centroided high-resolution spectra in the imzML format ● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b ● Best results in METASPACE if peak list is required for centroiding ● Two different Bruker data formats ○ SQLite peak list data: Peak list provided during import ○ FT-ICR profile data: Generate a peak list after import
  • 47. Create imzML file for METASPACE ● In the objects tab, click the export symbol of the region to be exported and select “Export to METASPACE” ● The Export Spectra dialog opens ● Set your normalization of choice ● Select your peak list of choice for example “Imported Peaks” in case of SQLite ● Provide your scan polarity ● Click OK to save imzML file
  • 48. SQLite peak list data ● Data must have been acquired with on-the-fly centroid detection, i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set ● In SCiLS Lab a peak list “Imported peaks” is available, selecting the most frequent peaks. By default, all peaks appearing in more than 1% of spectra
  • 49. FT-ICR profile data ● Older Solarix Files do not directly contain a peak list to perform centroiding ● Create peak list with Data Analysis SCiLS Lab Help Section 7.4 ● Use METASPACE tool for peak finding https://spatialmetabolomics.github.io/centroidize/ ● Use other external tools (mMass, …) ● Import the external peak list into SCiLS Lab File > Import > m/z intervals from CSV or Clipboard
  • 50. Use METASPACE tool for peak finding ● Select the overview spectrum CSV exported from SCiLS ● Upload CSV file to METASPACE tool ● Copy values to clipboard ● Use File > Import > m/z intervals from CSV
  • 51. Upload imzML files to METASPACE ● Go to http://annotate.metaspace2020.eu/#/upload
  • 52. Export into imzML: Orbitrap data (.raw) Instructions: metaspace2020.eu/imzML Software tools: imageQuest / raw-converter - Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-) imzmlConverter - Recommended for: DESI/flowProbe with separate files per row Recommended for bioinformaticians: pyimzML (Python parser)
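For those who want to inspect a converted file before uploading, pyimzML can read the spectra back. The sketch below is one possible way to do this, not an official recipe. It uses `ImzMLParser`, its `coordinates` list and `getspectrum(index)` from pyimzML; the helper names `iter_spectra` and `summarize` are hypothetical, and the import is placed inside the function so the summary helper works without the package installed.

```python
def iter_spectra(imzml_path):
    """Yield (coordinates, mzs, intensities) for each spectrum in an imzML file.

    Requires the third-party pyimzml package (pip install pyimzml).
    """
    from pyimzml.ImzMLParser import ImzMLParser

    parser = ImzMLParser(imzml_path)
    for idx, coords in enumerate(parser.coordinates):
        mzs, intensities = parser.getspectrum(idx)
        yield coords, mzs, intensities


def summarize(spectra):
    """Count spectra and peaks and report the overall m/z range."""
    n_spectra = 0
    n_peaks = 0
    mz_min = None
    mz_max = None
    for _coords, mzs, _intensities in spectra:
        n_spectra += 1
        n_peaks += len(mzs)
        if len(mzs):
            lo, hi = float(min(mzs)), float(max(mzs))
            mz_min = lo if mz_min is None else min(mz_min, lo)
            mz_max = hi if mz_max is None else max(mz_max, hi)
    return {"n_spectra": n_spectra, "n_peaks": n_peaks,
            "mz_min": mz_min, "mz_max": mz_max}
```

A typical call would be `summarize(iter_spectra("sample.imzML"))`, where the file name is a placeholder. An unexpectedly large number of peaks per spectrum may hint that the file holds profile rather than centroided data, although this sketch does not test for that.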
  • 53. .raw -> imzML ● Commercial ○ Thermo Scientific imageQuest
  • 54. Raw-imzml converter .raw -> imzML ● Free ● http://ms-imaging.org/wp/raw-to-imzml-converter/
  • 55. .raw -> mzML -> imzML ● MSConvert ○ free (link) ● imzMLConverter ○ free ○ requires registration ○ http://www.cs.bham.ac.uk/~ibs/imzMLConverter Export into imzML: Generic
  • 56. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement № 634402. Acknowledgments Example data provided by: University of Rennes 1 Regis Lavigne Charles Pineau EMBL Ksenija Radic Alexandra Koumoutsi Andrew Palmer Imperial College London James McKenzie Zoltan Takats METASPACE R&D team at EMBL Theodore Alexandrov Vitaly Kovalev Artem Tarasov Andrew Palmer Dominik Fay SCiLS imzML export Dennis Trede Jan Hendrik Kobarg