METASPACE training guide
2017
Theodore Alexandrov (EMBL, UCSD)
Andy Palmer (EMBL)
Vitaly Kovalev (EMBL)
Artem Tarasov (EMBL)
@METASPACE2020
Welcome everyone!
Artem Tarasov, “hacker”
Andy Palmer, “scientist”
Vitaly Kovalev, “developer”
Theodore Alexandrov, “leader”
Training Overview
Part 1: Introduction
Learning Outcomes
● Introduction to the METASPACE project
● Metabolite annotation in HR imaging MS (Bioinformatics)
● Overview of the annotation engine
Part 2: METASPACE Annotation Platform Tutorial
Learning Outcomes
● Data requirements
● Data Submission
● Annotation Browsing
● Interpretation
Part 3: Export to imzML
● FTICR (Bruker)
● Orbitrap (Thermo)
● Other vendors
Introduction
What we hope you will learn today
● Ins and outs of metabolite annotation in HR imaging MS
● Bioinformatics we developed for this problem
○ Metabolite Signal Match (MSM) score
○ False Discovery Rate estimation
○ FDR-controlled annotation
● The online engine we implemented
○ How to prepare data for submission to our service
○ How to submit your data
Project overview: slides on slideshare
Bioinformatics: slides on slideshare
Overview of the annotation engine
Outline
● Inputs (data and metadata)
● Online Software
● Data Submission
● Annotation Browsing
● Use Cases
a. mouse brain, MALDI-FTICR (U Rennes 1)
b. human colorectal tumor, DESI-Orbitrap (ICL)
The METASPACE platform
http://annotate.metaspace2020.eu
[Diagram labels: imzML, metadata; upload; annotations database; task scheduler; explore annotations; 10 minutes; engine; Amazon Cloud]
Upload: Data
centroided imzML
http://ms-imaging.org/
Upload: Metadata
sample information
acquisition details
Upload: Interface annotate.metaspace2020.eu/#/upload
metadata collection
data upload
Example 1
Mouse brain
(MALDI-FTICR)
Data provided by
Regis Lavigne, Charles Pineau,
University of Rennes 1
Select an annotation
See the molecular
distribution
Example 2
Human colorectal tumor
(DESI-Orbitrap)
Data provided by
James McKenzie, Zoltan Takats,
Imperial College London
Filter different datasets
See data details
View metadata
Tutorial on how to use
METASPACE engine and molecular knowledgebase
Learning Outcomes
1. Preparing data for submission
2. Submitting data
3. Browsing results
Data Requirements
Imaging mass spectrometry data
- Any ionisation source
- Any spatial resolution
- Any tissue
- One section per dataset
Data Requirements
High resolving power: R_FWHM (at m/z 400) > 90K
Well-calibrated: ideally < 3 ppm
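The two thresholds above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the usual definitions: resolving power is m/z divided by the peak's full width at half maximum, and mass error in ppm is the relative deviation times one million. The function names and the example numbers are hypothetical.

```python
def resolving_power(mz: float, fwhm: float) -> float:
    """Resolving power R = m/z divided by the peak's full width at half maximum."""
    return mz / fwhm


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def meets_requirements(r_at_400: float, abs_ppm: float) -> bool:
    """Thresholds taken from the slide: R(@400) > 90K and ideally < 3 ppm."""
    return r_at_400 > 90_000 and abs_ppm < 3.0


# Hypothetical peak near m/z 400 that is 0.004 wide at half maximum
r = resolving_power(400.0, 0.004)
# Hypothetical measurement 0.0008 above the theoretical m/z
err = ppm_error(400.0008, 400.0)
print(round(r), round(err, 2), meets_requirements(r, abs(err)))
```

Checking a few known peaks this way before upload is a quick test of whether a dataset is likely to meet the requirements.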
Data Requirements
Data Format
- imzML
Centroided
- vendor preferred
- http://metaspace2020.eu/imzml
http://imzml.org/wp/introduction/
Customised Processing
Processing is tailored to your data!
- Technical metadata
- Resolving power
- isotope prediction
- Polarity
- adducts
R200=70K R200=280K
[C41H78NO7P+K]+
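To illustrate why resolving power and adduct matter for isotope prediction, the sketch below computes the monoisotopic m/z of the example ion [C41H78NO7P+K]+ and the expected peak width at the two resolving powers shown. The scaling of resolving power with m/z is an assumption of this sketch (an exponent of about 0.5 is commonly quoted for Orbitrap and about 1 for FT-ICR instruments), and all function names are hypothetical. It is not the engine's own isotope-pattern code.

```python
import re

# Monoisotopic masses (u) of the elements used here, and the electron mass
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
    "K": 38.9637064,
}
ELECTRON = 0.00054858


def formula_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C41H78NO7P'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def adduct_mz(formula: str, adduct: str) -> float:
    """m/z of the singly charged positive ion [M + adduct]+."""
    return formula_mass(formula) + formula_mass(adduct) - ELECTRON


def fwhm_at(mz: float, r200: float, exponent: float) -> float:
    """Peak width at mz, given resolving power r200 at m/z 200.

    Assumes R(mz) = r200 * (200 / mz) ** exponent; the exponent is an
    assumption of this sketch, not a value taken from the slides.
    """
    r = r200 * (200.0 / mz) ** exponent
    return mz / r


mz = adduct_mz("C41H78NO7P", "K")
for r200 in (70_000, 280_000):
    print(f"R200={r200}: m/z {mz:.4f}, FWHM {fwhm_at(mz, r200, 0.5):.4f}")
```

The higher resolving power gives a narrower peak, so closely spaced isotope peaks that merge at R200=70K can separate at R200=280K. This is why the reported resolving power needs to be accurate.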
Data Requirements
Your responsibility:
- Data is processed ‘as is’
- Check metadata is correct
- Report resolving power accurately (check within data-set)
- Low numbers of annotations often correspond to poor quality mass spectra
- Calibration inaccuracy
- Lock-mass errors
Data Submission
1. Follow conversion instructions for your instrument
2. Select the centroided files, .imzML and .ibd
3. The dataset will be copied to the cloud storage
(accessible only to our team)
Data upload
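Before uploading, it can help to confirm that the .imzML and .ibd files come as a pair and that the .imzML header declares centroided spectra. The sketch below does this with the Python standard library only. It relies on the PSI-MS controlled vocabulary accessions MS:1000127 (centroid spectrum) and MS:1000128 (profile spectrum); the function name is hypothetical, and reading the whole file as text is a simplification that may be slow for very large datasets.

```python
import os
import tempfile

# PSI-MS controlled vocabulary accessions used in imzML/mzML headers
CENTROID = "MS:1000127"  # centroid spectrum
PROFILE = "MS:1000128"   # profile spectrum


def check_upload_pair(imzml_path: str) -> list:
    """Return a list of problems found; an empty list means the pair looks fine."""
    problems = []
    base, ext = os.path.splitext(imzml_path)
    if ext.lower() != ".imzml":
        problems.append("expected a .imzML file")
    if not os.path.exists(base + ".ibd"):
        problems.append("missing matching .ibd file")
    # Simplification: scan the whole XML text for the accession strings
    with open(imzml_path, encoding="utf-8", errors="replace") as handle:
        text = handle.read()
    if PROFILE in text:
        problems.append("file declares profile spectra")
    if CENTROID not in text:
        problems.append("file does not declare centroid spectra")
    return problems


# Demonstration on a made-up pair of files in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    imzml = os.path.join(folder, "example.imzML")
    with open(imzml, "w", encoding="utf-8") as handle:
        handle.write('<cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum"/>')
    open(os.path.join(folder, "example.ibd"), "wb").close()
    print(check_upload_pair(imzml))
```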
Metadata form
● Fields are auto-completed
● Please fill in truthfully
○ Don’t want to disclose? Just put ‘N/A’
● Click (top right)
○ Enabled once the files have finished uploading
Browsing Results
Annotation
knowledgebase
web app
http://annotate.metaspace2020.eu
Results and metadata are public
Datasets are not
Annotation table
Currently selected
molecule
(click to select)
MSM score
principal peak m/z
Sorting/filtering annotations
Click on column headers to sort
Add as many filters as you need
Quickly add a filter by hovering over a cell and clicking the icon
Molecule search
Search by name (partial name search)
Or by molecular formula (exact match only)
Click to edit
Details for highlighted annotation
molecule distribution
(principal isotope)
Putative metabolite entries
from the database
Visual insight into MSM score assignment
Exact m/z of each
ion image
Click and drag
to zoom
Ion images for each
isotope peak
Isotopic patterns
Blue: theoretical abundance
(at instrument resolving power)
Red: measured image intensity
Step-by-step search
Choose molecular formula database
Step-by-step search
Add dataset filter, then choose dataset(s)
Step-by-step search
Type molecule name or molecular formula
The annotations table will dynamically update
Step-by-step search
Export to CSV will save the current annotations table.
Changing the filters will change which annotations
are exported
Always export annotations for comparison together
(so they are at the same FDR)
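Once annotations have been exported, comparing datasets at the same FDR is a set operation on the CSV rows. The sketch below is only an illustration: the column names (formula, adduct, msm, fdr) and the rows are made up for this example and may not match the real export, so adjust them to the header of the downloaded file.

```python
import csv
import io


def annotations_at_fdr(csv_text: str, max_fdr: float) -> set:
    """Return {(formula, adduct)} for rows whose FDR is at or below max_fdr."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {(r["formula"], r["adduct"]) for r in rows if float(r["fdr"]) <= max_fdr}


# Two tiny made-up exports; the column names are assumptions of this sketch
dataset_a = """formula,adduct,msm,fdr
C41H78NO7P,+K,0.95,0.05
C42H82NO8P,+H,0.80,0.10
C37H71O8P,+Na,0.40,0.50
"""
dataset_b = """formula,adduct,msm,fdr
C41H78NO7P,+K,0.90,0.05
C37H71O8P,+Na,0.70,0.10
"""

a = annotations_at_fdr(dataset_a, 0.1)
b = annotations_at_fdr(dataset_b, 0.1)
print("shared:", sorted(a & b))
print("only in A:", sorted(a - b))
print("only in B:", sorted(b - a))
```

Using one FDR threshold for both sets keeps the comparison on equal footing, which is the point of exporting the annotations together.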
Results Browsing Summary
1. Choose database
2. Choose data-set
3. Add molecule filter and type ‘PC’
a. molecular class filter
4. Type ‘PC(16:0/18:0)’
a. single metabolite filter
5. Select row of table
a. single ion filter
6. Simple comparison of spatial distributions
between adducts
7. Export of annotations to csv
Also possible
● Filter by m/z
● Formula search
● Comparison across datasets
Interpretation
FDR Controlled Annotation
False Discovery Rate - the fraction of incorrect annotations
Control - request a set of annotations at a fixed estimated FDR
Setting the level:
- Adjust the number of molecules for follow-up analysis
- When only limited numbers of molecules can be reviewed, adjust the FDR so that fewer/greater numbers of molecules are annotated
- Compare annotations between datasets
- A principled way of selecting molecules to compare between datasets
[Figure labels: True annotation; False discovery; MSM score; FDR = 0.2]
FDR = nFalse / (nFalse + nTrue)
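Since the FDR is the fraction of incorrect annotations, nFalse / (nFalse + nTrue), requesting annotations at a fixed FDR amounts to taking the largest top-scoring set that stays within that fraction. The sketch below shows this on made-up scores where the true/false status is known. In a real run that status is unknown and the FDR has to be estimated, so this is an illustration of the idea rather than the engine's procedure; the function names are hypothetical.

```python
def fdr(n_false: int, n_true: int) -> float:
    """FDR = nFalse / (nFalse + nTrue); defined as 0 for an empty set."""
    total = n_false + n_true
    return n_false / total if total else 0.0


def annotations_at_level(scored, level: float):
    """Return the largest top-scoring set whose FDR does not exceed level.

    scored is a list of (msm_score, is_false) pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    best, n_false = 0, 0
    for k, (_, is_false) in enumerate(ranked, start=1):
        n_false += is_false
        if fdr(n_false, k - n_false) <= level:
            best = k
    return ranked[:best]


# Made-up scores: (MSM score, is this a false discovery?)
example = [(0.98, False), (0.95, False), (0.91, False), (0.90, True),
           (0.85, False), (0.70, True), (0.65, True), (0.60, False)]

for level in (0.1, 0.2, 0.5):
    print(level, len(annotations_at_level(example, level)))
```

Raising the FDR level returns more annotations and lowering it returns fewer, which is how the level can be used to size the list for follow-up analysis.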
Choice of metabolite database
synthesized/recorded: 88M, CAS registry
biologically occurring/active: 50M, PubChem compounds
single biological system: 40K, HMDB
sample specific: 1K, LC-MS
Choice of metabolite database
Impacts search and False-Discovery-Rate estimation
● Use one that’s relevant
● Larger database
○ more false-hits --> fewer annotations at a fixed FDR
● Different databases give different annotations
○ even for molecules in both databases due to FDR control
○ for data-set comparison, use the same database
Annotating at level of molecular formula
● Possibility of multiple metabolites per sum formula
○ webapp shows all hits from the database search (learn the ambiguity!)
○ other databases can be searched (e.g. PubChem)
○ use enrichment analysis to get biological leads
● Use an orthogonal technique for reporting individual metabolites
○ not directly integrated (yet)
○ use web-app results to help target MS/MS studies (e.g. purchase of standards)
Reporting Results
● The METASPACE platform putatively annotates* molecular formulas along with several candidate metabolites
● A set of annotations should be reported along with the FDR threshold selected.
○ e.g. “Molecular annotation was performed using the METASPACE annotation engine (Palmer et al, Nature Methods 2016). 150 molecules were annotated against the LipidMAPS database at 10% FDR. Results are publicly available at annotate.metaspace2020.eu”
● The export function of the website delivers a spreadsheet that can be included as supporting information for any publication.
* Metabolomics Standards Initiative identification levels, Sumner et al, 2007, Metabolomics
● Preparing data for submission
○ imzML export
○ metadata
● Submitting data
○ web-app, upload
● Browsing knowledgebase
○ web-app, annotations
Learning Summary
How to get help?
● METASPACE team:
○ web: metaspace2020.eu
○ email: contact@metaspace2020.eu
○ twitter: @metaspace2020
○ source code: https://github.com/METASPACE2020/
● FTICR data conversion
○ SCiLS: support@scils.de
● Orbitrap data conversion
imzML Export
Export into imzML: FT-ICR data
Using SCiLS Lab’s METASPACE export
Export to METASPACE
● Export your centroided high-resolution spectra in the imzML format
● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b
● Best results in METASPACE are obtained when a peak list is used for centroiding
● Two different Bruker data formats
○ SQLite peak list data: Peak list provided during import
○ FT-ICR profile data: Generate a peak list after import
Create imzML file for METASPACE
● In the objects tab, click the export symbol of the region to be exported and select “Export to METASPACE”
● The Export Spectra dialog opens
● Set your normalization of choice
● Select your peak list of choice, for example “Imported Peaks” in case of SQLite
● Provide your scan polarity
● Click OK to save imzML file
SQLite peak list data
● Data must have been acquired with on-the-fly centroid detection,
i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set
● In SCiLS Lab a peak list “Imported peaks” is available, selecting the most frequent peaks
By default, all peaks appearing in more than 1% of spectra are selected
FT-ICR profile data
● Older Solarix files do not directly contain a peak list to perform centroiding
● Create peak list with Data Analysis
SCiLS Lab Help Section 7.4
● Use METASPACE tool for peak finding
https://spatialmetabolomics.github.io/centroidize/
● Use other external tools (mMass, …)
● Import the external peak list into SCiLS Lab
File > Import > m/z intervals from CSV or Clipboard
Use METASPACE tool for peak finding
● Select the overview spectrum CSV exported from SCiLS
● Upload CSV file to METASPACE tool
● Copy values to clipboard
● Use File > Import > m/z intervals from CSV
Upload imzML files to METASPACE
● Go to http://annotate.metaspace2020.eu/#/upload
Export into imzML: Orbitrap data (.raw)
Instructions: metaspace2020.eu/imzML
Software tools:
imageQuest / raw-converter
- Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-)
imzmlConverter
- Recommended for: DESI/flowProbe with separate files per row
Recommended for bioinformaticians: pyimzML (Python parser)
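For readers taking the pyimzML route, the sketch below sums centroid intensities around one m/z value for every pixel. It relies only on a `coordinates` list and a `getspectrum(index)` method returning (mzs, intensities), as pyimzML's `ImzMLParser` provides. The helper names, the stand-in parser and its two pixels are made up so the example runs without a data file.

```python
def open_imzml(path):
    """Open an imzML file with pyimzML (third-party: pip install pyimzML)."""
    from pyimzml.ImzMLParser import ImzMLParser
    return ImzMLParser(path)


def ion_intensities(parser, mz: float, ppm: float) -> dict:
    """Sum centroid intensities within +/- ppm of mz for every pixel.

    parser needs a `coordinates` list and a `getspectrum(index)` method
    returning (mzs, intensities).
    """
    tol = mz * ppm * 1e-6
    image = {}
    for index, coords in enumerate(parser.coordinates):
        mzs, intensities = parser.getspectrum(index)
        image[(coords[0], coords[1])] = sum(
            i for m, i in zip(mzs, intensities) if abs(m - mz) <= tol
        )
    return image


class FakeParser:
    """Stand-in with two made-up pixels so the sketch runs without a file."""
    coordinates = [(1, 1, 1), (2, 1, 1)]
    spectra = [([766.5147, 782.5670], [10.0, 4.0]), ([766.5150, 800.0], [7.0, 1.0])]

    def getspectrum(self, index):
        return self.spectra[index]


print(ion_intensities(FakeParser(), 766.5147, ppm=3.0))
```

With a real dataset the call would look like `ion_intensities(open_imzml("dataset.imzML"), 766.5147, ppm=3.0)`, where the file name is a placeholder.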
.raw -> imzML
● Commercial
○ Thermo Scientific imageQuest
Raw-imzml converter
.raw -> imzML
● Free
● http://ms-imaging.org/wp/raw-to-imzml-converter/
.raw -> mzML -> imzML
● MSConvert
○ free (link)
● imzMLConverter
○ free
○ requires registration
○ http://www.cs.bham.ac.uk/~ibs/imzMLConverter
Export into imzML: Generic
This project has received funding from the European Union’s Horizon 2020 research and
innovation programme under grant agreement № 634402.
Acknowledgments
Example data provided by:
University of Rennes 1
Regis Lavigne
Charles Pineau
EMBL
Ksenija Radic
Alexandra Koumoutsi
Andrew Palmer
Imperial College London
James McKenzie
Zoltan Takats
METASPACE R&D team at EMBL
Theodore Alexandrov
Vitaly Kovalev
Artem Tarasov
Andrew Palmer
Dominik Fay
SCiLS imzML export
Dennis Trede
Jan Hendrik Kobarg

More Related Content

Similar to ARCHIVED: new version available - METASPACE Step by Step guide

Pittcon06 Auto Chrom
Pittcon06 Auto ChromPittcon06 Auto Chrom
Pittcon06 Auto Chromniharaina
 
Machine learning Experiments report
Machine learning Experiments report Machine learning Experiments report
Machine learning Experiments report AlmkdadAli
 
Nose Dive into Apache Spark ML
Nose Dive into Apache Spark MLNose Dive into Apache Spark ML
Nose Dive into Apache Spark MLAhmet Bulut
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
 
Integrative information management for systems biology
Integrative information management for systems biologyIntegrative information management for systems biology
Integrative information management for systems biologyNeil Swainston
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Anubhav Jain
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET Journal
 
Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-stepsShesha R
 
Presentation
PresentationPresentation
Presentationbutest
 
Metabolic network mapping for metabolomics
Metabolic network mapping for metabolomicsMetabolic network mapping for metabolomics
Metabolic network mapping for metabolomicsDinesh Barupal
 
USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem
USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem
USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem ChemAxon
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkIvo Andreev
 
NEURAL Network Design Training
NEURAL Network Design  TrainingNEURAL Network Design  Training
NEURAL Network Design TrainingESCOM
 
Metadata Matters! What it is and How to Manage it
Metadata Matters! What it is and How to Manage itMetadata Matters! What it is and How to Manage it
Metadata Matters! What it is and How to Manage itSafe Software
 
Machine Learning - Simple Linear Regression
Machine Learning - Simple Linear RegressionMachine Learning - Simple Linear Regression
Machine Learning - Simple Linear RegressionSiddharth Shrivastava
 
AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONIRJET Journal
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 

Similar to ARCHIVED: new version available - METASPACE Step by Step guide (20)

ifip2008albashiri.pdf
ifip2008albashiri.pdfifip2008albashiri.pdf
ifip2008albashiri.pdf
 
Pittcon06 Auto Chrom
Pittcon06 Auto ChromPittcon06 Auto Chrom
Pittcon06 Auto Chrom
 
Machine learning Experiments report
Machine learning Experiments report Machine learning Experiments report
Machine learning Experiments report
 
Nose Dive into Apache Spark ML
Nose Dive into Apache Spark MLNose Dive into Apache Spark ML
Nose Dive into Apache Spark ML
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)
 
Integrative information management for systems biology
Integrative information management for systems biologyIntegrative information management for systems biology
Integrative information management for systems biology
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
 
Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-steps
 
Raptor user manual3.0
Raptor user manual3.0Raptor user manual3.0
Raptor user manual3.0
 
Presentation
PresentationPresentation
Presentation
 
Metabolic network mapping for metabolomics
Metabolic network mapping for metabolomicsMetabolic network mapping for metabolomics
Metabolic network mapping for metabolomics
 
USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem
USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem
USUGM 2014 - Dana Vanderwall (Bristol-Myers Squibb): Instant JChem
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 
NEURAL Network Design Training
NEURAL Network Design  TrainingNEURAL Network Design  Training
NEURAL Network Design Training
 
Metadata Matters! What it is and How to Manage it
Metadata Matters! What it is and How to Manage itMetadata Matters! What it is and How to Manage it
Metadata Matters! What it is and How to Manage it
 
Machine Learning - Simple Linear Regression
Machine Learning - Simple Linear RegressionMachine Learning - Simple Linear Regression
Machine Learning - Simple Linear Regression
 
AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTION
 
The ABAP Query
The ABAP QueryThe ABAP Query
The ABAP Query
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 

Recently uploaded

Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Monika Rani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptxAlMamun560346
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...Lokesh Kothari
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...chandars293
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxRizalinePalanog2
 
Creating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsCreating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsNurulAfiqah307317
 

Recently uploaded (20)

Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Creating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsCreating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening Designs
 

ARCHIVED: new version available - METASPACE Step by Step guide

  • 1. METASPACE training guide 2017 Theodore Alexandrov (EMBL, UCSD) Andy Palmer (EMBL) Vitaly Kovalev (EMBL) Artem Tarasov (EMBL) @METASPACE2020
  • 3. Part 1: Introduction Learning Outcomes Introduction into the METASPACE project Metabolite annotation in HR imaging MS (Bioinformatics) Overview of the annotation engine Part 2: METASPACE Annotation Platform Tutorial Learning Outcomes Data requirements Data Submission Annotation Browsing Interpretation Part 3: Export to imzML FTICR (Bruker) Orbitrap (Thermo) Other vendors Training Overview
  • 5. What we hope you will learn today ● Ins and outs of metabolite annotation in HR imaging MS ● Bioinformatics we developed for this problem ○ Metabolite Signal Match (MSM) score ○ False Discovery Rate estimation ○ FDR-controlled annotation ● The online engine we implemented ○ How to prepare data for submission to our service ○ How to submit your data
  • 6. Project overview: slides on slideshare Bioinformatics: slides on slideshare
  • 7. Overview of the annotation engine
  • 8. Outline ● Inputs (data and metadata) ● Online Software ● Data Submission ● Annotation Browsing ● Use Cases a. mouse brain, MALDI-FTICR (U Rennes 1) b. human colorectal tumor, DESI-Orbitrap (ICL)
  • 13. Example 1 Mouse brain (MALDI-FTICR) Data provided by Regis Lavigne, Charles Pineau, University of Rennes 1 Select an annotation See the molecular distribution
  • 14. Example 2 Human colorectal tumor (DESI-Orbitrap) Data provided by James McKenzie, Zoltan Takats, Imperial College London Filter different datasets See data details View metadata
  • 15. Tutorial on how to use METASPACE engine and molecular knowledgebase
  • 16. Learning Outcomes 1. Preparing data for submission 2. Submitting data 3. Browsing results
  • 17. Data Requirements Imaging mass spectrometry data - Any ionisation source - Any spatial resolution - Any tissue - One section per dataset
  • 18. Data Requirements High resolving power RFWHM(@400) > 90K Well-calibrated ideally < 3 ppm
  • 19. Data Requirements Data Format - imzML Centroided - vendor preferred - http://metaspace2020.eu/imzml http://imzml.org/wp/introduction/
  • 20. Customised Processing Processing is tailored to your data! - Technical metadata - Resolving power - isotope prediction - Polarity - adducts R200=70K R200=280K [C41H78NO7P+K]+
  • 21. Data Requirements Your responsibility: - Data is processed ‘as is’ - Check metadata is correct - Report resolving power accurately (check within data-set) - Low numbers of annotations often correspond to poor quality mass spectra - Calibration inaccuracy - Lock-mass errors
  • 23. 1. Follow conversion instructions for your instrument 2. Select the centroided files, .imzML and .ibd 3. The dataset will be copied to the cloud storage (accessible only to our team) Data upload
  • 24. Metadata form ● Fields are auto-completed ● Please fill truthfully ○ Don’t want to disclose? Just put ‘N/A’ ● Click (top right) ○ Enabled once the files finished uploading
  • 27. Annotation table Currently selected molecule (click to select) MSM score principal peak m/z
  • 28. Sorting/filtering annotations Click on column headers to sort Add as many filters as you need Quickly add a filter by hovering over a cell and clicking the icon
  • 29. Molecule search Search by name (partial name search) Or by molecular formula (exact match only) Click to edit
  • 30. Details for highlighted annotation molecule distribution (principal isotope) Putative metabolite entries from the database
  • 31. Visual insight into MSM score assignment Exact m/z of each ion image Click and drag to zoom Ion images for each isotope peak Isotopic patterns Blue: theoretical abundance (at instrument resolving power) Red: measured image intensity
  • 33. Step-by-step search Add dataset filter, then choose dataset(s)
  • 34. Step-by-step search Type molecule name or molecular formula The annotations table will dynamically update
  • 35. Step-by-step search Export to CSV will save the current annotations table. Changing the filters will change which annotations are exported Always export annotations for comparison together (so they are at the same FDR)
  • 36. Results Browsing Summary 1. Choose database 2. Choose data-set 3. Add molecule filter and type ‘PC’ a. molecular class filter 4. Type ‘PC(16:0/18:0) a. single metabolite filter 5. Select row of table a. single ion filter 6. Simple comparison of spatial distributions between adducts 7. Export of annotations to csv Also possible ● Filter by m/z ● Formula search ● Comparison across datasets
  • 38. FDR Controlled Annotation False Discovery Rate - the fraction of incorrect annotations Control - request a set of annotations at a fixed estimated FDR Setting the level: - Adjust the number of molecules for follow-up analysis - When only limited numbers of molecules can be reviewed, adjust the FDR so that fewer/greater numbers of molecules are annotated - Compare annotations between datasets - A principled way of selecting molecules to compare between [Figure: true annotations and false discoveries ranked by MSM score, with a cut-off at FDR = 0.2] FDR = nFalse / (nFalse + nTrue)
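The FDR definition and the idea of requesting annotations at a fixed FDR can be sketched as follows. This is a toy illustration only: it assumes each annotation already carries a true/false label, which is never known for real data (the engine estimates the FDR instead), and the function names are hypothetical.

```python
def fdr(n_false, n_true):
    """Fraction of incorrect annotations: nFalse / (nFalse + nTrue)."""
    total = n_false + n_true
    if total == 0:
        raise ValueError("no annotations")
    return n_false / total


def count_at_fdr(scored, max_fdr):
    """Largest number of top-scoring annotations whose FDR stays <= max_fdr.

    scored: iterable of (msm_score, is_false) pairs; labels are illustrative.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    best = 0
    n_false = 0
    for k, (_score, is_false) in enumerate(ranked, start=1):
        n_false += 1 if is_false else 0
        if fdr(n_false, k - n_false) <= max_fdr:
            best = k
    return best


# Toy annotations: (MSM score, is this a false discovery?)
toy = [(0.9, False), (0.8, False), (0.7, True),
       (0.6, False), (0.5, False), (0.4, True)]
print(count_at_fdr(toy, 0.2))  # accepts more annotations
print(count_at_fdr(toy, 0.1))  # stricter level, fewer annotations
```

Raising the FDR level admits lower-scoring annotations; lowering it shortens the list, which is how the level controls the number of molecules passed on for follow-up.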
  • 39. Choice of metabolite database synthesized/recorded 88M CAS registry biologically occurring/active 50M PubChem compounds single biological system 40K HMDB sample specific 1K LC-MS
  • 40. Choice of metabolite database Impacts search and False-Discovery-Rate estimation ● Use one that’s relevant ● Larger database ○ more false-hits --> fewer annotations at a fixed FDR ● Different databases give different annotations ○ even for molecules in both databases due to FDR control ○ for data-set comparison, use the same database
  • 41. Annotating at level of molecular formula ● Possibility of multiple metabolites per sum formula ○ webapp shows all hits from the database search (learn the ambiguity!) ○ other databases can be searched (e.g. PubChem) ○ use enrichment analysis to get biological leads ● Use an orthogonal technique for reporting individual metabolites ○ not directly integrated (yet) ○ use web-app results to help target MS/MS studies (e.g. purchase of standards)
  • 42. ● The METASPACE platform putatively annotates* molecular formulas along with several candidate metabolites ● A set of annotations should be reported along with the FDR threshold selected. ○ e.g. “Molecular annotation was performed using the METASPACE annotation engine (Palmer et al, Nature Methods 2016). 150 molecules were annotated against the LipidMAPS database at 10% FDR. Results are publicly available at annotate.metaspace2020.eu” ● The export function of the website delivers a spreadsheet that can be included as supporting information for any publication. Reporting Results *Metabolomics Standards Initiative identification levels Sumner et al, 2007, Metabolomics
  • 43. ● Preparing data for submission ○ imzML export ○ metadata ● Submitting data ○ web-app, upload ● Browsing knowledgebase ○ web-app, annotations Learning Summary ● METASPACE team: ○ web: metaspace2020.eu ○ email: contact@metaspace2020.eu ○ twitter: @metaspace2020 ○ source code: https://github.com/METASPACE2020/ ● FTICR data conversion ○ SCiLS: support@scils.de ● Orbitrap data conversion How to get help?
  • 45. Export into imzML: FT-ICR data Using SCiLS Lab’s METASPACE export
  • 46. Export to METASPACE ● Export your centroided high-resolution spectra in the imzML format ● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b ● Best results in METASPACE if peak list is required for centroiding ● Two different Bruker data formats ○ SQLite peak list data: Peak list provided during import ○ FT-ICR profile data: Generate a peak list after import
  • 47. Create imzML file for METASPACE ● In the objects tab, click the export symbol of the region to be exported and select “Export to METASPACE” ● The Export Spectra dialog opens ● Set your normalization of choice ● Select your peak list of choice for example “Imported Peaks” in case of SQLite ● Provide your scan polarity ● Click OK to save imzML file
  • 48. SQLite peak list data ● Data must have been acquired with on-the-fly centroid detection, i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set ● In SCiLS Lab a peak list “Imported peaks” is available, selecting most frequent peaks ● By default all peaks appearing more frequently than 1% of spectra
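The "more frequently than 1% of spectra" rule can be pictured with a short sketch. This is not SCiLS Lab's implementation: the function name is hypothetical, and matching peaks across spectra by rounding m/z to a fixed number of decimals is a crude assumption made only to keep the example small.

```python
from collections import Counter


def frequent_peaks(spectra, min_fraction=0.01, decimals=3):
    """Return m/z values that occur in more than min_fraction of the spectra.

    spectra: list of per-pixel m/z lists (centroided peaks).
    Peaks are matched across spectra by rounding, which is a simplification.
    """
    counts = Counter()
    for mzs in spectra:
        # Use a set so that each spectrum counts a given m/z at most once.
        counts.update({round(mz, decimals) for mz in mzs})
    threshold = min_fraction * len(spectra)
    return sorted(mz for mz, n in counts.items() if n > threshold)


# Toy data: 200 spectra; m/z 100.0 everywhere, 200.0 in three, 300.0 in one.
spectra = [[100.0] for _ in range(200)]
for i in range(3):
    spectra[i].append(200.0)
spectra[0].append(300.0)
print(frequent_peaks(spectra))
```

With 200 spectra the cut-off is two spectra, so the peak seen only once is dropped while the other two are kept.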
  • 49. FT-ICR profile data ● Older Solarix Files do not directly contain a peak list to perform centroiding ● Create peak list with Data Analysis SCiLS Lab Help Section 7.4 ● Use METASPACE tool for peak finding https://spatialmetabolomics.github.io/centroidize/ ● Use other external tools (mMass, …) ● Import the external peak list into SCiLS Lab File > Import > m/z intervals from CSV or Clipboard
  • 50. Use METASPACE tool for peak finding ● Select the overview spectrum CSV exported from SCiLS ● Upload CSV file to METASPACE tool ● Copy values to clipboard ● Use File > Import > m/z intervals from CSV
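To show what peak finding on an overview spectrum involves, here is a minimal local-maximum sketch. It assumes the overview spectrum is available as two parallel lists of m/z and intensity values; the algorithm used by the METASPACE tool is not described in these slides, so this is a generic stand-in and not that tool's method.

```python
def find_peaks(mzs, intensities, min_intensity=0.0):
    """Return m/z values at local intensity maxima above min_intensity."""
    peaks = []
    for i in range(1, len(mzs) - 1):
        here = intensities[i]
        # A point is a peak if it rises above its left neighbour and
        # does not fall below its right neighbour.
        if here > min_intensity and here > intensities[i - 1] and here >= intensities[i + 1]:
            peaks.append(mzs[i])
    return peaks


# Toy profile spectrum with two maxima.
mzs = [100.0, 100.1, 100.2, 100.3, 100.4, 100.5, 100.6]
intensities = [0, 5, 10, 4, 1, 8, 2]
print(find_peaks(mzs, intensities))
```

The resulting m/z list plays the role of the peak list that is pasted back into SCiLS Lab; a real tool would also handle noise and peak width.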
  • 51. Upload imzML files to METASPACE ● Go to http://annotate.metaspace2020.eu/#/upload
  • 52. Export into imzML: Orbitrap data (.raw) Instructions: metaspace2020.eu/imzML Software tools: imageQuest / raw-converter - Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-) imzmlConverter - Recommended for: DESI/flowProbe with separate files per row Recommended for bioinformaticians: pyimzML (Python parser)
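For bioinformaticians, a quick sanity check of a converted file with pyimzML might look like the sketch below. `ImzMLParser`, its `coordinates` list and `getspectrum(index)` are part of pyimzML; the helper names `open_imzml` and `summarize` and the choice of summary statistics are assumptions made for illustration.

```python
def open_imzml(path):
    """Open an imzML file with pyimzML (third-party: pip install pyimzml)."""
    from pyimzml.ImzMLParser import ImzMLParser
    return ImzMLParser(path)


def summarize(parser):
    """Count spectra and peaks and report the overall m/z range."""
    n_spectra = 0
    n_peaks = 0
    mz_min, mz_max = float("inf"), float("-inf")
    for idx, _xyz in enumerate(parser.coordinates):
        mzs, _intensities = parser.getspectrum(idx)
        n_spectra += 1
        n_peaks += len(mzs)
        if len(mzs):
            mz_min = min(mz_min, float(min(mzs)))
            mz_max = max(mz_max, float(max(mzs)))
    return {
        "n_spectra": n_spectra,
        "mean_peaks": n_peaks / n_spectra if n_spectra else 0.0,
        "mz_min": mz_min,
        "mz_max": mz_max,
    }


# Usage (the file name is a placeholder):
# print(summarize(open_imzml("my_dataset.imzML")))
```

A very large number of peaks per spectrum can hint that the file is still in profile mode and not centroided, which is worth checking before upload.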
  • 53. .raw -> imzML ● Commercial ○ Thermo Scientific imageQuest
  • 54. Raw-imzml converter .raw -> imzML ● Free ● http://ms-imaging.org/wp/raw-to-imzml-converter/
  • 55. .raw -> mzML -> imzML ● MSConvert ○ free (link) ● imzMLConverter ○ free ○ requires registration ○ http://www.cs.bham.ac.uk/~ibs/imzMLConverter Export into imzML: Generic
  • 56. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement № 634402. Acknowledgments Example data provided by: University of Rennes 1 Regis Lavigne Charles Pineau EMBL Ksenija Radic Alexandra Koumoutsi Andrew Palmer Imperial College London James McKenzie Zoltan Takats METASPACE R&D team at EMBL Theodore Alexandrov Vitaly Kovalev Artem Tarasov Andrew Palmer Dominik Fay SCiLS imzML export Dennis Trede Jan Hendrik Kobarg