SlideShare a Scribd company logo
Basic Use of XCMS -- Local
Xiuxia Du
Department of Bioinformatics & Genomics
University of North Carolina at Charlotte
Preparation
2	
  
•  Required: install R
•  Optional: install Rstudio, an IDE (Integrated Development
Environment) for R
R
3	
  
RStudio
4	
  
Get help documents
•  Three ways
–  http://www.bioconductor.org/packages/release/bioc/html/
xcms.html
–  Google XCMS bioconductor
–  Google XCMS à Scripps Center for Metabolomics and Mass
Spectrometry – XCMS à installation à XCMS bioconductor
Document: step-by-step preprocessing
6	
  
XCMS workflow
7	
  
Install and load XCMS packages (I)
8	
  
Check if the XCMS
package has been
installed in R
Answer: No
Install and load the XCMS packages (II)
9	
  
10	
  
Check again.
XCMS and a few
other packages
have been
installed.
Install and load the XCMS packages (IV)
11	
  
for multiple hypothesis testing
demo data supplied by XCMS
Raw data format and organization
12	
  
•  Open formats that XCMS can read
–  AIA/ANDI NetCDF
–  mzData
–  mzXML
•  Organization
–  Use sub-directories that correspond to sample class information
•  Datasets for demonstration
–  faahKO data package supplied by XCMS
–  Data is stored in netCDF format.
–  The raw data sets are stored in a folder on your computer.
Raw data preparation (I)
13	
  
From the terminal, check where the datasets are on your computer:
In R command window on Mac:
Raw data preparation (II)
14	
  
Check where the datasets are on your computer:
In R command window on Windows:
Raw data preparation (III)
15	
  
Get the list of the raw data files:
Raw data preparation (IV)
16	
  
•  Alternatively
–  Specify the working directory
–  By default, XCMS will recursively search through the current
working directory for NetCDF/mzXML/mzData files.
Peak identification (I)
17	
  
•  Command: xcmsSet()
•  One separate row for a dataset
•  For each pair of numbers, the first number is the m/z XCMS is
currently processing. The second number is the number of peaks
that have been identified so far.
Peak identification (II)
18	
  
•  If raw data files are in your working directory, then:
Peak Identificaiton (III)
19	
  
•  Take a look at the xcmsSet object:
Peak identification (IV)
20	
  
•  The default parameters should work acceptably in most
cases.
–  Default peak detection method: matched filter
–  Alternative approach: centWave for high resolution MS data
•  However, a number of parameters might need to be
optimized for particular instruments or experimental
conditions.
–  Matched filtration: model peak width, m/z step size for creating
extracted ion base peak chromatograms (EIBPC), the algorithm to
create EIBPC, …
–  centWave: ppm, peak width range, …
•  To be explained in the next section “Parameter set-up …”
by Paul
Matching peaks across samples
21	
  
•  After peak identification, peaks representing the same
analyte across samples must be placed into groups.
•  This is accomplished with the group() method.
•  There are several grouping parameters to consider
optimizing for your chromatography and mass
spectrometer (to be explained by Paul).
Retention time correction (I)
22	
  
•  XCMS uses peak groups to identify and correct drifts in
retention time from run to run.
•  Only well-behaved peak groups are used: missing the peak
from at most one sample and having at most one extra
peak.
•  These parameters can be changed with the missing and extra
arguments.
•  For each of those well-behaved groups, XCMS calculates a
median retention time and, for every sample, a deviation
from that median.
Retention time correction (II)
23	
  
•  Within a sample, the observed deviation generally changes
over time in a nonlinear fashion.
•  Those changes are modeled using a local polynomial
regression technique.
•  Retention time correction is performed by the retcor()
method.
•  The plottype argument produces the plot on the next slide.
Retention time correction (III)
24	
  
Retention time correction (IV)
25	
  
•  Use the plot to supervise the algorithm. The plot includes
data points used for regression and the resulting deviation
profiles.
•  The plot also shows the distribution of peak groups across
retention time.
Retention time correction (V)
26	
  
•  After retention time correction, the initial peak grouping
becomes invalid.
•  Peak re-grouping is needed.
•  This iteration of peak grouping and alignment can be
repeated in an iterative fashion.
Filling in missing peaks (I)
27	
  
•  Peaks could be missing due to imperfection in peak
identification or because an analyte was not present in a
sample.
•  For missing peaks that correspond to analytes that are
actually in the sample, the missing data points can be filled
in by re-reading the raw data files and integrating them in
the regions of the missing peaks.
•  This is performed using the fillPeaks() method.
Filling in missing peaks (II)
28	
  
Analyzing and visualizing results (I)
29	
  
•  A report showing the most statistically significant
differences in analyte intensities can be generated with the
diffreport() method.
•  Results are stored in two folders and one spread sheet file.
Analyzing and visualizing results (II)
30	
  
Extracted ion
chromatograms
for significant
ions
001.png 006.png 010.png
Analyzing and visualizing results (III)
31	
  
001.png 006.png 010.png
Box plots for
significant ions
Analyzing and visualizing results (IV)
32	
  
example.tsv (column A to K)
Analyzing and visualizing results (V)
33	
  
example.tsv (column O to AA)
Going back to the raw data (I)
34	
  
mzmed	
  =	
  300.2,	
  rtmed	
  =	
  3390.3	
  sec	
  =	
  56.5	
  min	
  
Going back to the raw data (II)
35	
  
Visualize raw data
36	
  
•  Mass spectrometer vendors save data in proprietary
formats.
•  These data files can be converted to open data formats for
easy reading.
•  Software tools to do the conversion: msConvert
msConvert (I)
37	
  
•  Part of ProteoWizard
•  Read from
–  mzML, mzXML, MGF
–  Agilent, Bruker, Thermo, Waters, ABSciex
•  Write to
–  open formats
–  perform various filters and transformations
•  http://proteowizard.sourceforge.net/
•  For Windows, msConvertGUI is available for easy file
conversion.
msConvert (II)
38	
  
Raw data visualization
39	
  
•  Software tool: Insilicos Viewer
–  View raw MS data in formats including mzXML,
mzData, mzML, and ANDI CDF
–  http://insilicos.com/products/insilicos-viewer-1
–  Quick demo
Run all the commands
40	
  
41	
  
Thank you!

More Related Content

What's hot

Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...
NECST Lab @ Politecnico di Milano
 
Sorting
SortingSorting
poster
posterposter
ca-ap9222-pdf
ca-ap9222-pdfca-ap9222-pdf
Detecting probability of ice formation on overhead lines of the Dutch railway...
Detecting probability of ice formation on overhead lines of the Dutch railway...Detecting probability of ice formation on overhead lines of the Dutch railway...
Detecting probability of ice formation on overhead lines of the Dutch railway...
Irene Garcia-Marti
 
Graph Matching
Graph MatchingGraph Matching
Graph Matching
graphitech
 
Data Structures 01
Data Structures 01Data Structures 01
Data Structures 01
Budditha Hettige
 
Parallel algorithm in linear algebra
Parallel algorithm in linear algebraParallel algorithm in linear algebra
Parallel algorithm in linear algebra
Harshana Madusanka Jayamaha
 
JGrass-Newage snow component
JGrass-Newage snow componentJGrass-Newage snow component
JGrass-Newage snow component
Marialaura Bancheri
 
02 Stack
02 Stack02 Stack
Os revision ques
Os revision quesOs revision ques
Os revision ques
John Jo
 
Exploring Fused Convolutional Neural Networks for Aerial Imagery Segmentation
Exploring Fused Convolutional Neural Networks  for Aerial Imagery SegmentationExploring Fused Convolutional Neural Networks  for Aerial Imagery Segmentation
Exploring Fused Convolutional Neural Networks for Aerial Imagery Segmentation
Mendrika Ramarlina
 
Lo18
Lo18Lo18
Lo18
liankei
 
Stack Data structure
Stack Data structureStack Data structure
Stack Data structure
B Liyanage Asanka
 
Parallel algorithms
Parallel algorithmsParallel algorithms
Parallel algorithms
guest084d20
 
JGrass-NewAge rain-snow separation
JGrass-NewAge rain-snow separationJGrass-NewAge rain-snow separation
JGrass-NewAge rain-snow separation
Marialaura Bancheri
 
Real-time or full-precision CRS imaging using a cloud computing portal: multi...
Real-time or full-precision CRS imaging using a cloud computing portal: multi...Real-time or full-precision CRS imaging using a cloud computing portal: multi...
Real-time or full-precision CRS imaging using a cloud computing portal: multi...
CRS4 Research Center in Sardinia
 
Point Clouds: What's New
Point Clouds: What's NewPoint Clouds: What's New
Point Clouds: What's New
Safe Software
 

What's hot (18)

Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...
 
Sorting
SortingSorting
Sorting
 
poster
posterposter
poster
 
ca-ap9222-pdf
ca-ap9222-pdfca-ap9222-pdf
ca-ap9222-pdf
 
Detecting probability of ice formation on overhead lines of the Dutch railway...
Detecting probability of ice formation on overhead lines of the Dutch railway...Detecting probability of ice formation on overhead lines of the Dutch railway...
Detecting probability of ice formation on overhead lines of the Dutch railway...
 
Graph Matching
Graph MatchingGraph Matching
Graph Matching
 
Data Structures 01
Data Structures 01Data Structures 01
Data Structures 01
 
Parallel algorithm in linear algebra
Parallel algorithm in linear algebraParallel algorithm in linear algebra
Parallel algorithm in linear algebra
 
JGrass-Newage snow component
JGrass-Newage snow componentJGrass-Newage snow component
JGrass-Newage snow component
 
02 Stack
02 Stack02 Stack
02 Stack
 
Os revision ques
Os revision quesOs revision ques
Os revision ques
 
Exploring Fused Convolutional Neural Networks for Aerial Imagery Segmentation
Exploring Fused Convolutional Neural Networks  for Aerial Imagery SegmentationExploring Fused Convolutional Neural Networks  for Aerial Imagery Segmentation
Exploring Fused Convolutional Neural Networks for Aerial Imagery Segmentation
 
Lo18
Lo18Lo18
Lo18
 
Stack Data structure
Stack Data structureStack Data structure
Stack Data structure
 
Parallel algorithms
Parallel algorithmsParallel algorithms
Parallel algorithms
 
JGrass-NewAge rain-snow separation
JGrass-NewAge rain-snow separationJGrass-NewAge rain-snow separation
JGrass-NewAge rain-snow separation
 
Real-time or full-precision CRS imaging using a cloud computing portal: multi...
Real-time or full-precision CRS imaging using a cloud computing portal: multi...Real-time or full-precision CRS imaging using a cloud computing portal: multi...
Real-time or full-precision CRS imaging using a cloud computing portal: multi...
 
Point Clouds: What's New
Point Clouds: What's NewPoint Clouds: What's New
Point Clouds: What's New
 

Viewers also liked

Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Alain van Gool
 
MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)
Juan Antonio Vizcaino
 
DESI Mass Spectrometry
DESI Mass SpectrometryDESI Mass Spectrometry
DESI Mass Spectrometry
jmwiseman
 
Principles of lc ms data processing
Principles of lc ms data processingPrinciples of lc ms data processing
Principles of lc ms data processing
Xiuxia Du
 
MS (and NMR) data standards in Metabolomics why, how and some caveats
MS (and NMR) data standards in Metabolomics why, how and some caveatsMS (and NMR) data standards in Metabolomics why, how and some caveats
MS (and NMR) data standards in Metabolomics why, how and some caveats
Steffen Neumann
 
Using Social Media for Community Planning
Using Social Media for Community PlanningUsing Social Media for Community Planning
Using Social Media for Community Planning
Orville Morales
 
1345 omma metrics dennis mortensen
1345 omma metrics dennis mortensen1345 omma metrics dennis mortensen
1345 omma metrics dennis mortensen
MediaPost
 
Supporting formal and informal social learning
Supporting formal and informal social learningSupporting formal and informal social learning
Supporting formal and informal social learning
Jane Hart
 
Issue desk slides summer 11
Issue desk slides summer 11Issue desk slides summer 11
Issue desk slides summer 11
uclmainlibrary
 
Agosto (2)jardim
Agosto (2)jardimAgosto (2)jardim
Agosto (2)jardim
patronatobonanca
 
8 2-43 normas apa y referencias bibliograficas
8 2-43 normas apa y referencias bibliograficas8 2-43 normas apa y referencias bibliograficas
8 2-43 normas apa y referencias bibliograficas
geraldinezapata23
 
Инновационный метод лечения широкого круга аллергических заболеваний
Инновационный метод лечения широкого круга аллергических заболеванийИнновационный метод лечения широкого круга аллергических заболеваний
Инновационный метод лечения широкого круга аллергических заболеванийkulibin
 
Social Media and Networking - Infographic
Social Media and Networking - InfographicSocial Media and Networking - Infographic
Social Media and Networking - Infographic
Todd Wheatland
 
10 características arquitectónicas en edificaciones del mundo egeo
10  características arquitectónicas en edificaciones del mundo egeo10  características arquitectónicas en edificaciones del mundo egeo
10 características arquitectónicas en edificaciones del mundo egeo
Alexis Leiva
 
Pride Cluster 062016 Update
Pride Cluster 062016 UpdatePride Cluster 062016 Update
Pride Cluster 062016 Update
Juan Antonio Vizcaino
 
πολυτεχνείο 2014
πολυτεχνείο 2014πολυτεχνείο 2014
πολυτεχνείο 2014michaelathea
 
20 percent tips
20 percent tips20 percent tips
20 percent tips
Maria Guryanova
 
La participación en la Estrategia de ASCyT
La participación en la Estrategia de ASCyTLa participación en la Estrategia de ASCyT
La participación en la Estrategia de ASCyT
5ForoASCTI
 
Cl zeel plast-machinery
Cl zeel plast-machineryCl zeel plast-machinery
Cl zeel plast-machinery
Zeel Plast Machinery
 
compte-rendu du colloque international sur le financement de la création
compte-rendu du colloque international sur le financement de la créationcompte-rendu du colloque international sur le financement de la création
compte-rendu du colloque international sur le financement de la création
MinistereCC
 

Viewers also liked (20)

Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
 
MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)
 
DESI Mass Spectrometry
DESI Mass SpectrometryDESI Mass Spectrometry
DESI Mass Spectrometry
 
Principles of lc ms data processing
Principles of lc ms data processingPrinciples of lc ms data processing
Principles of lc ms data processing
 
MS (and NMR) data standards in Metabolomics why, how and some caveats
MS (and NMR) data standards in Metabolomics why, how and some caveatsMS (and NMR) data standards in Metabolomics why, how and some caveats
MS (and NMR) data standards in Metabolomics why, how and some caveats
 
Using Social Media for Community Planning
Using Social Media for Community PlanningUsing Social Media for Community Planning
Using Social Media for Community Planning
 
1345 omma metrics dennis mortensen
1345 omma metrics dennis mortensen1345 omma metrics dennis mortensen
1345 omma metrics dennis mortensen
 
Supporting formal and informal social learning
Supporting formal and informal social learningSupporting formal and informal social learning
Supporting formal and informal social learning
 
Issue desk slides summer 11
Issue desk slides summer 11Issue desk slides summer 11
Issue desk slides summer 11
 
Agosto (2)jardim
Agosto (2)jardimAgosto (2)jardim
Agosto (2)jardim
 
8 2-43 normas apa y referencias bibliograficas
8 2-43 normas apa y referencias bibliograficas8 2-43 normas apa y referencias bibliograficas
8 2-43 normas apa y referencias bibliograficas
 
Инновационный метод лечения широкого круга аллергических заболеваний
Инновационный метод лечения широкого круга аллергических заболеванийИнновационный метод лечения широкого круга аллергических заболеваний
Инновационный метод лечения широкого круга аллергических заболеваний
 
Social Media and Networking - Infographic
Social Media and Networking - InfographicSocial Media and Networking - Infographic
Social Media and Networking - Infographic
 
10 características arquitectónicas en edificaciones del mundo egeo
10  características arquitectónicas en edificaciones del mundo egeo10  características arquitectónicas en edificaciones del mundo egeo
10 características arquitectónicas en edificaciones del mundo egeo
 
Pride Cluster 062016 Update
Pride Cluster 062016 UpdatePride Cluster 062016 Update
Pride Cluster 062016 Update
 
πολυτεχνείο 2014
πολυτεχνείο 2014πολυτεχνείο 2014
πολυτεχνείο 2014
 
20 percent tips
20 percent tips20 percent tips
20 percent tips
 
La participación en la Estrategia de ASCyT
La participación en la Estrategia de ASCyTLa participación en la Estrategia de ASCyT
La participación en la Estrategia de ASCyT
 
Cl zeel plast-machinery
Cl zeel plast-machineryCl zeel plast-machinery
Cl zeel plast-machinery
 
compte-rendu du colloque international sur le financement de la création
compte-rendu du colloque international sur le financement de la créationcompte-rendu du colloque international sur le financement de la création
compte-rendu du colloque international sur le financement de la création
 

Similar to Basic use of xcms

Enterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using SparkEnterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using Spark
Alpine Data
 
Enterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using SparkEnterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using Spark
Spark Summit
 
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - PosterMediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
multimediaeval
 
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORINGAPPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
csandit
 
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORINGAPPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
cscpconf
 
Clustering techniques
Clustering techniquesClustering techniques
Clustering techniques
talktoharry
 
Reducing the dimensionality of data with neural networks
Reducing the dimensionality of data with neural networksReducing the dimensionality of data with neural networks
Reducing the dimensionality of data with neural networks
Hakky St
 
lecture_16.pptx
lecture_16.pptxlecture_16.pptx
lecture_16.pptx
ObaidUllah693733
 
Chronix Time Series Database - The New Time Series Kid on the Block
Chronix Time Series Database - The New Time Series Kid on the BlockChronix Time Series Database - The New Time Series Kid on the Block
Chronix Time Series Database - The New Time Series Kid on the Block
QAware GmbH
 
The new time series kid on the block
The new time series kid on the blockThe new time series kid on the block
The new time series kid on the block
Florian Lautenschlager
 
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation AlgorithmA Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
NECST Lab @ Politecnico di Milano
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Ian Foster
 
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
AI optimizing HPC simulations (presentation from  6th EULAG Workshop)AI optimizing HPC simulations (presentation from  6th EULAG Workshop)
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
byteLAKE
 
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
NETWAYS
 
A Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache SolrA Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache Solr
QAware GmbH
 
Chronix: A fast and efficient time series storage based on Apache Solr
Chronix: A fast and efficient time series storage based on Apache SolrChronix: A fast and efficient time series storage based on Apache Solr
Chronix: A fast and efficient time series storage based on Apache Solr
Florian Lautenschlager
 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsPRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
NECST Lab @ Politecnico di Milano
 
RUCK 2017 MxNet과 R을 연동한 딥러닝 소개
RUCK 2017 MxNet과 R을 연동한 딥러닝 소개RUCK 2017 MxNet과 R을 연동한 딥러닝 소개
RUCK 2017 MxNet과 R을 연동한 딥러닝 소개
r-kor
 
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Databricks
 
A methodology for full system power modeling in heterogeneous data centers
A methodology for full system power modeling in  heterogeneous data centersA methodology for full system power modeling in  heterogeneous data centers
A methodology for full system power modeling in heterogeneous data centers
Raimon Bosch
 

Similar to Basic use of xcms (20)

Enterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using SparkEnterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using Spark
 
Enterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using SparkEnterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using Spark
 
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - PosterMediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
 
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORINGAPPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
 
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORINGAPPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
APPLICATION OF BICLUSTERING TECHNIQUE IN MACHINE MONITORING
 
Clustering techniques
Clustering techniquesClustering techniques
Clustering techniques
 
Reducing the dimensionality of data with neural networks
Reducing the dimensionality of data with neural networksReducing the dimensionality of data with neural networks
Reducing the dimensionality of data with neural networks
 
lecture_16.pptx
lecture_16.pptxlecture_16.pptx
lecture_16.pptx
 
Chronix Time Series Database - The New Time Series Kid on the Block
Chronix Time Series Database - The New Time Series Kid on the BlockChronix Time Series Database - The New Time Series Kid on the Block
Chronix Time Series Database - The New Time Series Kid on the Block
 
The new time series kid on the block
The new time series kid on the blockThe new time series kid on the block
The new time series kid on the block
 
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation AlgorithmA Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
AI optimizing HPC simulations (presentation from  6th EULAG Workshop)AI optimizing HPC simulations (presentation from  6th EULAG Workshop)
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
 
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
 
A Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache SolrA Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache Solr
 
Chronix: A fast and efficient time series storage based on Apache Solr
Chronix: A fast and efficient time series storage based on Apache SolrChronix: A fast and efficient time series storage based on Apache Solr
Chronix: A fast and efficient time series storage based on Apache Solr
 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsPRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
 
RUCK 2017 MxNet과 R을 연동한 딥러닝 소개
RUCK 2017 MxNet과 R을 연동한 딥러닝 소개RUCK 2017 MxNet과 R을 연동한 딥러닝 소개
RUCK 2017 MxNet과 R을 연동한 딥러닝 소개
 
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
Time Series Forecasting Using Recurrent Neural Network and Vector Autoregress...
 
A methodology for full system power modeling in heterogeneous data centers
A methodology for full system power modeling in  heterogeneous data centersA methodology for full system power modeling in  heterogeneous data centers
A methodology for full system power modeling in heterogeneous data centers
 

Recently uploaded

DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
SaffaIbrahim1
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
yuvarajkumar334
 
UofT毕业证如何办理
UofT毕业证如何办理UofT毕业证如何办理
UofT毕业证如何办理
exukyp
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
bmucuha
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
slg6lamcq
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
mkkikqvo
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
a9qfiubqu
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 

Recently uploaded (20)

DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
 
UofT毕业证如何办理
UofT毕业证如何办理UofT毕业证如何办理
UofT毕业证如何办理
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 

Basic use of xcms

  • 1. Basic Use of XCMS -- Local Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte
  • 2. Preparation 2   •  Required: install R •  Optional: install Rstudio, an IDE (Integrated Development Environment) for R
  • 5. Get help documents •  Three ways –  http://www.bioconductor.org/packages/release/bioc/html/ xcms.html –  Google XCMS bioconductor –  Google XCMS à Scripps Center for Metabolomics and Mass Spectrometry – XCMS à installation à XCMS bioconductor
  • 8. Install and load XCMS packages (I) 8   Check if the XCMS package has been installed in R Answer: No
  • 9. Install and load the XCMS packages (II) 9  
  • 10. 10   Check again. XCMS and a few other packages have been installed.
  • 11. Install and load the XCMS packages (IV) 11   for multiple hypothesis testing demo data supplied by XCMS
  • 12. Raw data format and organization 12   •  Open formats that XCMS can read –  AIA/ANDI NetCDF –  mzData –  mzXML •  Organization –  Use sub-directories that correspond to sample class information •  Datasets for demonstration –  faahKO data package supplied by XCMS –  Data is stored in netCDF format. –  The raw data sets are stored in a folder on your computer.
  • 13. Raw data preparation (I) 13   From the terminal, check where the datasets are on your computer: In R command window on Mac:
  • 14. Raw data preparation (II) 14   Check where the datasets are on your computer: In R command window on Windows:
  • 15. Raw data preparation (III) 15   Get the list of the raw data files:
  • 16. Raw data preparation (IV) 16   •  Alternatively –  Specify the working directory –  By default, XCMS will recursively search through the current working directory for NetCDF/mzXML/mzData files.
  • 17. Peak identification (I) 17   •  Command: xcmsSet() •  One separate row for a dataset •  For each pair of numbers, the first number is the m/z XCMS is currently processing. The second number is the number of peaks that have been identified so far.
  • 18. Peak identification (II) 18   •  If raw data files are in your working directory, then:
  • 19. Peak Identificaiton (III) 19   •  Take a look at the xcmsSet object:
  • 20. Peak identification (IV) 20   •  The default parameters should work acceptably in most cases. –  Default peak detection method: matched filter –  Alternative approach: centWave for high resolution MS data •  However, a number of parameters might need to be optimized for particular instruments or experimental conditions. –  Matched filtration: model peak width, m/z step size for creating extracted ion base peak chromatograms (EIBPC), the algorithm to create EIBPC, … –  centWave: ppm, peak width range, … •  To be explained in the next section “Parameter set-up …” by Paul
  • 21. Matching peaks across samples 21   •  After peak identification, peaks representing the same analyte across samples must be placed into groups. •  This is accomplished with the group() method. •  There are several grouping parameters to consider optimizing for your chromatography and mass spectrometer (to be explained by Paul).
  • 22. Retention time correction (I) 22   •  XCMS uses peak groups to identify and correct drifts in retention time from run to run. •  Only well-behaved peak groups are used: missing the peak from at most one sample and having at most one extra peak. •  These parameters can be changed with the missing and extra arguments. •  For each of those well-behaved groups, XCMS calculates a median retention time and, for every sample, a deviation from that median.
  • 23. Retention time correction (II) 23   •  Within a sample, the observed deviation generally changes over time in a nonlinear fashion. •  Those changes are modeled using a local polynomial regression technique. •  Retention time correction is performed by the retcor() method. •  The plottype argument produces the plot on the next slide.
  • 25. Retention time correction (IV) 25   •  Use the plot to supervise the algorithm. The plot includes data points used for regression and the resulting deviation profiles. •  The plot also shows the distribution of peak groups across retention time.
  • 26. Retention time correction (V) 26   •  After retention time correction, the initial peak grouping becomes invalid. •  Peak re-grouping is needed. •  This iteration of peak grouping and alignment can be repeated in an iterative fashion.
  • 27. Filling in missing peaks (I) 27   •  Peaks could be missing due to imperfection in peak identification or because an analyte was not present in a sample. •  For missing peaks that correspond to analytes that are actually in the sample, the missing data points can be filled in by re-reading the raw data files and integrating them in the regions of the missing peaks. •  This is performed using the fillPeaks() method.
  • 28. Filling in missing peaks (II) 28  
  • 29. Analyzing and visualizing results (I) 29   •  A report showing the most statistically significant differences in analyte intensities can be generated with the diffreport() method. •  Results are stored in two folders and one spread sheet file.
  • 30. Analyzing and visualizing results (II) 30   Extracted ion chromatograms for significant ions 001.png 006.png 010.png
  • 31. Analyzing and visualizing results (III) 31   001.png 006.png 010.png Box plots for significant ions
  • 32. Analyzing and visualizing results (IV) 32   example.tsv (column A to K)
  • 33. Analyzing and visualizing results (V) 33   example.tsv (column O to AA)
  • 34. Going back to the raw data (I) 34   mzmed  =  300.2,  rtmed  =  3390.3  sec  =  56.5  min  
  • 35. Going back to the raw data (II) 35  
  • 36. Visualize raw data 36   •  Mass spectrometer vendors save data in proprietary formats. •  These data files can be converted to open data formats for easy reading. •  Software tools to do the conversion: msConvert
  • 37. msConvert (I) 37   •  Part of ProteoWizard •  Read from –  mzML, mzXML, MGF –  Agilent, Bruker, Thermo, Waters, ABSciex •  Write to –  open formats –  perform various filters and transformations •  http://proteowizard.sourceforge.net/ •  For Windows, msConvertGUI is available for easy file conversion.
  • 39. Raw data visualization 39   •  Software tool: Insilicos Viewer –  View raw MS data in formats including mzXML, mzData, mzML, and ANDI CDF –  http://insilicos.com/products/insilicos-viewer-1 –  Quick demo
  • 40. Run all the commands 40