Multivariate and network tools for biological data analysis
See video:
http://imdevsoftware.wordpress.com/2014/06/27/multivariate-data-analysis-and-visualization-through-network-mapping/

Published in: Science, Technology

Transcript of "Multivariate and network tools for biological data analysis"

  1. Dmitry Grapov and Oliver Fiehn, University of California, Davis: Multivariate Analysis and Visualization Tools for Metabolomic Data
  2. State-of-the-art facility producing massive amounts of biological data: >20-30K samples/yr, >200 studies
  3. Data Analysis and Visualization (data matrix axes: Sample, Variable)
     • Quality Assessment: use replicated measurements and/or internal standards to estimate analytical variance
     • Statistical and Multivariate: use the experimental design to test hypotheses and/or identify trends in analytes
     • Functional: use statistical and multivariate results to identify impacted biochemical domains
     • Network: integrate statistical and multivariate results with the experimental design and analyte metadata
     Experimental design: organism, sex, age, etc.
     Analyte description and metadata: biochemical class, mass spectra, etc.
  4. Data Analysis and Visualization (data matrix axes: Sample, Variable)
     • Quality Assessment: use replicated measurements and/or internal standards to estimate analytical variance
     • Statistical and Multivariate: use the experimental design to test hypotheses and/or identify trends in analytes
     • Functional: use statistical and multivariate results to identify impacted biochemical domains
     • Network: integrate statistical and multivariate results with the experimental design and analyte metadata
     Network Mapping
     Experimental design: organism, sex, age, etc.
     Analyte description and metadata: biochemical class, mass spectra, etc.
  5. Data Quality Assessment
     • Principal Component Analysis (PCA) of all analytes, showing QC sample scores
     • Drift in >400 replicated measurements across >100 analytical batches for a single analyte (plot axes: acquisition batch vs. abundance)
     • QCs embedded among >5,500 samples (1:10) collected over 1.5 yrs
     • If the biological effect size is less than the analytical variance, a real effect will incorrectly appear insignificant
  6. Data Quality Assessment
     • Analyte-specific data quality overview (plot: log mean vs. %RSD, from low precision to high precision; samples and QCs shown)
     • Sample-specific normalization can be used to estimate and remove analytical variance (raw data vs. normalized data)
     • Normalizations need to be numerically and visually validated
  7. Statistical and Multivariate Analyses: what analytes are different between the two groups of samples (Group 1 vs. Group 2)?
     • Statistics (t-test): statistically significant differences lacking rank and context
     • Multivariate (O-PLS-DA): ranked differences lacking significance and context
     • Statistics + Multivariate + Context = Network Mapping: ranked, statistically significant differences within a biochemical context
  8. Statistical and Multivariate Analyses: what analytes are different between the two groups of samples (Group 1 vs. Group 2)?
     • Statistics: t-test
     • Multivariate: O-PLS-DA
     • Statistics + Multivariate + Context = Network Mapping
     • To see the big picture, it is necessary to view the data from multiple different angles
  9. DeviumWeb (https://github.com/dgrapov/DeviumWeb): visualization, statistics, clustering, PCA, O-PLS
  10. DeviumWeb (https://github.com/dgrapov/DeviumWeb): visualization, statistics, clustering, PCA, O-PLS
  11. Functional Analysis: identify changes or enrichment (decrease or increase) in biochemical domains. Nucl. Acids Res. (2008) 36 (suppl 2): W423-W426. doi: 10.1093/nar/gkn282
  12. Functional Analysis: opportunity for ‘Omic integration. Use domain knowledge databases to integrate genomic, proteomic and metabolomic data. Current approaches can be limited to pathway-level analyses.
  13. Networks (BMC Bioinformatics 2012, 13:99. doi: 10.1186/1471-2105-13-99)
     • Biochemical: reaction, domain
     • Structural: molecular fingerprints, mass spectra
     • Empirical: correlation, partial correlation
  14. Mapped Network: displaying metabolic differences in control vs. malignant lung tissue. Biochemical Relationships: http://www.genome.jp/dbget-bin/www_bget?rn:R00975
  15. Structural Similarity: http://pubchem.ncbi.nlm.nih.gov//score_matrix/score_matrix.cgi
  16. Empirical Networks: use experiment-specific or data-driven relationships to gain novel insight into biochemical relationships (figure labels: urea cycle, nucleotide synthesis, protein glycosylation)
  17. Mass Spectral Networks: use mass spectra as a proxy for structure to help make sense of unknown compounds’ biochemical identities (Watrous J et al. PNAS 2012;109:E1743-E1752). Example: unknown compounds are likely phytosterol esters.
  18. Mass Spectral Networks: use mass spectra and empirical relationships to narrow down the biochemical roles for unknown compounds. Rigorous chemical experiments identified the unknown compounds as partial derivatization products of glucose.
  19. MetaMapR (https://github.com/dgrapov/MetaMapR)
  20. Analysis at the Metabolomic Scale and Beyond: pathway-independent metabolomic (known and unknown), proteomic and genomic data integration (diagram labels: pyruvate, lactate, enzyme, gene A, gene B)
  21. Software and Resources
     • DeviumWeb: dynamic multivariate data analysis and visualization platform. url: https://github.com/dgrapov/DeviumWeb
     • imDEV: Microsoft Excel add-in for multivariate analysis. url: http://sourceforge.net/projects/imdev/
     • MetaMapR: network analysis tools for metabolomics. url: https://github.com/dgrapov/MetaMapR
     • TeachingDemos: tutorials and demonstrations. url: http://sourceforge.net/projects/teachingdemos/?source=directory and https://github.com/dgrapov/TeachingDemos
     • Data analysis case studies and examples. url: http://imdevsoftware.wordpress.com/
  22. dgrapov@ucdavis.edu, metabolomics.ucdavis.edu. This research was supported in part by NIH 1 U24 DK097154.
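
The data quality slides in the transcript report analyte precision as %RSD and state that normalizations need to be validated numerically. A minimal Python sketch of that check follows, under stated assumptions: the abundance values are hypothetical, and the batch-wise QC-median scaling is one simple scheme picked for illustration, since the slides do not say which normalization was used.

```python
from statistics import mean, median, stdev


def percent_rsd(values):
    """Relative standard deviation: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def batch_median_normalize(abundances, batches, is_qc):
    """Scale each batch so its QC median matches the global QC median.

    Illustrative scheme only (an assumption, not the authors' method).
    Every batch must contain at least one QC sample.
    """
    global_median = median(a for a, q in zip(abundances, is_qc) if q)
    factors = {}
    for batch in set(batches):
        batch_qcs = [a for a, b, q in zip(abundances, batches, is_qc)
                     if q and b == batch]
        factors[batch] = global_median / median(batch_qcs)
    return [a * factors[b] for a, b in zip(abundances, batches)]


# Hypothetical data: one analyte measured in three batches with drift.
abundances = [100, 104, 98, 120, 126, 118, 80, 84, 79]
batches = [1, 1, 1, 2, 2, 2, 3, 3, 3]
is_qc = [True, False, True, True, False, True, True, False, True]

normalized = batch_median_normalize(abundances, batches, is_qc)
raw_qcs = [a for a, q in zip(abundances, is_qc) if q]
norm_qcs = [a for a, q in zip(normalized, is_qc) if q]

print(f"QC %RSD, raw data:        {percent_rsd(raw_qcs):.1f}")
print(f"QC %RSD, normalized data: {percent_rsd(norm_qcs):.1f}")
```

Because the scaling factors come from the same QCs that are used to score the result, the QC %RSD is bound to improve in this sketch. A stricter validation would hold out some QCs or use replicated biological samples, and would pair the numbers with a plot, as the slide recommends.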
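
The Networks slide in the transcript lists correlation and partial correlation as the basis for empirical networks. The sketch below, in plain Python with hypothetical analytes A, B and C, shows why the two can give different edges: B and C both track A, so they correlate with each other, but the first-order partial correlation removes that indirect link. The thresholds and helper names are assumptions for illustration and are not taken from MetaMapR.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def correlation_edges(data, threshold):
    """Keep analyte pairs whose absolute correlation meets the threshold."""
    edges = {}
    for x, y in combinations(sorted(data), 2):
        r = pearson(data[x], data[y])
        if abs(r) >= threshold:
            edges[(x, y)] = r
    return edges


def partial_correlation_edges(data, threshold):
    """Keep pairs whose weakest first-order partial correlation, taken over
    each single conditioning analyte, meets the threshold.
    Needs at least three analytes; exact for three, a heuristic for more.
    """
    names = sorted(data)
    r = {frozenset(pair): pearson(data[pair[0]], data[pair[1]])
         for pair in combinations(names, 2)}
    edges = {}
    for x, y in combinations(names, 2):
        partials = [
            partial_corr(r[frozenset((x, y))],
                         r[frozenset((x, z))],
                         r[frozenset((y, z))])
            for z in names if z not in (x, y)
        ]
        weakest = min(partials, key=abs)
        if abs(weakest) >= threshold:
            edges[(x, y)] = weakest
    return edges


# Hypothetical abundances: B and C each equal A plus independent noise.
data = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [2, 1, 2, 5, 6, 5, 6, 9],
    "C": [2, 3, 2, 3, 4, 5, 8, 9],
}

print("correlation edges:        ", sorted(correlation_edges(data, 0.8)))
print("partial correlation edges:", sorted(partial_correlation_edges(data, 0.5)))
```

With more than a handful of analytes, full partial correlations are usually obtained from the inverse of the correlation matrix, often with regularization; the pairwise formula above is only meant to show the idea.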