Multivariate and network tools for biological data analysis

See video:
http://imdevsoftware.wordpress.com/2014/06/27/multivariate-data-analysis-and-visualization-through-network-mapping/

Presentation Transcript

  • Multivariate Analysis and Visualization Tools for Metabolomic Data. Dmitry Grapov and Oliver Fiehn, University of California, Davis
  • State-of-the-art facility producing massive amounts of biological data: >20-30K samples/yr, >200 studies
  • Data Analysis and Visualization (data matrix of samples by variables)
    • Quality Assessment: use replicated measurements and/or internal standards to estimate analytical variance
    • Statistical and Multivariate: use the experimental design to test hypotheses and/or identify trends in analytes
    • Functional: use statistical and multivariate results to identify impacted biochemical domains
    • Network (Network Mapping): integrate statistical and multivariate results with the experimental design and analyte metadata
    • Sample metadata: experimental design (organism, sex, age, etc.)
    • Variable metadata: analyte description and metadata (biochemical class, mass spectra, etc.)
  • Data Quality Assessment: Principal Component Analysis (PCA) of all analytes, showing QC sample scores. Drift in >400 replicated measurements across >100 analytical batches for a single analyte (plot of abundance vs. acquisition batch). QCs embedded among >5,5000 samples (1:10) collected over 1.5 yrs. If the biological effect size is less than the analytical variance, then the experiment will incorrectly yield insignificant results.
  • Data Quality Assessment: analyte-specific data quality overview (plot of %RSD vs. log mean for samples and QCs, ranging from high precision to low precision). Sample-specific normalization can be used to estimate and remove analytical variance (raw data vs. normalized data). Normalizations need to be numerically and visually validated.
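The %RSD (relative standard deviation) used in this quality overview can be computed per analyte from the replicated QC measurements. The following is a minimal sketch using only the Python standard library; the analyte names and abundance values are made up for illustration and are not taken from the presentation.

```python
import statistics

def percent_rsd(values):
    """Relative standard deviation: 100 * sample SD / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; %RSD is undefined")
    return 100.0 * statistics.stdev(values) / mean

# Hypothetical QC replicate abundances (illustrative numbers only).
qc_replicates = {
    "analyte_A": [100.0, 102.0, 98.0, 101.0, 99.0],   # tight replicates
    "analyte_B": [100.0, 140.0, 60.0, 120.0, 80.0],   # noisy replicates
}

rsd_by_analyte = {name: percent_rsd(vals) for name, vals in qc_replicates.items()}

# List analytes from highest precision (lowest %RSD) to lowest precision.
for name, rsd in sorted(rsd_by_analyte.items(), key=lambda kv: kv[1]):
    print(f"{name}: %RSD = {rsd:.1f}")
```

Comparing %RSD before and after a normalization is one way to check numerically that the normalization reduced analytical variance.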
  • Statistical and Multivariate Analyses: What analytes are different between the two groups of samples (Group 1 vs. Group 2)?
    • Statistics (t-test): statistically significant differences lacking rank and context
    • Multivariate (O-PLS-DA): ranked differences lacking significance and context
    • Network Mapping (statistics + multivariate + context): ranked, statistically significant differences within a biochemical context
  • Statistical and Multivariate Analyses: statistics (t-test) + multivariate (O-PLS-DA) + context = Network Mapping. To see the big picture, it is necessary to view the data from multiple different angles.
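As a small illustration of the statistical side of this comparison, the sketch below computes a per-analyte Welch t statistic between two groups and ranks analytes by its magnitude. It is an assumption-laden example, not the presenters' workflow: the data are invented, p-values are omitted to stay within the standard library, and the multivariate (O-PLS-DA) step is not shown.

```python
import math
import statistics

def welch_t(group1, group2):
    """Welch's t statistic for two independent samples (unequal variances)."""
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    se = math.sqrt(v1 / len(group1) + v2 / len(group2))
    return (m1 - m2) / se

# Hypothetical abundances: analyte -> (Group 1 values, Group 2 values).
data = {
    "analyte_x": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "analyte_y": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
}

t_scores = {name: welch_t(g1, g2) for name, (g1, g2) in data.items()}

# Rank analytes by the absolute size of the test statistic.
ranked = sorted(t_scores, key=lambda name: abs(t_scores[name]), reverse=True)
print(ranked)
```

A ranking like this carries no biochemical context by itself, which is the gap that network mapping is meant to fill.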
  • DeviumWeb (https://github.com/dgrapov/DeviumWeb): visualization, statistics, clustering, PCA, O-PLS
  • Functional Analysis: identify changes (decrease or increase) or enrichment in biochemical domains. Nucl. Acids Res. (2008) 36 (suppl 2): W423-W426. doi: 10.1093/nar/gkn282
  • Functional Analysis: opportunity for ‘Omic integration. Use domain knowledge databases to integrate genomic, proteomic and metabolomic data. Current approaches can be limited to pathway-level analyses.
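One common way to test for enrichment in a biochemical domain is a hypergeometric over-representation test. The sketch below is a generic illustration under that assumption; it is not necessarily the method used by the tool cited in the slides, and the counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when drawing n analytes without replacement from N,
    of which K belong to the biochemical domain of interest."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical counts: 10 measured analytes, 4 in the domain,
# 3 analytes changed, 2 of the changed analytes are in the domain.
p_enrichment = hypergeom_upper_tail(k=2, N=10, K=4, n=3)
print(f"enrichment p-value = {p_enrichment:.4f}")
```

A small p-value would suggest that the domain contains more changed analytes than expected by chance.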
  • Networks (BMC Bioinformatics 2012, 13:99. doi:10.1186/1471-2105-13-99)
    • Biochemical: reaction, domain
    • Structural: molecular fingerprints, mass spectra
    • Empirical: correlation, partial correlation
  • Mapped Network: displaying metabolic differences in control vs. malignant lung tissue. Biochemical Relationships: http://www.genome.jp/dbget-bin/www_bget?rn:R00975
  • Structural Similarity http://pubchem.ncbi.nlm.nih.gov//score_matrix/score_matrix.cgi
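Structural similarity between molecular fingerprints is often scored with the Tanimoto coefficient, and pairs above a cutoff become network edges. The sketch below assumes that scoring scheme; the fingerprints (sets of "on" bit positions), compound names, and the 0.5 threshold are hypothetical choices for illustration.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits / total distinct on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical fingerprints: compound -> set of "on" bit positions.
fingerprints = {
    "compound_1": {1, 2, 3, 4},
    "compound_2": {1, 2, 3, 5},
    "compound_3": {7, 8, 9},
}

THRESHOLD = 0.5  # assumed similarity cutoff for drawing an edge

edges = []
for a, b in combinations(sorted(fingerprints), 2):
    score = tanimoto(fingerprints[a], fingerprints[b])
    if score >= THRESHOLD:
        edges.append((a, b, score))

print(edges)
```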
  • Empirical Networks: use experiment-specific or data-driven relationships to gain novel insight into biochemical relationships (labeled regions: urea cycle, nucleotide synthesis, protein glycosylation)
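Empirical networks can be built from correlations or partial correlations between analytes. The sketch below contrasts the two with a first-order partial correlation. The three series are constructed for illustration so that x and y are related only through a shared driver z; they are not real measurements.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Constructed example: x and y each equal z plus independent noise.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [2.0, 0.0, 4.0, 5.0, 3.0, 7.0]
y = [2.0, 2.0, 2.0, 3.0, 5.0, 7.0]

r_xy = pearson(x, y)
r_xy_given_z = partial_corr(x, y, z)
print(f"correlation = {r_xy:.3f}, partial correlation given z = {r_xy_given_z:.3f}")
```

In this constructed case the plain correlation would suggest an x-y edge, while the partial correlation removes it, which is why partial correlations are useful for separating direct from indirect relationships.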
  • Mass Spectral Networks: use mass spectra as a proxy for structure to help make sense of unknown compounds’ biochemical identities (Watrous J et al. PNAS 2012;109:E1743-E1752). Unknown compounds are likely phytosterol esters.
  • Mass Spectral Networks: use mass spectra and empirical relationships to narrow down the biochemical roles for unknown compounds. Rigorous chemical experiments identified the unknown compounds as partial derivatization products of glucose.
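Mass spectral networks connect compounds whose spectra look alike. A simple way to score that is the cosine similarity between peak lists, sketched below. This is a simplification: published molecular networking methods use more elaborate scoring, and the spectra here (m/z mapped to intensity) are hypothetical.

```python
import math

def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical nominal-mass spectra for a known and an unknown compound.
known = {73: 100.0, 147: 50.0, 217: 25.0}
unknown = {73: 100.0, 147: 50.0, 204: 25.0}

spectral_similarity = cosine_similarity(known, unknown)
print(f"spectral similarity = {spectral_similarity:.3f}")
```

A high score between an unknown and an identified compound suggests a related structure, which can then be followed up experimentally.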
  • MetaMapR: https://github.com/dgrapov/MetaMapR
  • Analysis at the Metabolomic Scale and Beyond: pathway-independent metabolomic (known and unknown), proteomic and genomic data integration (diagram labels: pyruvate, lactate, enzyme, gene A, gene B)
  • Software and Resources
    • DeviumWeb: dynamic multivariate data analysis and visualization platform. url: https://github.com/dgrapov/DeviumWeb
    • imDEV: Microsoft Excel add-in for multivariate analysis. url: http://sourceforge.net/projects/imdev/
    • MetaMapR: network analysis tools for metabolomics. url: https://github.com/dgrapov/MetaMapR
    • TeachingDemos: tutorials and demonstrations. url: http://sourceforge.net/projects/teachingdemos/?source=directory and url: https://github.com/dgrapov/TeachingDemos
    • Data analysis case studies and examples. url: http://imdevsoftware.wordpress.com/
  • Contact: dgrapov@ucdavis.edu, metabolomics.ucdavis.edu. This research was supported in part by NIH 1 U24 DK097154.