2 cluster analysis

21,683 views

Published on

0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
21,683
On SlideShare
0
From Embeds
0
Number of Embeds
19,068
Actions
Shares
0
Downloads
46
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

2 cluster analysis

  1. 1. Biology Chemistry Informatics Evaluation of metabolomic sample processing methods using hierarchical cluster analysis Cluster Analysis Goal: Use hierarchical cluster analysis (HCA) to evaluate data variance structure Topics: 1. Evaluate sample and variable similarities 2. Identify the effect of data transformation, distance and linkage methods on data similarities
  2. 2. Clustering data Biology Chemistry Informatics Cluster Analysis Goal: Use HCA to cluster samples (Use DATA: Pumpkin data 1.csv) Visualize: 1. Sample (row) raw similarities as a heat map 2. Annotate heatmap with extraction and treatment type 3. Select cluster distance and linkage method to cluster the samples 4. Determine the effect of data transformations on the cluster structure (view as a dendrogram) Exercises: 1. What factor, extraction or treatment, has the greatest contribution to the data variance structure? 2. Describe the effect of clustering raw data or sample correlations
  3. 3. Biology Chemistry Raw data matrix visualized as a heatmap samples variables Cluster Analysis Informatics
  4. 4. Raw data matrix organized by HCA Biology Chemistry Informatics Cluster Analysis •ACN:/IPA/water|fresh and MeOH/CH3Cl/water|dried display distinct patterns in metabolites which are most similar to each other •Sample similarities are linked to metabolite magnitudes
  5. 5. Biology Chemistry Clustering based on sample correlations (spearman) Informatics Cluster Analysis •100% MeOH/fresh is the most dissimilar protocol from all others •ACN:/IPA/water and MeOH/CH3Cl/water are most similar to each other •Sample similarities are decoupled from metabolite magnitudes
  6. 6. Clustering metabolites Biology Chemistry Informatics Goal 2: Use HCA to evaluate metabolite similarities Cluster Analysis Visualize: 1.Z-scaled and correlation based variable clustering 2.Use a dendrogram to extract variable clusters 3.Select two variables from the same cluster and visualize their correlation Exercise: 1.Do the clustered variables share biological functions? 2.Which type of correlation is most robust to outliers? 3.Are the correlations for the visualized variable independent of extraction/treatment?
  7. 7. Z-scaled variable clusters Biology Chemistry Cluster Analysis Informatics
  8. 8. Correlation based variable clusters Biology Chemistry Cluster Analysis Informatics
  9. 9. Biology Chemistry Extraction of clusters of correlated variables Informatics less similar most similar cluster Cluster Analysis lowest common branch height more similar
  10. 10. Biology Chemistry Cluster Analysis Informatics Correlation among cluster members (4)

×