MeV: Joe White

Published in: Business, Technology

  • MeV: Joe White

    1. 1. Analysis of Multiple Experiments: TIGR Multiple Experiment Viewer (MeV). Joseph White, DFCI, January 24, 2008
    2. 2. MeV <ul><li>Stand-alone Java application for analysis </li></ul><ul><li>New version: 4.1 </li></ul><ul><li>Not database centric; uses TDMS files </li></ul><ul><li>Writes TDMS files </li></ul><ul><li>Primarily for normalized data </li></ul><ul><li>MeV does not currently write MAGE-TAB </li></ul><ul><li>Download MeV from: tm4.org </li></ul>
    3. 3. Outline <ul><li>Description of MeV </li></ul><ul><li>How MeV treats expression </li></ul><ul><li>Some essential concepts </li></ul><ul><li>Demo: basic operations in MeV </li></ul><ul><ul><li>New file loader </li></ul></ul><ul><ul><li>ANOVA example </li></ul></ul><ul><li>Demo of MeV new features </li></ul><ul><ul><li>Affymetrix file reader </li></ul></ul><ul><ul><li>Non-parametric tests </li></ul></ul><ul><ul><li>CGH </li></ul></ul><ul><li>GCOD </li></ul>
    4. 4. The Expression Matrix is a representation of data from multiple microarray experiments. Each element is a log ratio, usually log2(Cy5 / Cy3). Red indicates a positive log ratio, i.e., Cy5 > Cy3. Green indicates a negative log ratio, i.e., Cy5 < Cy3. Black indicates a log ratio of zero, i.e., Cy5 and Cy3 are very close in value. Gray indicates missing data. [Figure: heat map with rows Gene 1 to Gene 6 and columns Exp 1 to Exp 6.]
    5. 5. Expression Vectors <ul><li>Gene expression vectors encapsulate the expression of a gene over a set of experimental conditions or sample types. </li></ul>Example vector of log2(Cy5/Cy3) values: (-0.8, 0.8, 1.5, 1.8, 0.5, -1.3, -0.4, 1.5)
    6. 6. Expression Vectors As Points in ‘Expression Space’ [Figure: a table of log ratios for genes G1 to G5 across Exp 1 to Exp 3, with each gene plotted as a point on the axes Experiment 1, Experiment 2, and Experiment 3; genes with similar expression lie close together.]
    7. 7. Distance and Similarity <ul><li>The ability to calculate a distance (or similarity, its inverse) between two expression vectors is fundamental to clustering algorithms </li></ul><ul><li>Distance between vectors is the basis upon which decisions are made when grouping similar patterns of expression </li></ul><ul><li>Selection of a distance metric defines the concept of distance </li></ul>
    8. 8. Distance: a measure of similarity between genes. <ul><li>Some distances: (MeV provides 11 metrics) </li></ul><ul><li>1. Euclidean: $\sqrt{\sum_{i=1}^{6} (x_{iA} - x_{iB})^2}$ </li></ul><ul><li>2. Manhattan: $\sum_{i=1}^{6} |x_{iA} - x_{iB}|$ </li></ul><ul><li>3. Pearson correlation </li></ul>Gene A = $(x_{1A}, \dots, x_{6A})$ and Gene B = $(x_{1B}, \dots, x_{6B})$ are measured across Exp 1 to Exp 6. [Figure: two points, p0 and p1.]
    9. 9. Distance is Defined by a Metric [Figure: two pairs of expression profiles compared under two distance metrics. Euclidean: D = 4.2 and D = 1.4. Pearson (r × -1): D = -1.00 and D = -0.90.]
    10. 10. Normal distribution: X = μ (mean of the distribution); σ = standard deviation of the distribution
    11. 11. Current MeV Algorithms <ul><li>Hierarchical Clustering </li></ul><ul><li>K-Means Clustering </li></ul><ul><li>Support Trees for HCL </li></ul><ul><li>EASE (annotation clustering) </li></ul><ul><li>Self-organizing maps </li></ul><ul><li>K-Nearest Neighbors </li></ul><ul><li>Support Vector Machines </li></ul><ul><li>Relevance Networks </li></ul><ul><li>Template Matching </li></ul><ul><li>PCA </li></ul><ul><li>CGH </li></ul><ul><li>Bayesian Networks </li></ul><ul><li>T-test </li></ul><ul><li>ANOVA </li></ul><ul><ul><li>One and two factor </li></ul></ul><ul><li>SAM </li></ul><ul><li>Non-parametric tests </li></ul><ul><ul><li>Wilcoxon </li></ul></ul><ul><ul><li>Fisher Exact Test </li></ul></ul><ul><ul><li>Mack-Skillings </li></ul></ul><ul><ul><li>Kruskal-Wallis </li></ul></ul><ul><li>BRIDGE </li></ul>
    12. 12. Demos <ul><li>File loaders </li></ul><ul><li>HTA data: ANOVA </li></ul><ul><li>Affymetrix data: SAM </li></ul><ul><li>Non-Parametric tests </li></ul><ul><li>CGH </li></ul>
    13. 13. GeneChip Oncology Database
    14. 14. GeneChip Oncology Database
    15. 15. GCOD statistics <ul><li>Studies: 52 </li></ul><ul><li>Hybridizations: 4591 </li></ul><ul><li>Analysis result sets: 12,637 </li></ul><ul><li>Signal values: 204,296,195 </li></ul><ul><li>Samples: 3644 </li></ul><ul><li>Probesets: 160,817 </li></ul><ul><ul><li>e.g., HG-U133A: 22,293 </li></ul></ul><ul><ul><li>HG_U133_Plus_2: 54,684 </li></ul></ul><ul><li>Array designs: 9 </li></ul><ul><li>Accessions: 54,414 </li></ul>
    16. 16. MeV Team <ul><li>Eleanor Howe </li></ul><ul><li>Sarita Nair </li></ul><ul><li>Raktim Sinha </li></ul><ul><li>[email_address] </li></ul>
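
The log-ratio convention on slide 4 and the distance metrics on slides 7 to 9 can be illustrated with a short Python sketch. This is not MeV code (MeV is a Java application). The function names, the `tolerance` cutoff for "very close to zero", and the three example gene vectors are assumptions made only for illustration. The Pearson distance is taken as r × -1, following the label on slide 9.

```python
import math


def log_ratio(cy5, cy3):
    """Return log2(Cy5 / Cy3), or None when either intensity is missing."""
    if cy5 is None or cy3 is None:
        return None
    return math.log2(cy5 / cy3)


def display_color(value, tolerance=0.05):
    """Map a log ratio to the slide-4 color convention.

    `tolerance` is an assumed cutoff for "very close to zero"; a real heat
    map would use a continuous gradient rather than three fixed colors.
    """
    if value is None:
        return "gray"
    if abs(value) <= tolerance:
        return "black"
    return "red" if value > 0 else "green"


def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def pearson_distance(a, b):
    """Distance as r * -1 (slide 9), so perfectly correlated genes score -1."""
    return -pearson_r(a, b)


# Hypothetical expression vectors (log ratios over six experiments).
gene_a = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
gene_b = [x / 4 for x in gene_a]             # same shape, smaller amplitude
gene_c = [-1.5, -1.5, 0.5, 0.5, 2.5, 2.5]    # similar magnitude, different shape

for name, other in (("B", gene_b), ("C", gene_c)):
    print(
        f"A vs {name}:",
        f"Euclidean={euclidean(gene_a, other):.2f}",
        f"Manhattan={manhattan(gene_a, other):.2f}",
        f"Pearson(r*-1)={pearson_distance(gene_a, other):.2f}",
    )
```

With these example vectors, Gene C is the nearer neighbor of Gene A under the Euclidean metric, while Gene B is the nearer neighbor under the Pearson metric. That is the point of slide 9: the choice of metric defines which expression patterns count as similar.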
