Gene expression profiling identifies molecular subtypes of gliomas
Ruty Shai, Tao Shi, Thomas J Kremen, Steve Horvath, Linda M Liau, Timothy F Cloughesy, Paul S Mischel* and Stanley F Nelson
Presented by Stephanie Tsung
Evaluate the distances from the new object to all other objects, then go to Step 2.
R: h1 <- hclust(dist(x), method = "average")
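To make the R call above concrete, here is a minimal sketch on toy data. The matrix `x`, its values, and the sample names are hypothetical (they are not from the glioma data set), and cutting the tree into two groups with `cutree` is an illustrative choice rather than a step taken from the paper.

```r
# Hypothetical toy data: 6 samples (rows) x 4 genes (columns),
# built so that samples 1-3 and samples 4-6 form two obvious groups.
x <- rbind(
  c(1.0, 1.2, 0.9, 1.1),
  c(0.8, 1.0, 1.1, 0.9),
  c(1.1, 0.9, 1.0, 1.2),
  c(5.0, 5.2, 4.9, 5.1),
  c(4.8, 5.1, 5.0, 5.3),
  c(5.2, 4.9, 5.1, 5.0)
)
rownames(x) <- paste0("sample", 1:6)

d  <- dist(x)                         # pairwise Euclidean distances between samples
h1 <- hclust(d, method = "average")   # average-linkage hierarchical clustering

# Cut the dendrogram into k = 2 clusters (k is an assumption for this toy case).
groups <- cutree(h1, k = 2)
print(groups)
```

Calling plot(h1) on the result draws the dendrogram; with real expression data, x would hold one row per tissue sample.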
Hierarchical Clustering
Figure 1(b). The same 42 tissue samples were grouped into hierarchical clusters. Tissue samples are color-coded.
I & II: P = 0.00006 (Fisher's exact test)
III & IV: P = 0.00001
Fisher's Exact Test
Tests whether the proportion of interest differs between two groups (H0: it does not).

Sample    w/ charact.    w/o charact.    Total
1         A              B               A+B
2         C              D               C+D
Total     A+C            B+D             N

The two-tailed probability: .326 + .007 + .093 + .163 + .019 = .608
Ex. 55 8 7
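The two-tailed probability above is a sum over possible tables. As a minimal sketch, the code below computes it on a hypothetical 2 x 2 table (the counts 3, 1, 1, 3 are made up for illustration and are not the example from the slide): every table with the same margins gets a hypergeometric probability, and those no more likely than the observed table are added up. The result is then compared with R's built-in `fisher.test`.

```r
# Hypothetical 2 x 2 table (counts are illustrative only):
#            w/ charact.  w/o charact.
# Sample 1        3            1
# Sample 2        1            3
tab <- matrix(c(3, 1,
                1, 3), nrow = 2, byrow = TRUE)

a  <- tab[1, 1]        # observed count in cell A
r1 <- sum(tab[1, ])    # A + B
r2 <- sum(tab[2, ])    # C + D
c1 <- sum(tab[, 1])    # A + C

# All values cell A can take with the margins held fixed.
support <- max(0, c1 - r2):min(r1, c1)

# Hypergeometric probability of each possible table.
probs <- dhyper(support, r1, r2, c1)
p_obs <- dhyper(a, r1, r2, c1)

# Two-tailed p-value: sum the probabilities of all tables that are
# no more likely than the observed one (small tolerance for ties).
p_two <- sum(probs[probs <= p_obs * (1 + 1e-7)])
print(p_two)                    # 34/70, about 0.486

# Cross-check against the built-in test.
print(fisher.test(tab)$p.value)
```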
Q. Can we uncover these subtypes without prior knowledge? That is, how many categories of gliomas are suggested by the gene expression data?
Leave-one-out cross-validation error rates were calculated.
For a given method and sample size n, a classifier is generated using (n - 1) cases and tested on the single remaining case. This is repeated n times, each time designing a classifier with one case left out. Thus, each case in the sample is used as a test case, and each time nearly all the cases are used to design a classifier.
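The leave-one-out procedure described above can be sketched in a few lines of R. The toy data, the class labels "A" and "B", and the nearest-centroid classifier are all assumptions made for illustration; the slides do not say which classifier was used, and any classifier could be dropped into the loop.

```r
# Hypothetical toy data: 6 samples x 4 genes, two classes of 3 samples each.
x <- rbind(
  c(1.0, 1.2, 0.9, 1.1),
  c(0.8, 1.0, 1.1, 0.9),
  c(1.1, 0.9, 1.0, 1.2),
  c(5.0, 5.2, 4.9, 5.1),
  c(4.8, 5.1, 5.0, 5.3),
  c(5.2, 4.9, 5.1, 5.0)
)
y <- factor(c("A", "A", "A", "B", "B", "B"))

n    <- nrow(x)
pred <- character(n)

for (i in seq_len(n)) {
  # Design the classifier on the n - 1 remaining cases.
  train_x <- x[-i, , drop = FALSE]
  train_y <- y[-i]

  # Nearest-centroid classifier (an illustrative stand-in):
  # one mean expression profile per class, stored as genes x classes.
  centroids <- sapply(levels(train_y), function(cl) {
    colMeans(train_x[train_y == cl, , drop = FALSE])
  })

  # Test on the single held-out case: pick the closest class centroid.
  d2      <- colSums((centroids - x[i, ])^2)
  pred[i] <- names(which.min(d2))
}

# Leave-one-out error rate: fraction of held-out cases misclassified.
error_rate <- mean(pred != as.character(y))
print(error_rate)
```

Because each of the n classifiers never sees its own test case, the error rate is not biased by testing on training data.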