Clustering Microarray Data

  1. Clustering Microarray Data. Heather Turner, Department of Statistics, University of Warwick, UK
  2. Overview of Microarray Experiment
     Array of p genes (×n) → scanned image (×n) → n × p matrix
  3. Example: Serum Stimulation of Human Fibroblasts (Eisen, Spellman, Brown & Botstein, PNAS, 1998)
     - 9,800 spots representing 8,600 genes
     - 12 samples taken over a 24-hour period
     - Highlighted clusters can be roughly categorised as genes involved in:
       A. cholesterol biosynthesis
       B. the cell cycle
       C. the immediate–early response
       D. signaling and angiogenesis
       E. wound healing and tissue remodelling
  4. Why the need for specialised techniques?
     - Application
       - Dimensions of the data are nonstandard (small n, large p)
     - Structure
       - Both gene and sample clusters may be of interest
       - Co-expression may be restricted to a subset of the attributes
       - Genes/samples may belong to more than one group
       - Many “uninteresting” genes
     - Nature
       - Clusters of interest may not be characterised by a similar expression profile
       - Samples may be taken over time
  5. One-way Clustering Techniques: increased structural flexibility
     - Overlapping non-exhaustive clusters
     - Context-specific clusters
     - Gene shaving: Hastie et al, Genome Biol., 2000
     - Clustering On Subsets of Attributes (COSA): Friedman and Meulman, JRSS B, 2004
  6. Two-way Clustering Techniques: use conventional one-way methods iteratively
     - Sample clusters within gene clusters
     - Clusters within two-way clusters
     - Inter-related two-way clustering: Tang et al, BIBE 01
     - Coupled Two-Way Clustering (CTWC): Getz et al, PNAS, 2003
     - EMMIX-GENE: McLachlan et al, Bioinformatics, 2002
  7. Co-clustering Techniques: simultaneously cluster both genes and samples
     - Two-way partition
     - Conjugate clusters
     - Spectral bi-clustering: Kluger, Genome Res., 2003
     - Double Conjugated Clustering (DCC): Busygin et al, SIAM ICDM 02
     - Co-clustering: Cho, SIAM ICDM 04
  8. Biclustering Techniques: retrieve isolated two-way clusters (biclusters)
     - Clusters based on latent model
     - Biclusters
     - Rich probabilistic models: Segal et al, Bioinformatics, 2001
     - SAMBA: Tanay et al, Bioinformatics, 2002
     - Plaid models: Lazzeroni and Owen, Statist. Sinica, 2002
  9. Current Situation
     - Many novel methods, few used in practice
       - Molecular biologists often have limited (access to) statistical expertise
       - Limited number of methods in publicly available software
       - Little work on performance evaluation
     - Development of methods continues
       - Improved algorithms
       - Time series
       - Three-way data
       - Integration of other sources of data
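
The two-way idea on slide 6 (apply a conventional one-way method iteratively to obtain sample clusters within gene clusters) can be sketched roughly as follows. This is only an illustration, not the procedure of any method cited on that slide: plain k-means is swapped in as the one-way method, the data are laid out as the n × p matrix of slide 2 (rows are samples, columns are genes), and the names `kmeans`, `two_way_clusters` and `toy` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Matrix, k: int, n_iter: int = 20) -> List[int]:
    """Plain k-means with a deterministic farthest-first initialisation."""
    centres = [list(points[0])]
    while len(centres) < k:
        # Next centre: the point farthest from its nearest existing centre.
        far = max(points, key=lambda pt: min(sq_dist(pt, c) for c in centres))
        centres.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda j: sq_dist(pt, centres[j]))
                  for pt in points]
        # Update step: each centre becomes the mean of its members.
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_way_clusters(
    x: Matrix, k_genes: int, k_samples: int
) -> Tuple[List[int], Dict[int, List[int]]]:
    """Cluster genes, then cluster samples separately within each gene cluster."""
    gene_profiles = [list(col) for col in zip(*x)]  # p profiles of length n
    gene_labels = kmeans(gene_profiles, k_genes)
    sample_labels: Dict[int, List[int]] = {}
    for g in range(k_genes):
        cols = [j for j, lab in enumerate(gene_labels) if lab == g]
        if not cols:
            continue  # empty gene cluster: nothing to cluster samples on
        # Restrict every sample to the genes of this cluster only.
        sub = [[row[j] for j in cols] for row in x]
        sample_labels[g] = kmeans(sub, k_samples)
    return gene_labels, sample_labels


# Toy 4-sample x 4-gene matrix (made-up values, for illustration only).
toy = [
    [1.0, 1.1, 5.0, 5.1],
    [1.0, 0.9, 9.0, 9.1],
    [3.0, 3.1, 5.0, 4.9],
    [3.0, 2.9, 9.0, 8.9],
]
gene_labels, sample_labels = two_way_clusters(toy, k_genes=2, k_samples=2)
print(gene_labels)
print(sample_labels)
```

On this toy matrix the first two genes split the samples into {0, 1} and {2, 3}, while the last two genes split them into {0, 2} and {1, 3}. A single clustering of the samples over all genes would not show both groupings, which is the motivation slide 4 gives for co-expression restricted to a subset of the attributes.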
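
For the latent-model approach on slide 8, the plaid model of Lazzeroni and Owen can be written, as far as the general form goes, as a sum of K possibly overlapping layers on top of a background level. The indexing below is an assumption made for illustration (i runs over genes and j over samples, so the array is the transpose of the n × p layout on slide 2), and the explicit error term is added here for readability.

```latex
% Plaid model: a background layer plus K layers (biclusters).
% rho_{ik} = 1 if gene i belongs to layer k, kappa_{jk} = 1 if sample j does.
\[
  Y_{ij} \;=\; \mu_0 \;+\; \sum_{k=1}^{K}
      \bigl(\mu_k + \alpha_{ik} + \beta_{jk}\bigr)\,\rho_{ik}\,\kappa_{jk}
      \;+\; \varepsilon_{ij},
  \qquad \rho_{ik},\ \kappa_{jk} \in \{0, 1\}.
\]
```

Because the membership indicators are not constrained to sum to one over k, a gene or sample can sit in several layers or in none, which matches the slide 4 requirement that genes/samples may belong to more than one group.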
