Graph based approaches to Gene Expression Clustering

Presentation on Graph based approaches to Gene Expression Clustering.

Usage Rights

© All Rights Reserved


Presentation Transcript

  • GENE EXPRESSION CLUSTERING: GRAPH BASED APPROACHES
    A presentation by Govind M (M120432CS)
    MTech Computer Science and Engineering
    National Institute of Technology Calicut
    govindmaheswaran@gmail.com
  • Outline
      • Clustering and Graph Theory
      • Using Graphs in Clustering
      • Simple Graph Partitioning
      • Spectral Graph Partitioning
      • Conclusion
  • Clustering
      • The process of grouping a set of data objects by similarity
      • Objects in the same cluster are similar to each other; objects in different clusters are dissimilar
      • Widely used in data mining, market analysis, etc.
      • Used to make sense of bioinformatics data
      • Two major purposes in bioinformatics
          • Find properties of genes (relationships among genes, deducing the functions of genes, etc.)
          • Predict more relevant factors (e.g. clustering cancerous and non-cancerous genes, finding the effect of a medication)
  • Graphs
      • Data structure
      • Used in multiple domains
      • Key terms
          • Edge
          • Vertex
          • Weighted graph
  • Some Graph Theory
      • Cut
      • Partitioning
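For reference, the usual definition of the cost of a cut can be written as follows. The symbols $V_1$, $V_2$ and $w_{ij}$ are notation assumed here for illustration and do not appear on the slide: $V_1$ and $V_2$ are the two sides of a partition of the vertex set $V$, and $w_{ij}$ is the weight of the edge between vertices $i$ and $j$ (zero if there is no edge).

```latex
\operatorname{cut}(V_1, V_2) = \sum_{i \in V_1} \sum_{j \in V_2} w_{ij},
\qquad V_1 \cup V_2 = V, \quad V_1 \cap V_2 = \emptyset
```

A minimum cut is a partition that makes this sum as small as possible, which for similarity weights means separating the vertices that are least similar to each other.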
  • Clustering using Graphs
    Involves 3 steps:
      1. Preprocessing
          ◦ Convert the data set into a graph
          ◦ Use the adjacency matrix and degree matrix representation
          ◦ The similarity between two nodes can be taken as the weight of the edge between them
      2. Partitioning
          ◦ Partition the graph
      3. Clustering
          ◦ Repeat until the required number of clusters is obtained
          ◦ Alternatively, extra iterations followed by merging of clusters may also be used
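The preprocessing step can be sketched roughly as below. This is only an illustration under stated assumptions: the slides do not say which similarity measure is used, so absolute Pearson correlation between expression profiles is assumed here, and the function name `build_similarity_graph`, the `threshold` parameter and the toy data are all hypothetical.

```python
import numpy as np

def build_similarity_graph(expr, threshold=0.0):
    """Turn a genes x samples expression array into a weighted graph.

    Returns the adjacency (affinity) matrix A and the degree matrix D.
    The similarity measure (absolute Pearson correlation) is an assumption.
    """
    sim = np.abs(np.corrcoef(expr))   # gene-by-gene similarity in [0, 1]
    np.fill_diagonal(sim, 0.0)        # no self-loops
    sim[sim < threshold] = 0.0        # drop edges between weakly similar genes
    A = sim
    D = np.diag(A.sum(axis=1))        # degree = sum of incident edge weights
    return A, D

# Toy data: 3 genes measured in 4 samples (values invented for illustration).
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.1],
    [4.0, 3.0, 2.0, 1.0],
])
A, D = build_similarity_graph(expr, threshold=0.5)
print(A)
print(D)
```

Each gene becomes a vertex, and the resulting matrices are the input to the partitioning step.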
  • Simple Graph Partitioning
      • Weight of an edge = similarity between the nodes it connects
      • Find the minimum cut
      • The lower the edge weight, the more likely the two nodes belong to different clusters
  • Simple Graph Partitioning : The Algorithm
    Input : Graph G<V,E>, number of clusters k
    Output: Clusters of the graph
    Repeat k-1 times
        Low_Val = infinity
        For each edge e of the graph
            Calculate Cut_Cost, the cost of a cut at that edge
            if Cut_Cost < Low_Val
                Low_Val = Cut_Cost
                Cut_Edge = e
        Cut at edge Cut_Edge
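One possible reading of this algorithm is sketched below. It is not the presenter's implementation: the slide does not define how the cost of a cut at an edge is computed, so this sketch assumes the cost is simply the edge weight, and it keeps removing the cheapest edge until the graph falls into k connected components (in a general graph a single removal need not split anything, so the loop may run more than k-1 times). The function names and the example graph are hypothetical.

```python
import numpy as np

def connected_components(A):
    """Label the connected components of an undirected weighted graph."""
    n = A.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:                       # depth-first traversal
            u = stack.pop()
            for v in range(n):
                if A[u, v] > 0 and labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels, current

def simple_graph_partition(A, k):
    """Remove the lowest-weight edge repeatedly until k components exist.

    Assumes A is symmetric with non-negative weights, and that the cost of
    cutting at an edge equals that edge's weight.
    """
    A = np.array(A, dtype=float)           # work on a copy
    labels, count = connected_components(A)
    while count < k:
        rows, cols = np.nonzero(np.triu(A, 1))   # each undirected edge once
        if rows.size == 0:
            break                          # no edges left to cut
        idx = np.argmin(A[rows, cols])     # cheapest remaining edge
        i, j = rows[idx], cols[idx]
        A[i, j] = A[j, i] = 0.0            # cut it
        labels, count = connected_components(A)
    return labels

# Two tight groups {0, 1, 2} and {3, 4} joined by one weak edge (2-3).
A = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.0, 0.0],
    [0.8, 0.7, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.95],
    [0.0, 0.0, 0.0, 0.95, 0.0],
])
labels = simple_graph_partition(A, k=2)
print(labels)
```

Under these assumptions the procedure only looks at the edges being cut, never at how similar the vertices inside each resulting cluster are.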
  • Simple Graph Partitioning (cont.)
      • Advantages
          • Simple to implement
          • Uses the concept of min cut
      • Disadvantage
          • Does not take intra-cluster similarity into account
  • Spectral Graph Partitioning
      • Widely used
      • Uses the eigenvectors of the Laplacian matrix
      • Recursive algorithm
      • Gives good-quality partitions
      • Computationally better than Simple Graph Partitioning
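  • Some graph theory…
      • Degree : $d_1 = 7,\ d_2 = 3,\ d_3 = 1,\ d_4 = 0$
      • Affinity Matrix : $A = \begin{pmatrix} 0 & 2 & 5 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$
      • Degree Matrix : $D = \begin{pmatrix} 7 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$
      • Laplacian Matrix : $L = A - D = \begin{pmatrix} -7 & 2 & 5 & 0 \\ 0 & -3 & 3 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$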
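The numbers on this slide can be reproduced with a few lines of NumPy. The variable names are chosen here for illustration. The slide's Laplacian has the sign pattern of A - D; the more common convention is D - A, which is the same matrix with every sign flipped and therefore has the same eigenvectors.

```python
import numpy as np

# Affinity matrix exactly as shown on the slide (upper-triangular form).
A = np.array([
    [0, 2, 5, 0],
    [0, 0, 3, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])

degrees = A.sum(axis=1)     # row sums: 7, 3, 1, 0
D = np.diag(degrees)        # degree matrix
L_slide = A - D             # sign convention matching the slide
L_standard = D - A          # common textbook convention

print(degrees)
print(L_slide)
```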
  • Some more Graph Theory…
      • Spectrum : eigenvectors, arranged in order of the magnitude of their eigenvalues
      • Eigenvalues of graphs
          • Calculated as the eigenvalues of the Laplacian matrix of the graph
          • Correspondingly for the eigenvectors
      • Fiedler Theorem
          • Correlation between eigenvectors and graph properties
          • Principal eigenvectors; Kth principal eigenvector
          • Principal eigenvector : centrality of vertices
      • 2nd principal eigenvector : associated with the algebraic connectivity of the graph
          • Called the Fiedler vector
          • A vector of positive and negative values
          • The partition is decided by the sign of each value
  • Spectral Graph Partitioning
    Input : Graph G<V,E>
    Output: Graphs G1<V1,E1>, G2<V2,E2>
    Create the Laplacian matrix L of the graph G
    Calculate the Fiedler vector F
    for each vertex vi in G
        if F[i] > 0
            V1.append(vi)
        else
            V2.append(vi)
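A minimal sketch of this bipartitioning step follows, under a few assumptions that are not stated on the slide: the affinity matrix is symmetric, the graph is connected, and the Laplacian is taken in the common form D - A, so that the Fiedler vector is the eigenvector of the second-smallest eigenvalue. The function names and the example graph are hypothetical.

```python
import numpy as np

def fiedler_vector(A):
    """Eigenvector of the second-smallest eigenvalue of L = D - A."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvecs[:, 1]

def spectral_bipartition(A):
    """Split the vertices by the sign of their Fiedler vector entry."""
    f = fiedler_vector(A)
    v1 = [i for i, x in enumerate(f) if x > 0]
    v2 = [i for i, x in enumerate(f) if x <= 0]
    return v1, v2

# Two tight groups {0, 1, 2} and {3, 4} joined by one weak edge (2-3).
A = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.0, 0.0],
    [0.8, 0.7, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.95],
    [0.0, 0.0, 0.0, 0.95, 0.0],
])
v1, v2 = spectral_bipartition(A)
print(v1, v2)
```

The overall sign of an eigenvector is arbitrary, so which group ends up as V1 and which as V2 can differ between runs or libraries; only the grouping itself is meaningful.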
  • Spectral Graph Partitioning : Example
      • First bipartition: 2nd principal vector = <0.415, 0.309, 0.069, −0.221, 0.221, −0.794>
      • Second bipartition (of the subgraph on vertices 1, 2, 3, 5): 2nd principal vector = <0.415, 0.309, −0.190, 0.169>
  • Spectral Graph Partitioning : Bipartitioning Method (contd.)
      • Recursive algorithm
      • Although better than Simple Graph Partitioning, it is not optimal
      • Requires bipartitioning multiple times
      • Can be improved by multipartitioning
      • Multipartitioning uses more eigenvectors
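The recursive use of bipartitioning to reach k clusters might look roughly like this. The slides do not say which cluster to split next, so this sketch assumes the largest one is always chosen; it also assumes a symmetric affinity matrix, connected subgraphs and the convention L = D - A. All names and the example graph are hypothetical.

```python
import numpy as np

def bipartition(A, nodes):
    """Split the given nodes by the sign of the Fiedler vector of their subgraph."""
    sub = A[np.ix_(nodes, nodes)]
    L = np.diag(sub.sum(axis=1)) - sub
    _, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    f = vecs[:, 1]
    left = [n for n, x in zip(nodes, f) if x > 0]
    right = [n for n, x in zip(nodes, f) if x <= 0]
    return left, right

def recursive_spectral_clustering(A, k):
    """Bipartition repeatedly until k clusters exist (largest cluster first)."""
    A = np.asarray(A, dtype=float)
    clusters = [list(range(A.shape[0]))]
    while len(clusters) < k:
        clusters.sort(key=len)
        largest = clusters.pop()           # assumption: split the biggest one
        if len(largest) < 2:
            clusters.append(largest)
            break                          # nothing left to split
        left, right = bipartition(A, largest)
        if not left or not right:
            clusters.append(largest)
            break                          # split failed; stop early
        clusters.extend([left, right])
    return sorted(sorted(c) for c in clusters)

# Three strongly linked pairs chained together by weak edges.
A = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (1, 2, 0.1), (2, 3, 1.0), (3, 4, 0.05), (4, 5, 1.0)]:
    A[i, j] = A[j, i] = w

clusters = recursive_spectral_clustering(A, k=3)
print(clusters)
```

Each level of the recursion needs a fresh eigendecomposition of a subgraph, which is the repeated cost that multipartitioning with several eigenvectors aims to avoid.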
  • Conclusion
      • Clustering is based on simple concepts of graph theory
      • Optimal results (spectral methods)
      • Can give better performance than traditional clustering
      • Preprocessing overhead