Clustering

Published in: Technology

  1. Hierarchical Clustering
     Kelly Chan | Dec 6 2013

     Workflow (from the slide diagram):
       x, y -> plot(x,y)
       x, y -> data.frame(x,y) -> dist() -> hclust() -> plot()
       x, y -> data.frame(x,y) -> as.matrix() -> heatmap()

     Distances: dist()
       (1) Continuous -> Euclidean distance: $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
       (2) Binary -> Manhattan distance: $|x_1 - x_2| + |y_1 - y_2|$
       (3) Continuous -> Correlation similarity

     # simulate 20 points and plot them, labelled by index
     # (the 12 means from rep(..., each=4) are recycled across the 20 draws)
     set.seed(124); par(mar=c(0,0,0,0))
     x <- rnorm(20,mean=rep(1:3,each=4),sd=0.5)
     y <- rnorm(20,mean=rep(c(1,2,1),each=4),sd=0.5)
     plot(x,y,col="blue",pch=19,cex=2)
     text(x+0.05,y+0.05,labels=as.character(1:20))

     # hierarchical clustering: pairwise distances -> dendrogram
     dataFrame <- data.frame(x=x,y=y)
     distxy <- dist(dataFrame)
     hClustering <- hclust(distxy)
     plot(hClustering)

     # heatmap of the same data with the rows shuffled
     dataFrame <- data.frame(x=x,y=y)
     set.seed(1344)
     dataMatrix <- as.matrix(dataFrame)[sample(1:20),]
     heatmap(dataMatrix)
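  2. K-Means Clustering

     Workflow (from the slide diagram):
       x, y -> plot(x,y)
       x, y -> data.frame(x,y) -> kmeans() -> kmeansObj$cluster, kmeansObj$centers -> plot()
       x, y -> data.frame(x,y) -> as.matrix() -> image()

     kmeansObj: the object returned by kmeans()
       kmeansObj$cluster       cluster assignment for each of the N points
       kmeansObj$centers       coordinates of the K cluster centers
       kmeansObj$size          number of points in each of the K clusters
       kmeansObj$totss         total sum of squares
       kmeansObj$withinss      within-cluster sum of squares, one value per cluster
       kmeansObj$tot.withinss  sum of withinss over all clusters
       kmeansObj$betweenss     between-cluster sum of squares

     # k-means (x and y as simulated on slide 1)
     dataFrame <- data.frame(x,y)
     kmeansObj <- kmeans(dataFrame,centers=3)
     names(kmeansObj)
     kmeansObj$cluster

     # plot k-means: points colored by cluster, centers drawn as crosses
     par(mar=rep(0.2,4))
     plot(x,y,col=kmeansObj$cluster,pch=19,cex=2)
     points(kmeansObj$centers,col=1:3,pch=3,cex=3,lwd=3)

     # heatmap: shuffled rows (left) vs. rows ordered by cluster (right)
     set.seed(1234)
     dataMatrix <- as.matrix(dataFrame)[sample(1:20),]   # all 20 rows, as on slide 1
     kmeansObj2 <- kmeans(dataMatrix,centers=3)
     par(mfrow=c(1,2),mar=rep(0.2,4))
     image(t(dataMatrix)[,nrow(dataMatrix):1],yaxt="n")
     image(t(dataMatrix)[,order(kmeansObj2$cluster)],yaxt="n")   # order by the clustering of the shuffled matrix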
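The three distance choices listed on the hierarchical clustering slide can be checked by hand on a tiny example. The sketch below is illustrative only: the points `pts` and the rows of `obs` are made-up values, and `1 - cor()` is one common way to turn correlation similarity into a dissimilarity, assumed here rather than taken from the slides.

```r
# Two hypothetical points, chosen so the distances are easy to verify by hand
pts <- rbind(p1 = c(x = 1, y = 2), p2 = c(x = 4, y = 6))

# Euclidean: sqrt((1 - 4)^2 + (2 - 6)^2)
d_euclid <- as.numeric(dist(pts, method = "euclidean"))

# Manhattan: |1 - 4| + |2 - 6|
d_manhat <- as.numeric(dist(pts, method = "manhattan"))

# Correlation-based dissimilarity between rows (assumed convention: 1 - Pearson r)
obs <- rbind(a = c(1, 2, 3, 4),
             b = c(2, 4, 6, 8),   # perfectly correlated with a
             c = c(4, 3, 2, 1))   # perfectly anti-correlated with a
d_cor <- as.dist(1 - cor(t(obs)))

print(d_euclid)
print(d_manhat)
print(d_cor)
```

A `dist` object such as `d_cor` can be passed to `hclust()` in the same way as `distxy` on the slide.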
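The components of the object returned by `kmeans()` on the k-means slide are related to one another, which a short check can make concrete. This is a minimal sketch under stated assumptions: it regenerates the slides' data so that it runs on its own, and the `set.seed(1)` and `nstart = 10` settings are additions for reproducibility, not part of the original slides.

```r
# Regenerate the slides' simulated data so this block is self-contained
set.seed(124)
x <- rnorm(20, mean = rep(1:3, each = 4), sd = 0.5)
y <- rnorm(20, mean = rep(c(1, 2, 1), each = 4), sd = 0.5)
dataFrame <- data.frame(x, y)

# Assumed settings: a fixed seed and several random starts for a stable result
set.seed(1)
kmeansObj <- kmeans(dataFrame, centers = 3, nstart = 10)

# Total sum of squares splits into within-cluster and between-cluster parts
ss_ok <- isTRUE(all.equal(kmeansObj$totss,
                          kmeansObj$tot.withinss + kmeansObj$betweenss))

# tot.withinss is the sum of the per-cluster withinss values
within_ok <- isTRUE(all.equal(sum(kmeansObj$withinss), kmeansObj$tot.withinss))

# size counts the points per cluster, so it sums to the number of rows
size_ok <- sum(kmeansObj$size) == nrow(dataFrame)

print(c(ss_ok = ss_ok, within_ok = within_ok, size_ok = size_ok))
print(table(kmeansObj$cluster))
```

Because `kmeans()` starts from randomly chosen centers, the cluster labels (1, 2, 3) can differ between runs unless a seed is set beforehand.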
