Datamining 7th Kmeans


  1. 1. 4.1 [Figure: tables of numeric values (0.60 to 2.62) indexed 1 to 4 and 10, 20, 30, 40; layout not recoverable]
  2. 2. • • • • • •
  3. 3. (2) • • • • 4
  4. 4. (3) • 2 • • • • • • ( • • • 5
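  5. 5. Distance between two points x and y, where $dx_1$ and $dx_2$ are the coordinate differences along the $x_1$ and $x_2$ axes: Euclidean distance $\sqrt{dx_1^2 + dx_2^2}$ and Manhattan distance $|dx_1| + |dx_2|$. [Figure: panels (A) and (B), axes x1 and x2]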
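The two distance measures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides; the function names `euclidean` and `manhattan` are chosen here for readability.

```python
import math


def euclidean(x, y):
    """Straight-line distance: sqrt(dx1^2 + dx2^2 + ...)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """City-block distance: |dx1| + |dx2| + ..."""
    return sum(abs(a - b) for a, b in zip(x, y))


x, y = (0.0, 0.0), (3.0, 4.0)
print(euclidean(x, y))  # 5.0
print(manhattan(x, y))  # 7.0
```

For the same pair of points the Manhattan distance is never smaller than the Euclidean one, which is why the choice of measure can change which points count as neighbours.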
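  6. 6. Correlation coefficient: $r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$. [Figure: three scatter plots of y against x, with r ≈ 1, r ≈ 0, and r ≈ −1]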
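As a rough sketch of how the correlation coefficient r is computed from two equal-length sequences, assuming plain Python lists (the function name `pearson_r` is an illustrative choice, not taken from the slides):

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the square-rooted sums of squared deviations.
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("r is undefined when one sequence is constant")
    return cov / (sx * sy)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # close to 1
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # close to -1
```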
  7. 7. Top-down clustering (divisive clustering) and bottom-up clustering (agglomerative clustering). [Figure: (A) points C, B, A, F, G, E, D; (B) the same points in the order A B C D E F G]
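  8. 8. k- • k • d • A set S is divided into k subsets $S_1, S_2, \ldots, S_k$ such that $S = S_1 \cup S_2 \cup \cdots \cup S_k$ and $S_i \cap S_j = \emptyset$ ($i \neq j$). • [Figure: (A) and (B), two different divisions of the points A to G into 3 subsets]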
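The two conditions on the subsets (they cover S, and no two of them overlap) can be checked directly. The sketch below assumes the points are hashable labels; the helper name `is_partition` and the example grouping are hypothetical and are not read off the slide's figure.

```python
from itertools import combinations


def is_partition(S, parts):
    """True if the subsets cover S exactly and are pairwise disjoint."""
    parts = [set(p) for p in parts]
    covers = set().union(*parts) == set(S)
    disjoint = all(a.isdisjoint(b) for a, b in combinations(parts, 2))
    return covers and disjoint


S = set("ABCDEFG")
print(is_partition(S, [{"A", "B", "C"}, {"D", "E"}, {"F", "G"}]))       # True
print(is_partition(S, [{"A", "B", "C"}, {"C", "D", "E"}, {"F", "G"}]))  # False: C appears twice
```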
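  9. 9. k- (2) • (A) Center and spread of a subset: $c_i = \frac{1}{|S_i|} \sum_{x \in S_i} x$, $v(S_i) = \frac{1}{|S_i|} \sum_{x \in S_i} (d(x, c_i))^n$. • (B) Diameter of a subset and the resulting measure q(V): $\mathrm{diameter}(S_i) = \max \{ d(x_1, x_2) \mid x_1, x_2 \in S_i \}$, $q(V) = \max \{ \mathrm{diameter}(S_i) \mid i = 1, \ldots, k \}$.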
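A small sketch of the quantities on this slide, assuming points are coordinate tuples and d is the Euclidean distance (`math.dist`, Python 3.8+). The exponent in v defaults to 2 here, which is an assumption; the function names are illustrative.

```python
import math
from itertools import combinations


def centroid(S):
    """c_i: coordinate-wise mean of the points in S."""
    return tuple(sum(coords) / len(S) for coords in zip(*S))


def spread(S, n=2):
    """v(S_i): mean of d(x, c_i)^n over the points of S (n=2 is an assumption)."""
    c = centroid(S)
    return sum(math.dist(x, c) ** n for x in S) / len(S)


def diameter(S):
    """Largest distance between any two points of S (0.0 for a single point)."""
    return max((math.dist(a, b) for a, b in combinations(S, 2)), default=0.0)


def q(subsets):
    """q(V): the largest diameter among the subsets."""
    return max(diameter(S) for S in subsets)


S1 = [(0.0, 0.0), (2.0, 0.0)]
S2 = [(5.0, 5.0), (5.0, 8.0), (9.0, 5.0)]
print(centroid(S1))  # (1.0, 0.0)
print(spread(S1))    # 1.0
print(diameter(S2))  # 5.0
print(q([S1, S2]))   # 5.0
```

Criterion (A) rewards subsets that are tight around their center, while criterion (B) only looks at the single worst pair of points, so the two can prefer different divisions of the same data.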
  10. 10. k- (3) • 2 • n, d O(n^(O(dn))) • • k-Means • • NP- • 2 12
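  11. 11. (Hard) K-means (K- ) • k-means • • K • 1. Initialize the K means. 2. Assign each point to the nearest mean. 3. Update each mean from the points assigned to it. 4. Repeat steps 2 and 3.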
  12. 12. 7 2 . 2-means 2 (0) m(1) 7 2 m(2) m(1), m(2) 2 (1) k m 14
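  13. 13. (2) Each point x is assigned to the cluster whose mean is nearest: $\hat{k} = \arg\min_k \{ d(m^{(k)}, x) \}$, where d(x, y) is the distance. For example, if $d(m^{(1)}, x) > d(m^{(2)}, x)$, then x is assigned to $m^{(2)}$. [Figure: points marked + and □ around the means m(1) and m(2)]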
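  14. 14. (3) Each mean is moved to the mean of the points assigned to it: $m^{(k)} = \frac{\sum_n r_k^{(n)} x^{(n)}}{R^{(k)}}$, where $r_k^{(n)}$ is 1 if point $x^{(n)}$ is assigned to cluster k and 0 otherwise, and $R^{(k)} = \sum_n r_k^{(n)}$ is the number of points in cluster k. [Figure: updated means m(1) and m(2)]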
  15. 15. (4) Repeat steps (2) and (3). [Figure: the means m(1) and m(2) and the points marked + and □ after reassignment]
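The assignment and update steps walked through on the preceding slides can be sketched as follows. This is a minimal illustration under stated assumptions, not the slides' own code: points are coordinate tuples, d is the Euclidean distance, the initial means are K randomly sampled data points, the loop stops when the assignments no longer change, and an empty cluster keeps its previous mean. The name `kmeans` and its parameters are chosen here.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Hard K-means on a list of coordinate tuples; returns (means, labels)."""
    rng = random.Random(seed)
    # Initialization (assumption): use k distinct data points as starting means.
    means = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the nearest mean.
        new_labels = [
            min(range(k), key=lambda j: math.dist(means[j], x)) for x in points
        ]
        if new_labels == labels:  # assignments unchanged, so stop
            break
        labels = new_labels
        # Update step: each mean becomes the mean of its assigned points.
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old mean
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return means, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (11, 11)]
means, labels = kmeans(points, k=2)
print(labels)  # the first three points share one label, the last four the other
print(means)
```

On data with less clearly separated groups, the result depends on the initial means, so running the sketch with several seeds and comparing the outcomes is a reasonable precaution.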
  16. 16. K-means • 5000 • • [Table: counts per Cluster # (rows) against the classes 0 to 9 (columns); the cell layout is not recoverable]
  17. 17. K-means • • • k • [Figure 2… reproduced from p. 288, Chapter 20 ("An Example Inf…"): panels (a) and (b), both axes running from 0 to 10. Copyright Cambridge University Press 2003. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.]
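  18. 18. (1) • 112 4 • (1) Let V be the set of points and C the set of chosen points; start with C = {}. (2) Choose one point $c_1 \in V$ and add $c_1$ to C. (3) For j = 2, . . . , k, repeat (a) and (b): (a) for each $x \in V - C$, let neighbor(x) be the point of C nearest to x; (b) choose $c_j$ such that $d(c_j, \mathrm{neighbor}(c_j)) = \max \{ d(x, \mathrm{neighbor}(x)) \mid x \in V - C \}$ and add $c_j$ to C. [Figure: (A) the 7 points A to G; (B) G chosen as $c_j$]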
  19. 19. Steps (1) to (3) of the same procedure, continued on the example. [Figure: panels (C), (D), (E) on the points A to G, with the distances 4.38 and 4.10 marked] k- NP
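Steps (1) to (3) of this procedure can be sketched as below. The sketch assumes points are coordinate tuples, d is the Euclidean distance, and the first point of V is taken as $c_1$ (the slide only asks for some point of V). The function name `farthest_first` and the example coordinates are hypothetical and are not the points A to G of the figure.

```python
import math


def farthest_first(V, k, d=math.dist):
    """Choose k points of V by repeatedly taking the point farthest from C."""
    C = [V[0]]  # step (2): assumption, the first point serves as c_1
    for _ in range(2, k + 1):  # step (3): j = 2, ..., k
        rest = [x for x in V if x not in C]
        if not rest:
            break
        # (a) min(...) is the distance from x to neighbor(x), its nearest point in C;
        # (b) c_j is the x in V - C for which that distance is largest.
        c_j = max(rest, key=lambda x: min(d(x, c) for c in C))
        C.append(c_j)
    return C


V = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 8)]
print(farthest_first(V, 3))  # [(0, 0), (10, 1), (5, 8)]
```

Each remaining point can then be grouped with its nearest chosen point, which gives one division of V into k subsets.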
  20. 20. (2) [Figure: panels (E) and (F) on the points A to G] D 2
  21. 21. (Self Organizing Map) • • • • • 23
