
Datamining 2nd decisiontree


  1. (1/3)
  2. (2/3) Classification (pattern recognition)
  3. (3/3) Clustering; association rules
  5. K L
  6. [Figure: two decision trees, T and T’, each with Yes/No branches; panels (A) and (B)]
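  7. For a tree T, every node v ∈ T has a cost cost(v). The cost of T is the sum of the leaf costs: $\mathrm{cost}(T) = \sum \{\mathrm{cost}(x) \mid x \in T \text{ is a leaf}\}$
     [Figure: example tree with tests X1, X2, X3, X4]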
  9. 5.1 EXACT COVER BY 3-SET
     Given a set X whose size is a multiple of 3 and a collection S = {T1, T2, ...} of 3-element subsets of X, decide whether some S1 ⊂ S satisfies:
     (1) ∪{T | T ∈ S1} = X
     (2) Ti ∩ Tj = φ for all Ti, Tj ∈ S1 with i ≠ j
     Example: for X = {1, 2, ..., 9}, S1 = {{1, 4, 7}, {2, 3, 5}, {6, 8, 9}} is an exact cover of X.
     EXACT COVER BY 3-SET is NP-complete.
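Conditions (1) and (2) can be checked mechanically. The following is a minimal Python sketch, not taken from the slides; the function name `is_exact_cover` is a hypothetical helper, and it only verifies a proposed S1 rather than searching for one.

```python
from itertools import combinations

def is_exact_cover(universe, subsets):
    """Check conditions (1) and (2) for a candidate S1 (hypothetical helper)."""
    subsets = [set(s) for s in subsets]
    # (1) the union of the chosen subsets equals X
    covers_all = set().union(*subsets) == set(universe)
    # (2) the chosen subsets are pairwise disjoint
    disjoint = all(a.isdisjoint(b) for a, b in combinations(subsets, 2))
    return covers_all and disjoint

X = range(1, 10)
S1 = [{1, 4, 7}, {2, 3, 5}, {6, 8, 9}]   # the example from the slide
print(is_exact_cover(X, S1))                                  # True
print(is_exact_cover(X, [{1, 4, 7}, {2, 3, 5}, {5, 8, 9}]))  # False: 6 is missing, 5 appears twice
```

Verifying a candidate is cheap; the hard part of the problem is finding S1 among all sub-collections of S.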
  10. EXACT COVER BY 3-SET: NP
      Records: the elements of X together with |X| new elements Y = {y1, y2, ..., y|X|}.
      Class attribute A: $t[A] = 1$ if $t \in X$, and $t[A] = 0$ if $t \in Y$.
      Test attributes: one for each 3-set T1, T2, ... and one for each yi.
  11. $t[T_i] = 1$ if $t \in T_i$, and $t[T_i] = 0$ if $t \notin T_i$
      $t[y_i] = 1$ if $t = y_i$, and $t[y_i] = 0$ if $t \neq y_i$
      [Figures 5.5(A) and 5.5(B)]
  12. A tree that tests the 3-sets of an exact cover one after another (T1, T2, T3, ...) has cost $1 + 2 + \cdots + \frac{|X|}{3} + \frac{|X|}{3}$. [Figure 5.6(A)]
  13. A tree that uses the tests yi, with |Y| = |X|, has cost $1 + 2 + \cdots + |X| + |X|$. [Figure 5.6(B)]
      EXACT COVER BY 3-SET has a solution when a tree of cost $1 + 2 + \cdots + \frac{|X|}{3} + \frac{|X|}{3}$ exists, so finding a minimum-cost decision tree is NP-hard.
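The cost formula can be checked on a small case. The sketch below is an illustration rather than the slides' own code: it assumes that cost(x) of a leaf is its depth (the number of tests on the path from the root), which is a reading consistent with the sums above but not stated in the surviving text. The names `leaf_depth_sum` and `chain_tree` are hypothetical.

```python
def leaf_depth_sum(tree, depth=0):
    """Sum of leaf depths. Assumption: cost(x) of a leaf is its depth."""
    if not isinstance(tree, tuple):      # anything that is not a pair is a leaf
        return depth
    yes_subtree, no_subtree = tree
    return (leaf_depth_sum(yes_subtree, depth + 1)
            + leaf_depth_sum(no_subtree, depth + 1))

def chain_tree(num_tests):
    """Tree whose tests form a chain: each Yes branch is a leaf."""
    tree = "leaf"
    for _ in range(num_tests):
        tree = ("leaf", tree)
    return tree

size_x = 9                               # |X| in the slide's example
k = size_x // 3                          # number of 3-sets in an exact cover
formula = sum(range(1, k + 1)) + k       # 1 + 2 + ... + |X|/3 + |X|/3
print(leaf_depth_sum(chain_tree(k)), formula)   # 9 9
```

Under the same assumption, a chain of |X| tests on the yi gives `leaf_depth_sum(chain_tree(9))`, which is 1 + 2 + ... + 9 + 9 = 54.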
  15. [Figure: two decision trees, T and T’, each with Yes/No branches; panels (A) and (B)]
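  16. Data: $S = \{(x_1, c_1), (x_2, c_2), \ldots, (x_N, c_N)\}$
      Entropy: $H(C) = -p_{\circ} \log_2 p_{\circ} - p_{\times} \log_2 p_{\times}$, where $p_{\circ}$ and $p_{\times}$ are the fractions of ○ and × examples in S.
  17. $p_{\circ} = \frac{4}{10}$, $p_{\times} = \frac{6}{10}$
      $H(C) = -p_{\circ} \log_2 p_{\circ} - p_{\times} \log_2 p_{\times} = -\frac{4}{10} \log_2 \frac{4}{10} - \frac{6}{10} \log_2 \frac{6}{10} = 0.971$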
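The entropy value on slide 17 can be reproduced with a few lines of Python. This is a minimal sketch; the function name `entropy` and the count-based interface are choices made here, not something the slides prescribe.

```python
import math

def entropy(counts):
    """Shannon entropy in bits of a class distribution given as counts."""
    total = sum(counts)
    # p * log2(1/p) equals -p * log2(p); empty classes contribute nothing
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)

# 4 examples of class ○ and 6 of class × out of 10
print(f"{entropy([4, 6]):.3f}")   # 0.971
```

A pure node such as `entropy([6, 0])` gives 0, and an even split such as `entropy([5, 5])` gives 1, the maximum for two classes.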
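  18. 30          YES   NO   Total
      C: ○          2    2       4
         ×          2    4       6
      Total         4    6      10
  19. T1: 30      YES   NO   Total
      C: ○          2    2       4
         ×          2    4       6
      Total         4    6      10
      $H(C \mid T_1 = \mathrm{Yes}) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{2}{4} \log_2 \frac{2}{4} = 1.0$
      $H(C \mid T_1 = \mathrm{No}) = -\frac{2}{6} \log_2 \frac{2}{6} - \frac{4}{6} \log_2 \frac{4}{6} = 0.918$
      $H(C \mid T_1) = \frac{4}{10} H(C \mid T_1 = \mathrm{Yes}) + \frac{6}{10} H(C \mid T_1 = \mathrm{No}) = 0.951$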
  20. Information gain of a test T: $I(T) = H(C) - H(C \mid T)$
      $I(T_1) = H(C) - H(C \mid T_1) = 0.971 - 0.951 = 0.020$
      $I(T_2) = 0.420$, $I(T_3) = 0.091$, $I(T_4) = 0.420$
      T2 has the largest gain and is chosen as the first test (T4 ties with T2).
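The gains for T1 and T2 can be recomputed from per-branch class counts. The sketch below is illustrative: the helper names are hypothetical, the T1 counts come from the contingency table, and the T2 counts (Yes: 4 ○ and 2 ×; No: 0 ○ and 4 ×) are inferred from the entropy of the Yes branch on slide 22 rather than read from a table. Counts for T3 and T4 are not recoverable here, so they are left out.

```python
import math

def entropy(counts):
    """Shannon entropy in bits of a class distribution given as counts."""
    total = sum(counts)
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)

def conditional_entropy(branches):
    """H(C | T): branch entropies weighted by the fraction of examples per branch."""
    n = sum(sum(b) for b in branches)
    return sum(sum(b) / n * entropy(b) for b in branches)

def information_gain(branches):
    """I(T) = H(C) - H(C | T); each branch is a tuple (count of ○, count of ×)."""
    class_totals = [sum(col) for col in zip(*branches)]
    return entropy(class_totals) - conditional_entropy(branches)

t1 = [(2, 2), (2, 4)]   # T1 = Yes, T1 = No (from the table)
t2 = [(4, 2), (0, 4)]   # T2 = Yes, T2 = No (inferred, see above)
print(f"{information_gain(t1):.3f}")   # 0.020
print(f"{information_gain(t2):.3f}")   # 0.420
```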
  21. T2: Yes / No
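  22. Yes branch:
      $H(C) = -\frac{4}{6} \log_2 \frac{4}{6} - \frac{2}{6} \log_2 \frac{2}{6} = 0.918$
      For T1:
      $H(C \mid T_1) = \frac{4}{6} \left( -\frac{2}{4} \log_2 \frac{2}{4} - \frac{2}{4} \log_2 \frac{2}{4} \right) + \frac{2}{6} \left( -\frac{2}{2} \log_2 \frac{2}{2} \right) = 0.667$
      $I(T_1) = 0.918 - 0.667 = 0.251$
      $I(T_3) = 0$, $I(T_4) = 0.918$
      T4 is chosen.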
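Choosing the test for the Yes branch repeats the same computation on the 6 remaining examples. The sketch below assumes per-branch counts of (○, ×): the T1 counts follow from the formula above, while the T4 counts are a hypothetical perfect split, which is what a gain equal to H(C) implies. T3 is omitted because its counts are not recoverable, and `best_test` is a hypothetical helper name.

```python
import math

def entropy(counts):
    """Shannon entropy in bits of a class distribution given as counts."""
    total = sum(counts)
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)

def gain(branches):
    """Information gain of a test, from its per-branch class counts."""
    totals = [sum(col) for col in zip(*branches)]
    n = sum(totals)
    remainder = sum(sum(b) / n * entropy(b) for b in branches)
    return entropy(totals) - remainder

def best_test(candidates):
    """Name of the candidate test with the largest gain."""
    return max(candidates, key=lambda name: gain(candidates[name]))

yes_branch = {
    "T1": [(2, 2), (2, 0)],   # from the H(C | T1) formula on the slide
    "T4": [(4, 0), (0, 2)],   # assumed perfect split (gain equals H(C))
}
print(best_test(yes_branch))   # T4
```

Computed without intermediate rounding, the gain of T1 is about 0.2516; the slide's 0.251 comes from subtracting the rounded values 0.918 and 0.667.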
  24. naive bayes
  25. ID3; CART (Classification And Regression Tree); C4.5
  26. CART; C4.5; Forest
  27. (10/21) sesejun+dm10@sel.is.ocha.ac.jp; 11/2( ); http://togodb.sel.is.ocha.ac.jp/ 22 2010
