Datamining 2nd decisiontree

  1. (1/3)
  2. (2/3) (Classification, Pattern Recognition)
  3. (3/3) (Clustering), (Association Rules)
  5. K, L
  6. [Figure: decision trees T and T′ with Yes/No branches, panels (A) and (B)]
  7. For a tree $T$ and a node $v \in T$, $\mathrm{cost}(v)$ is the cost of $v$. The cost of $T$ is $\sum \{\mathrm{cost}(x) \mid x \in T \text{ is leaf}\}$. [Figure: decision tree over the tests X1, X2, X3, X4]
  9. 5.1: EXACT COVER BY 3-SET and NP. 5.2 (EXACT COVER BY 3-SET): given a set $X$ whose size is a multiple of 3 and a family $S = \{T_1, T_2, \ldots\}$ of 3-element subsets of $X$, decide whether some $S_1 \subset S$ satisfies (1) $\bigcup \{T \mid T \in S_1\} = X$ and (2) $T_i \cap T_j = \emptyset$ for all $T_i, T_j \in S_1$ with $i \neq j$. EXACT COVER BY 3-SET is NP-complete. Figure 5.4: an example of EXACT COVER BY 3-SET with $X = \{1, 2, \ldots, 9\}$, for which $\{\{1, 4, 7\}, \{2, 3, 5\}, \{6, 8, 9\}\}$ is an exact cover.
  10. Proof, by reduction from EXACT COVER BY 3-SET: given $X$, add a set $Y = \{y_1, y_2, \ldots, y_{|X|}\}$ of $|X|$ new elements. Each $t \in X \cup Y$ is a record whose class is $t[A] = 1$ if $t \in X$ and $t[A] = 0$ if $t \in Y$. The tests are the 3-element subsets $T_1, T_2, \ldots$ of $X$ and the elements $y_i$.
  11. $t[T_i] = \begin{cases} 1 & t \in T_i \\ 0 & t \notin T_i \end{cases}$, $\quad t[y_i] = \begin{cases} 1 & t = y_i \\ 0 & t \neq y_i \end{cases}$. [Figures 5.5(A) and 5.5(B): example with $|X| = 9$ and tests $T_i$]
  12. A tree that tests the $T_i$ of an exact cover one after another uses $\frac{|X|}{3}$ tests and has cost $1 + 2 + \cdots + \frac{|X|}{3} + \frac{|X|}{3}$. [Figure 5.6 (A), (B): trees over $T_1, T_2, T_3$ and over the $y_i$]
  13. A tree that tests only the $y_i$ needs $|Y| = |X|$ tests and has cost $1 + 2 + \cdots + |X| + |X|$. A tree of cost $1 + 2 + \cdots + \frac{|X|}{3} + \frac{|X|}{3}$ corresponds to a solution of EXACT COVER BY 3-SET, so finding a minimum-cost decision tree is NP-hard.
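  15. [Figure: decision trees T and T′ with Yes/No branches, panels (A) and (B)]
  16. $S = \{(x_1, c_1), (x_2, c_2), \ldots, (x_N, c_N)\}$. $H(C) = -p_{\circ} \log_2 p_{\circ} - p_{\times} \log_2 p_{\times}$, where $p_{\circ}$ and $p_{\times}$ are the proportions of the two classes in $S$.
  17. $p_{\circ} = \frac{4}{10}$, $p_{\times} = \frac{6}{10}$. $H(C) = -p_{\circ} \log_2 p_{\circ} - p_{\times} \log_2 p_{\times} = -\frac{4}{10} \log_2 \frac{4}{10} - \frac{6}{10} \log_2 \frac{6}{10} = 0.971$
  18. Contingency table of the test (30) against the class C. Class ○: YES 2, NO 2, total 4. Class ×: YES 2, NO 4, total 6. Column totals: YES 4, NO 6, total 10.
  19. T1 (30). Class ○: YES 2, NO 2, total 4. Class ×: YES 2, NO 4, total 6. Column totals: YES 4, NO 6, total 10. $H(C \mid T_1 = \mathrm{Yes}) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{2}{4} \log_2 \frac{2}{4} = 1.0$. $H(C \mid T_1 = \mathrm{No}) = -\frac{2}{6} \log_2 \frac{2}{6} - \frac{4}{6} \log_2 \frac{4}{6} = 0.918$. $H(C \mid T_1) = \frac{4}{10} H(C \mid T_1 = \mathrm{Yes}) + \frac{6}{10} H(C \mid T_1 = \mathrm{No}) = 0.951$
  20. The information gain of a test $T$ is $I(T) = H(C) - H(C \mid T)$. $I(T_1) = H(C) - H(C \mid T_1) = 0.971 - 0.951 = 0.020$. $I(T_2) = 0.420$, $I(T_3) = 0.091$, $I(T_4) = 0.420$. $T_2$ has the largest gain (tied with $T_4$), so $T_2$ is chosen.
  21. T2: Yes, No
  22. Yes branch: $H(C) = -\frac{4}{6} \log_2 \frac{4}{6} - \frac{2}{6} \log_2 \frac{2}{6} = 0.918$. For T1: $H(C \mid T_1) = \frac{4}{6} \left( -\frac{2}{4} \log_2 \frac{2}{4} - \frac{2}{4} \log_2 \frac{2}{4} \right) + \frac{2}{6} \left( -\frac{2}{2} \log_2 \frac{2}{2} \right) = 0.667$. $I(T_1) = 0.918 - 0.667 = 0.251$. $I(T_3) = 0$, $I(T_4) = 0.918$, so $T_4$ has the largest gain.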
  24. naive bayes
  25. ID3, CART (Classification And Regression Tree), C4.5
  26. CART, C4.5, Forest
  27. (10/21) sesejun+dm10@sel.is.ocha.ac.jp, 11/2, http://togodb.sel.is.ocha.ac.jp/, 22 2010
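
The two conditions in the EXACT COVER BY 3-SET definition on slide 9 (the chosen subsets cover X, and they are pairwise disjoint) can be checked mechanically. The following is a minimal sketch only: the function names `is_exact_cover` and `find_exact_cover` are hypothetical, the subset (1, 2, 3) is an extra distractor that does not appear in the slides, and the brute-force search is exponential, as expected for an NP-complete problem.

```python
from itertools import combinations


def is_exact_cover(universe, chosen):
    """Return True if the chosen 3-sets are pairwise disjoint and their union is the universe."""
    seen = set()
    for triple in chosen:
        members = set(triple)
        # Each subset must have exactly 3 elements and share none with earlier subsets.
        if len(members) != 3 or seen & members:
            return False
        seen |= members
    return seen == set(universe)


def find_exact_cover(universe, family):
    """Brute-force search over sub-families of size |X| / 3 (exponential time)."""
    if len(universe) % 3 != 0:
        return None
    needed = len(universe) // 3
    for candidate in combinations(family, needed):
        if is_exact_cover(universe, candidate):
            return list(candidate)
    return None


# X = {1, ..., 9} as in the slide example; (1, 2, 3) is an added distractor.
X = range(1, 10)
S = [(1, 4, 7), (1, 2, 3), (2, 3, 5), (6, 8, 9)]
cover = find_exact_cover(X, S)
print(cover)
```

For this input the search skips every combination that contains both (1, 4, 7) and (1, 2, 3), or both (1, 2, 3) and (2, 3, 5), because those pairs overlap.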

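
The entropy and information-gain figures on slides 17, 19, 20 and 22 can be reproduced from the class counts alone. This is a small sketch under stated assumptions: the helper names (`entropy`, `conditional_entropy`, `information_gain`) are hypothetical, and each branch of a test is represented as a tuple of per-class counts read off the slide tables.

```python
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a class distribution given as counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def conditional_entropy(branches):
    """H(C | T): entropy of each branch of test T, weighted by branch size."""
    n = sum(sum(branch) for branch in branches)
    return sum(sum(branch) / n * entropy(branch) for branch in branches)


def information_gain(branches):
    """I(T) = H(C) - H(C | T), with H(C) computed from the per-class totals."""
    class_totals = [sum(column) for column in zip(*branches)]
    return entropy(class_totals) - conditional_entropy(branches)


# Slide 19: T1 = Yes holds 2 and 2 samples of the two classes, T1 = No holds 2 and 4.
t1_at_root = [(2, 2), (2, 4)]
h_c = entropy([4, 6])
h_c_given_t1 = conditional_entropy(t1_at_root)
gain_t1_root = information_gain(t1_at_root)

# Slide 22: inside the Yes branch (6 samples), T1 splits into (2, 2) and a pure branch (2, 0).
t1_in_yes_branch = [(2, 2), (2, 0)]
gain_t1_branch = information_gain(t1_in_yes_branch)

print(f"H(C) = {h_c:.3f}, H(C|T1) = {h_c_given_t1:.3f}, I(T1) = {gain_t1_root:.3f}")
print(f"I(T1) in the Yes branch = {gain_t1_branch:.4f}")
```

The last value is about 0.2516. The slide's 0.251 comes from subtracting the already rounded values 0.918 and 0.667.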