# Data Mining, Lecture 2: Decision Trees



1. Overview of data-mining tasks (1/3).
2. Overview (2/3): classification (pattern recognition) — from labeled training examples, learn a rule that predicts the class of a new example A.
3. Overview (3/3): clustering — group similar items (e.g., A and B) without class labels; association rules — discover rules of the form "A → B".
4. (Figure: an example decision tree.)
5. (Figure: an example with attributes K and L.)
6. Which tree is better? (Figure: two decision trees T and T′, each branching on Yes/No tests; panels (A) and (B).)
7. Cost of a decision tree: assign each node v ∈ T a cost cost(v); the cost of T is defined from the set of leaf costs {cost(x) | x ∈ T, x is a leaf}, e.g., as their sum. (Figure: an example table over attributes X1–X4 and the resulting leaf costs.)
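As a minimal sketch (not from the slides), one common instantiation of this cost is the sum of leaf depths. The tuple-based tree encoding below is an assumption for illustration:

```python
def tree_cost(tree, depth=0):
    """Sum of leaf depths of a tree given as either a leaf label (str)
    or a tuple (attribute, yes_subtree, no_subtree)."""
    if not isinstance(tree, tuple):      # a leaf contributes its depth
        return depth
    _attr, yes, no = tree
    return tree_cost(yes, depth + 1) + tree_cost(no, depth + 1)

# Example: root tests X1; its Yes branch tests X2 over leaves A and B,
# its No branch is the single leaf C.
T = ("X1", ("X2", "A", "B"), "C")
```

Here `tree_cost(T)` adds depth 2 for each of A and B and depth 1 for C.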
8. (Figure.)
9. Minimizing decision-tree cost is NP-hard; the proof is by reduction from the NP-complete problem EXACT COVER BY 3-SETS. Instance: a set X and a collection S = {T1, T2, ...} of 3-element subsets of X. Question: is there a subcollection S1 ⊆ S such that (1) ∪{T | T ∈ S1} = X and (2) Ti ∩ Tj = ∅ for all i ≠ j? Example: for X = {1, 2, ..., 9}, the subcollection {{1, 4, 7}, {2, 3, 5}, {6, 8, 9}} is an exact cover. (Figure 5.4: the elements 1–9 in a 3 × 3 grid.)
10. The reduction builds a table of objects X ∪ Y, where Y = {y1, y2, ..., y|X|} consists of |X| new objects; objects t ∈ X get class label 1 and objects t ∈ Y get class label 0. The attributes are the 3-sets T1, T2, ... and the objects y1, y2, ..., y|X|, with t[Ti] = 1 if t ∈ Ti and t[Ti] = 0 otherwise.
11. Likewise t[yi] = 1 if t = yi and t[yi] = 0 otherwise. A test on Ti separates the three objects of Ti (all of class 1) from the rest; a test on yi separates the single object yi (class 0) from the rest. (Figures 5.5(A) and 5.5(B), shown for |X| = 9.)
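The reduction table for the worked example can be sketched as follows; the concrete encoding (attribute vectors as lists, labels as 0/1) is an assumption for illustration:

```python
# X = {1,...,9} with 3-sets T1={1,4,7}, T2={2,3,5}, T3={6,8,9},
# plus new objects Y = {y1,...,y9}, following the definitions above:
# t[Ti] = 1 iff t ∈ Ti, and t[yi] = 1 iff t = yi.
X = list(range(1, 10))
T_sets = [{1, 4, 7}, {2, 3, 5}, {6, 8, 9}]
Y = [f"y{i}" for i in range(1, len(X) + 1)]

def attributes(t):
    """Attribute vector (T1, T2, T3, y1, ..., y9) of object t."""
    t_part = [1 if t in Ti else 0 for Ti in T_sets]
    y_part = [1 if t == yi else 0 for yi in Y]
    return t_part + y_part

# Objects in X get class label 1, objects in Y get class label 0.
table = {t: (attributes(t), 1 if t in X else 0) for t in X + Y}
```

For instance, object 1 has T-attributes (1, 0, 0) and class 1, while y1 has a single 1 in its own column and class 0.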
12. Cost comparison: a tree that tests y1, y2, ..., y|X| in sequence classifies the table with cost 1 + 2 + ··· + |X| + |X|, whereas a tree whose upper part tests the 3-sets of an exact cover achieves cost 1 + 2 + ··· + |X|/3 + |X|/3. (Figure 5.6: the tree over T1, T2, T3 versus the tree over y1, ..., y|X|; note |Y| = |X|.)
13. Hence a decision tree of cost at most 1 + 2 + ··· + |X|/3 + |X|/3 exists iff the EXACT COVER BY 3-SETS instance has a solution, so the decision version of cost minimization is NP-complete, and finding a minimum-cost decision tree is NP-hard (Section 5.4.2). (Figure: trees branching on y1, y2, ..., y9.)
14. (Figure.)
15. (Figure: decision trees T and T′ with Yes/No branches; panels (A) and (B).)
16. Entropy: given a training set S = {(x1, c1), (x2, c2), ..., (xN, cN)} with two classes ○ and ×, let p○ and p× be the fractions of ○- and ×-examples. The entropy of the class distribution is H(C) = −p○ log2 p○ − p× log2 p×.
17. Example: with 4 ○-examples and 6 ×-examples, p○ = 4/10 and p× = 6/10, so H(C) = −p○ log2 p○ − p× log2 p× = −(4/10) log2(4/10) − (6/10) log2(6/10) = 0.971.
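The entropy computation above can be sketched as a small helper, generalized to any class distribution:

```python
from math import log2

def entropy(probs):
    """H(C) = -sum(p * log2 p); terms with p = 0 contribute 0."""
    return -sum(p * log2(p) for p in probs if p > 0)

h = entropy([4/10, 6/10])   # the slide's example: 4 o's and 6 x's
```

This reproduces the slide's value, h ≈ 0.971; a pure node (a single class) has entropy 0.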
18. Splitting on a test: cross-tabulate the class C against the Yes/No outcome:

| C | Yes | No | total |
|---|-----|----|-------|
| ○ | 2 | 2 | 4 |
| × | 2 | 4 | 6 |
| total | 4 | 6 | 10 |
19. For the test T1 (T1 = Yes receives 2 ○ and 2 ×; T1 = No receives 2 ○ and 4 ×):
H(C | T1 = Yes) = −(2/4) log2(2/4) − (2/4) log2(2/4) = 1.0
H(C | T1 = No) = −(2/6) log2(2/6) − (4/6) log2(4/6) = 0.918
H(C | T1) = (4/10) H(C | T1 = Yes) + (6/10) H(C | T1 = No) = 0.951
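The weighted-average step above can be sketched as follows; the dict-of-label-lists input format is an assumption for illustration:

```python
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def conditional_entropy(branches):
    """H(C|T): branch entropies weighted by branch size.
    `branches` maps each outcome of T to the class labels it receives."""
    n = sum(len(labels) for labels in branches.values())
    total = 0.0
    for labels in branches.values():
        probs = [labels.count(c) / len(labels) for c in set(labels)]
        total += len(labels) / n * entropy(probs)
    return total

# The slide's table: T1 = Yes gets 2 o, 2 x; T1 = No gets 2 o, 4 x.
h_t1 = conditional_entropy({"Yes": ["o"] * 2 + ["x"] * 2,
                            "No":  ["o"] * 2 + ["x"] * 4})
```

This gives h_t1 ≈ 0.951, matching the slide.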
20. Information gain of a test T: I(T) = H(C) − H(C | T).
I(T1) = H(C) − H(C | T1) = 0.971 − 0.951 = 0.020
I(T2) = 0.420, I(T3) = 0.091, I(T4) = 0.420
T2 and T4 tie for the maximum gain; here T2 is chosen as the root test.
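Root selection by information gain can be sketched with the values quoted on the slides (only I(T1) is derived here; the gains of T2–T4 are taken directly from the slide):

```python
# I(T) = H(C) - H(C|T), with H(C) = 0.971 and H(C|T1) = 0.951 from above.
H_C = 0.971
gains = {"T1": H_C - 0.951,       # = 0.020
         "T2": 0.420,
         "T3": 0.091,
         "T4": 0.420}
best = max(gains, key=gains.get)  # ties broken by insertion order
```

Since T2 and T4 tie at 0.420 and T2 appears first, `best` is "T2", matching the slide's choice.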
21. Split the examples on T2 into the Yes branch and the No branch, and recurse on each branch. (Figure.)
22. Recursing on the T2 = Yes branch (6 examples: 4 ○, 2 ×):
H(C) = −(4/6) log2(4/6) − (2/6) log2(2/6) = 0.918
H(C | T1) = (4/6)(−(2/4) log2(2/4) − (2/4) log2(2/4)) + (2/6)(−(2/2) log2(2/2)) = 0.667
I(T1) = 0.918 − 0.667 = 0.251; I(T3) = 0, I(T4) = 0.918 — so T4 is chosen at this node.
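The recursive step — recompute H(C) on the examples reaching a node and pick the test with maximal gain there — can be sketched as below; the data layout (list of dicts plus a label list) is an assumption:

```python
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def best_test(rows, labels, tests):
    """Return the test with maximal gain I(T) = H(C) - H(C|T) on this node."""
    n = len(labels)
    def gain(t):
        h = 0.0
        for v in set(row[t] for row in rows):
            sub = [c for row, c in zip(rows, labels) if row[t] == v]
            h += len(sub) / n * entropy(sub)
        return entropy(labels) - h
    return max(tests, key=gain)

# Toy data: test "a" separates the classes perfectly, "b" does not.
rows = [{"a": 1, "b": 0}, {"a": 1, "b": 1}, {"a": 0, "b": 0}, {"a": 0, "b": 1}]
labels = ["o", "o", "x", "x"]
```

On this toy data `best_test(rows, labels, ["a", "b"])` selects "a", since its gain is 1 bit versus 0 for "b".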
23. (Figure.)
24. Comparison of decision trees with other classifiers (e.g., naive Bayes).
25. Representative decision-tree algorithms: ID3; CART (Classification And Regression Tree); C4.5.
26. CART builds binary trees (every test has 2 branches). C4.5 extends ID3. Ensembles of many trees (random forests) are also widely used.
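One concrete difference worth noting: CART commonly splits nodes by Gini impurity rather than by the entropy used above. A minimal sketch:

```python
def gini(probs):
    """Gini impurity 1 - sum(p^2): 0 for a pure node, maximal when uniform."""
    return 1 - sum(p * p for p in probs)
```

For a pure node `gini([1.0])` is 0, and for a 50/50 two-class node `gini([0.5, 0.5])` is 0.5; like entropy, the impurity of a split is the size-weighted average over branches.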
27. Announcements (10/21): questions to sesejun+dm10@sel.is.ocha.ac.jp; next lecture on 11/2; course database at http://togodb.sel.is.ocha.ac.jp/ (2010).