Datamining 2nd decisiontree
    Presentation Transcript

    • (1/3) [introductory slide; text not recoverable]
    • (2/3) Classification (Pattern Recognition): assign each object to one of a set of known classes
    • (3/3) Clustering: group mutually similar objects (A, B) together; Association Rules: find rules of the form A ⇒ B
    • [slides 9–10: text not recoverable]
    • Figure: two decision trees T and T′ for the same data, shown as panels (A) and (B) with Yes/No branches (slide 11)
    • Cost of a decision tree (slide 12): for a node v of tree T, let cost(v) be the number of tests on the path from the root to v. The cost of the whole tree is taken over its leaves: cost(T) = Σ {cost(x) | x ∈ T is a leaf}. Example (Fig. 5.4, p. 133): a tree over tests X1, X2, X3, X4 with the leaf costs shown in the figure.
    • Constructing a minimum-cost decision tree is NP-hard (Theorem 5.1), shown by reduction from the NP-complete problem EXACT COVER BY 3-SET (slide 14). EXACT COVER BY 3-SET (Def. 5.2): given a finite set X whose size is a multiple of 3 and a collection S = {T1, T2, ...} of 3-element subsets of X, decide whether there is a subcollection S1 ⊆ S such that (1) ∪{T | T ∈ S1} = X and (2) Ti ∩ Tj = ∅ for all i ≠ j. Example (Fig. 5.4): for X = {1, 2, ..., 9}, the subcollection {{1, 4, 7}, {2, 3, 5}, {6, 8, 9}} is an exact cover.
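    The two conditions above can be checked mechanically. A minimal sketch, using the example from the slide; the brute-force search over subcollections takes exponential time, which is consistent with the problem being NP-complete:

    ```python
    from itertools import combinations

    def is_exact_cover(X, subsets):
        """Conditions of the definition: (1) the union of the chosen
        subsets is X, and (2) the subsets are pairwise disjoint."""
        union = set().union(*subsets)
        disjoint = all(a.isdisjoint(b) for a, b in combinations(subsets, 2))
        return union == set(X) and disjoint

    def has_exact_cover_by_3sets(X, S):
        """Brute-force search over all nonempty subcollections S1 of S."""
        return any(is_exact_cover(X, S1)
                   for r in range(1, len(S) + 1)
                   for S1 in combinations(S, r))

    # The example from the slide: X = {1, ..., 9}
    X = set(range(1, 10))
    cover = [{1, 4, 7}, {2, 3, 5}, {6, 8, 9}]
    print(is_exact_cover(X, cover))                          # True
    print(has_exact_cover_by_3sets(X, cover + [{1, 2, 3}]))  # True
    ```
    
    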
    • Reduction (slides 15–16): given an instance X of EXACT COVER BY 3-SET, introduce |X| new objects Y = {y1, y2, ..., y|X|}. The training objects are X ∪ Y; each t ∈ X gets class 1 and each t ∈ Y gets class 0. There is one binary attribute per 3-set Ti and one per new object yi:
      t[Ti] = 1 if t ∈ Ti, 0 otherwise;   t[yi] = 1 if t = yi, 0 otherwise.
      A test on Ti splits off the 3 objects of Ti; a test on yi splits off the single object yi (Fig. 5.5 (A), (B)).
    • Cost comparison (Fig. 5.6, slides 17–18): a tree that uses only the yi tests is a chain over the |Y| = |X| attributes y1, ..., y|X|, with cost 1 + 2 + ··· + |X| + |X| (Fig. 5.6 (A)). If an exact cover S1 exists, a chain over its |X|/3 tests classifies all of X ∪ Y, with cost 1 + 2 + ··· + |X|/3 + |X|/3 (Fig. 5.6 (B)). The reduction shows that X has an exact cover by 3-sets iff the constructed instance admits a decision tree of cost at most 1 + 2 + ··· + |X|/3 + |X|/3; hence minimum-cost decision tree construction is NP-hard (Section 5.4.2).
    • Figure: two candidate decision trees T and T′ with Yes/No branches, panels (A) and (B) (slide 20)
    • Entropy (slide 21): for a training set S = {(x1, c1), (x2, c2), ..., (xN, cN)} with two classes ○ and ×, the entropy of the class variable C is
      H(C) = −p○ log2 p○ − p× log2 p×,
      where p○ and p× are the proportions of ○ and × examples in S.
    • Example (slide 22): with 4 ○ and 6 × among 10 examples, p○ = 4/10 and p× = 6/10, so
      H(C) = −p○ log2 p○ − p× log2 p× = −(4/10) log2(4/10) − (6/10) log2(6/10) = 0.971.
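    The calculation above can be sketched in a few lines of Python, computing entropy directly from the per-class counts (4 ○, 6 ×):

    ```python
    import math

    def entropy(counts):
        """H(C) = -sum_i p_i log2 p_i, from per-class example counts."""
        total = sum(counts)
        return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

    # 4 maru (o) and 6 batsu (x) examples, as on the slide
    print(f"{entropy([4, 6]):.3f}")  # 0.971
    ```
    
    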
    • Class counts after splitting on test T1 (slide 23):

          C      Yes   No   total
          ○       2     2     4
          ×       2     4     6
          total   4     6    10
    • Conditional entropy of C given T1 (slide 24):
      H(C | T1 = Yes) = −(2/4) log2(2/4) − (2/4) log2(2/4) = 1.0
      H(C | T1 = No) = −(2/6) log2(2/6) − (4/6) log2(4/6) = 0.918
      H(C | T1) = (4/10) H(C | T1 = Yes) + (6/10) H(C | T1 = No) = 0.951
    • Information gain (slide 25): for a test T, I(T) = H(C) − H(C | T).
      I(T1) = H(C) − H(C | T1) = 0.971 − 0.951 = 0.020
      I(T2) = 0.420, I(T3) = 0.091, I(T4) = 0.420
      T2 and T4 tie for the largest gain; T2 is chosen as the root test.
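    Putting the last few slides together, the gain I(T1) = 0.020 can be reproduced from the contingency table alone (branch counts [2 ○, 2 ×] for Yes and [2 ○, 4 ×] for No):

    ```python
    import math

    def entropy(counts):
        total = sum(counts)
        return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

    def info_gain(class_counts, branches):
        """I(T) = H(C) - H(C|T); each branch contributes its class counts,
        weighted by the fraction of examples that reach it."""
        n = sum(class_counts)
        h_cond = sum(sum(b) / n * entropy(b) for b in branches)
        return entropy(class_counts) - h_cond

    # T1 from the slides: Yes branch (2 maru, 2 batsu), No branch (2 maru, 4 batsu)
    print(f"{info_gain([4, 6], [[2, 2], [2, 4]]):.3f}")  # 0.020
    ```
    
    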
    • Split the data on T2 into its Yes and No branches, then recursively select the best test within each branch (slide 26).
    • Recursion on the Yes branch of T2 (4 ○, 2 ×; slide 27):
      H(C) = −(4/6) log2(4/6) − (2/6) log2(2/6) = 0.918
      H(C | T1) = (4/6)(−(2/4) log2(2/4) − (2/4) log2(2/4)) + (2/6)(−(2/2) log2(2/2)) = 0.667
      I(T1) = 0.918 − 0.667 = 0.251;  I(T3) = 0, I(T4) = 0.918 → choose T4.
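    The recursive procedure sketched on these slides (choose the test with highest gain, split, recurse until a node is pure) is essentially ID3. A minimal sketch, assuming a toy data representation of rows as attribute→value dicts (the attribute names T1 etc. come from the slides; the values are hypothetical):

    ```python
    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

    def id3(rows, labels, attrs):
        """ID3 sketch. Returns a class label (leaf) or a pair
        (attribute, {value: subtree})."""
        if len(set(labels)) == 1:            # pure node: stop
            return labels[0]
        if not attrs:                        # no tests left: majority class
            return Counter(labels).most_common(1)[0][0]

        def gain(a):                         # I(a) = H(C) - H(C | a)
            h_cond = 0.0
            for v in {r[a] for r in rows}:
                sub = [l for r, l in zip(rows, labels) if r[a] == v]
                h_cond += len(sub) / len(labels) * entropy(sub)
            return entropy(labels) - h_cond

        best = max(attrs, key=gain)          # test with highest information gain
        children = {}
        for v in {r[best] for r in rows}:
            keep = [i for i, r in enumerate(rows) if r[best] == v]
            children[v] = id3([rows[i] for i in keep],
                              [labels[i] for i in keep],
                              [a for a in attrs if a != best])
        return (best, children)
    ```

    For example, on the two rows [{'T1': 'Yes'}, {'T1': 'No'}] with labels ['o', 'x'], this returns ('T1', {'Yes': 'o', 'No': 'x'}): one test on T1 with a pure leaf on each branch.
    
    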
    • Other classification methods exist besides decision trees, e.g. naive Bayes (slide 29).
    • Representative decision-tree algorithms: ID3, CART (Classification And Regression Tree), C4.5 (slide 30).
    • CART builds binary (2-way) splits; C4.5 extends ID3; ensembles of many trees ("Forest") are also used (slide 31).
    • Administrative notes (10/21): contact sesejun+dm10@sel.is.ocha.ac.jp; 11/2; data at http://togodb.sel.is.ocha.ac.jp/ (2010, slide 32).