# Datamining 9th association_rule.key

1. Association Rule
   - PC, HDD
   - 30, 500
   - ALDH
   - A, B; A = C ∧ D
2. Related terms
   - Market Basket Analysis
   - Frequent Pattern Mining
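3. Rules are annotated with two measures: [support = 2%, confidence = 60%]
   - General form: A ⇒ B
   - A: antecedent, B: consequent
   - support: the fraction of transactions that contain both A and B
   - confidence: the fraction of transactions containing A that also contain B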
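4. (1/2)
   - Item set: $I = \{I_1, I_2, \ldots, I_m\}$
   - D: the database, a set of transactions T
   - T: a transaction, with $T \subseteq I$
   - T contains A when $A \subseteq T$
   - itemset: a set of items, e.g. I = {I1, I2, I3, I4, I5} and T100: {I1, I2, I5}
   - An itemset with k items is a k-itemset

   | TID | Items |
   | --- | --- |
   | T100 | I1, I2, I5 |
   | T200 | I2, I4 |
   | T300 | I2, I3 |
   | T400 | I1, I2, I4 |
   | T500 | I1, I3 |
   | T600 | I2, I3 |
   | T700 | I1, I3 |
   | T800 | I1, I2, I3, I5 |
   | T900 | I1, I2, I3 |

   T300 is given as {I2, I3} here: the support counts on the Apriori slides ({I3}: 6, {I4}: 2, {I2, I3}: 4, {I2, I4}: 2) only hold with that value.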
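5. (2/2)
   - Association rule $A \Rightarrow B$, with $A \subset I$, $B \subset I$, $A \cap B = \emptyset$
   - Measures of $A \Rightarrow B$:

     $$\mathrm{support}(A \Rightarrow B) = P(A \cup B)$$

     $$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A)$$

   - Example: $A = \{I1\}$, $B = \{I2\}$, $A \cup B = \{I1, I2\}$, so $P(A \cup B) = 4/9$ and $P(B \mid A) = 4/6$
   - Confidence in terms of support counts:

     $$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)} = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)}$$

   | TID | Items |
   | --- | --- |
   | T100 | I1, I2, I5 |
   | T200 | I2, I4 |
   | T300 | I2, I3 |
   | T400 | I1, I2, I4 |
   | T500 | I1, I3 |
   | T600 | I2, I3 |
   | T700 | I1, I3 |
   | T800 | I1, I2, I3, I5 |
   | T900 | I1, I2, I3 |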
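The definitions on this slide can be checked directly against the nine-transaction database. The following is a minimal sketch, not part of the original slides: the function names `support_count`, `support` and `confidence` are illustrative, and T300 is taken as {I2, I3} (an assumption based on the support counts shown on the Apriori slides; it does not affect this particular example).

```python
from fractions import Fraction

# Transaction database D from the slides.
# Assumption: T300 = {I2, I3}, as implied by the later support counts.
D = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}


def support_count(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for items in db.values() if itemset <= items)


def support(a, b, db):
    """support(A => B) = P(A u B): share of transactions containing A and B."""
    return Fraction(support_count(a | b, db), len(db))


def confidence(a, b, db):
    """confidence(A => B) = support_count(A u B) / support_count(A)."""
    return Fraction(support_count(a | b, db), support_count(a, db))


A, B = {"I1"}, {"I2"}
print(support(A, B, D))     # 4/9
print(confidence(A, B, D))  # 2/3, i.e. 4/6 in lowest terms
```

Exact fractions are used so that the output can be compared with the 4/9 and 4/6 on the slide without rounding.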
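6. Frequent itemsets
   - From an itemset A ∪ B, the candidate rules are A ⇒ B and B ⇒ A
   - Minimum support (min_sup)
   - The itemsets of interest are those that satisfy min_sup
   - Difficulties in enumerating itemsets:
     1. With 100 items there are $2^{100} - 1$ possible itemsets
     2. If a 9-itemset {a1, a2, ..., a9} satisfies min_sup, then its subsets {a1}, {a2}, {a1, a2}, {a1, a9}, {a1, a2, a3}, ... also satisfy min_sup
   - The number of itemsets to handle therefore becomes very large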
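The two counts on this slide are easy to reproduce. The sketch below is illustrative only (the helper `nonempty_subsets` is a made-up name): it prints the number of non-empty itemsets over 100 items and enumerates the non-empty subsets of a 9-itemset, each of which must satisfy min_sup whenever the 9-itemset does.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


# Number of non-empty itemsets over 100 distinct items.
print(2**100 - 1)  # 1267650600228229401496703205375

# A 9-itemset {a1, ..., a9} has 2**9 - 1 non-empty subsets.
nine_itemset = {f"a{i}" for i in range(1, 10)}
subsets = list(nonempty_subsets(nine_itemset))
print(len(subsets))  # 511
```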
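8. Apriori: Overview-1
   - Finds the itemsets that satisfy min_sup
   - Agrawal & Srikant 1994
   - Example with min_sup = 2
   1. Scan D and count each 1-itemset, giving C1

   D:

   | TID | Items |
   | --- | --- |
   | T100 | I1, I2, I5 |
   | T200 | I2, I4 |
   | T300 | I2, I3 |
   | T400 | I1, I2, I4 |
   | T500 | I1, I3 |
   | T600 | I2, I3 |
   | T700 | I1, I3 |
   | T800 | I1, I2, I3, I5 |
   | T900 | I1, I2, I3 |

   C1:

   | Itemset | Sup. count |
   | --- | --- |
   | {I1} | 6 |
   | {I2} | 7 |
   | {I3} | 6 |
   | {I4} | 2 |
   | {I5} | 2 |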
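9. Apriori: Overview-2
   2. Keep the itemsets that satisfy min_sup (C1 → L1)
   3. Generate candidate (k+1)-itemsets from the k-itemsets (L1 → C2)

   C1:

   | Itemset | Sup. count |
   | --- | --- |
   | {I1} | 6 |
   | {I2} | 7 |
   | {I3} | 6 |
   | {I4} | 2 |
   | {I5} | 2 |

   L1:

   | Itemset | Sup. count |
   | --- | --- |
   | {I1} | 6 |
   | {I2} | 7 |
   | {I3} | 6 |
   | {I4} | 2 |
   | {I5} | 2 |

   C2:

   | Itemset |
   | --- |
   | {I1, I2} |
   | {I1, I3} |
   | {I1, I4} |
   | {I1, I5} |
   | {I2, I3} |
   | {I2, I4} |
   | {I2, I5} |
   | {I3, I4} |
   | {I3, I5} |
   | {I4, I5} |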
10. Apriori: Overview-3
    4. Count the support of each candidate itemset
       - The DB is on HDD
    5. Keep the itemsets that satisfy min_sup (C2 → L2)

    C2 with support counts:

    | Itemset | Sup. count |
    | --- | --- |
    | {I1, I2} | 4 |
    | {I1, I3} | 4 |
    | {I1, I4} | 1 |
    | {I1, I5} | 2 |
    | {I2, I3} | 4 |
    | {I2, I4} | 2 |
    | {I2, I5} | 2 |
    | {I3, I4} | 0 |
    | {I3, I5} | 1 |
    | {I4, I5} | 0 |

    L2:

    | Itemset | Sup. count |
    | --- | --- |
    | {I1, I2} | 4 |
    | {I1, I3} | 4 |
    | {I1, I5} | 2 |
    | {I2, I3} | 4 |
    | {I2, I4} | 2 |
    | {I2, I5} | 2 |
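11. Apriori: Overview-4
    - L1, L2, L3: the itemsets that satisfy min_sup
    - L1 → C2, L2 → C3, L3 → C4

    L2:

    | Itemset | Sup. count |
    | --- | --- |
    | {I1, I2} | 4 |
    | {I1, I3} | 4 |
    | {I1, I5} | 2 |
    | {I2, I3} | 4 |
    | {I2, I4} | 2 |
    | {I2, I5} | 2 |

    C3:

    | Itemset |
    | --- |
    | {I1, I2, I3} |
    | {I1, I2, I5} |

    L3:

    | Itemset | Sup. count |
    | --- | --- |
    | {I1, I2, I3} | 2 |
    | {I1, I2, I5} | 2 |

    C4: no itemsets.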
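The level-wise loop walked through on the Overview slides (count candidates, keep those meeting min_sup, build the next candidates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from Agrawal & Srikant: the function name `apriori` and its return format are made up for this example, T300 is taken as {I2, I3}, and the join step simply unions any two frequent k-itemsets whose union has k+1 items, which yields the same candidates after pruning as a prefix-based join.

```python
from itertools import combinations

# Transaction database D (assumption: T300 = {I2, I3}).
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]


def apriori(db, min_sup):
    """Return {itemset: support_count} for every itemset with count >= min_sup."""
    # C1: every single item that occurs in the database.
    candidates = {frozenset([item]) for t in db for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate with one pass over the database.
        counts = {c: sum(1 for t in db if c <= t) for c in candidates}
        # Lk: candidates that satisfy min_sup.
        level = {c: n for c, n in counts.items() if n >= min_sup}
        frequent.update(level)
        # Join: unions of two frequent k-itemsets that have exactly k+1 items.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune: keep a candidate only if all of its k-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        }
        k += 1
    return frequent


result = apriori(D, min_sup=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With min_sup = 2 the result contains the five 1-itemsets of L1, the six 2-itemsets of L2, and the two 3-itemsets of L3; the loop stops because C4 is empty.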
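12. Apriori
    - Figure: the itemset lattice, with support counts where shown
      - 4-itemset: {I1, I2, I3, I5}
      - 3-itemsets: {I1, I2, I3} 2, {I1, I2, I5} 2, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}
      - 2-itemsets: {I1, I2} 4, {I1, I3} 4, {I1, I4} 1, {I1, I5} 2, {I2, I3} 4, {I2, I4} 2, {I2, I5} 2, {I3, I4} 0, {I3, I5} 1, {I4, I5} 0
      - 1-itemsets: {I1} 6, {I2} 7, {I3} 6, {I4} 2, {I5} 2
      - {}
    - ×: itemsets that were counted against the DB and do not satisfy min_sup
    - ×: itemsets ruled out from the k-itemsets alone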
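13. Apriori: Pruning Phase
    - To build (k+1)-itemsets from k-itemsets, combine two itemsets that share k-1 items into one (k+1)-itemset
      - {I1, I2} and {I1, I3} share I1, giving {I1, I2, I3}
      - {I1, I2, I3} and {I1, I2, I5} share I1, I2, giving {I1, I2, I3, I5}
    - Every k-itemset contained in a (k+1)-itemset (for {I1, I2, I3}: {I1, I2}, {I1, I3}, {I2, I3}) must itself be a k-itemset that satisfies min_sup
      - {I1, I3, I5} contains {I3, I5}, which does not satisfy min_sup, so {I1, I3, I5} cannot satisfy min_sup
    - Figure: the lattice levels involved
      - 3-itemsets: {I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}
      - 2-itemsets: {I1, I2} 4, {I1, I3} 4, {I1, I4} 1, {I1, I5} 2, {I2, I3} 4, {I2, I4} 2, {I2, I5} 2, {I3, I4} 0, {I3, I5} 1, {I4, I5} 0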
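The join and prune steps described on this slide can be sketched as a single candidate-generation function. This is a hedged illustration rather than the slide author's code: the name `generate_candidates` is made up, itemsets are represented as sorted tuples, and "sharing k-1 items" is implemented as sharing the first k-1 items in sorted order, which is the usual way to avoid generating the same candidate twice.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Join + prune: frequent k-itemsets -> candidate (k+1)-itemsets."""
    frequent_k = {tuple(sorted(s)) for s in frequent_k}
    if not frequent_k:
        return set()
    k = len(next(iter(frequent_k)))

    # Join: merge two k-itemsets that agree on their first k-1 items.
    joined = set()
    for a in frequent_k:
        for b in frequent_k:
            if a[:-1] == b[:-1] and a[-1] < b[-1]:
                joined.add(a + (b[-1],))

    # Prune: drop a candidate if any of its k-subsets is not frequent.
    return {
        c for c in joined
        if all(subset in frequent_k for subset in combinations(c, k))
    }


# L2 from the slides (2-itemsets with support count >= 2).
L2 = [("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
      ("I2", "I3"), ("I2", "I4"), ("I2", "I5")]

C3 = generate_candidates(L2)
print(sorted(C3))  # [('I1', 'I2', 'I3'), ('I1', 'I2', 'I5')]

# Both 3-itemsets are frequent (L3 = C3 on the slides), so C4 is built from them.
C4 = generate_candidates(C3)
print(C4)  # set()
```

{I1, I3, I5} is produced by the join but removed by the prune because {I3, I5} is not in L2, and {I1, I2, I3, I5} is removed because {I1, I3, I5} is not in L3, so no database scan is needed for either.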
14. DIC
    - DIC starts counting 2-itemsets while the 1-itemsets are still being counted (S. Brin, R. Motwani, J. Ullman, and S. Tsur. 1997)
    - Once {I2} and {I4} satisfy min_sup, counting of {I2, I4} can start, to determine whether {I2, I4} satisfies min_sup
    - DB scans
    - Figure: passes over the nine-transaction DB (T100 to T900) for Apriori and for DIC, each labelled 1-itemset, 2-itemset, 3-itemset
15. Other approaches
    - Hash
      - (k+1)-itemset, k-itemset
    - PC
      - Heap, itemset
    - FP-tree (J. Han, J. Pei and Y. Yin. 2000)
    - S. Brin, R. Motwani and C. Silverstein. 1997
    - S. Morishita and J. Sese. 2000