Your SlideShare is downloading. ×
Datamining 9th association_rule.key
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

Datamining 9th association_rule.key

926
views

Published on


0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
926
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
2
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. (Association Rule)• • PC• HDD •• 30 500 • • 30 500 2• ALDH• A B A A=C∧D 2
  • 2. • Market Basket Analysis • Frequent Pattern Mining• 1 2 3 4 3
  • 3. • • [ =2%, =60%]• A B • A: (antecedent), B: (consequent)• support • A B •• confidence • A B • 4
  • 4. (1/2) TID item• T100 I1, I2, I5 I = {I1 , I2 , ..., Im } T200 I2, I4• D T300 I2, I4 T T400 I1, I2, I4• T T500 I1, I3 T ⊆I T600 I2, I3 T700 I1, I3• T A T800 I1, I2, I3, I5 A⊆T T900 I1, I2, I3• itemset I = {I1, I2, I3, I4, I5} T100 : {I1, I2, I5}• itemset k k-itemset 5
  • 5. (2/2) TID item • A⇒B T100 T200 I1, I2, I5 I2, I4 A ⊂ I, B ⊂ I, A ∩ B = φ T300 I2, I4 • A⇒B T400 I1, I2, I4 support(A ⇒ B) = P (A ∪ B) T500 I1, I3 conf idence(A ⇒ B) = P (B | A) T600 I2, I3 T700 I1, I3 A = {I1} , B = {I2} , A ∪ B = {I1, I2} T800 I1, I2, I3, I5 P (A ∪ B) = 4/9 P (B | A) = 4/6 T900 I1, I2, I3 • support(A ∪ B) support count(A ∪ B)conf idence(A ⇒ B) = P (B | A) = = support(A) support count(A) 6
  • 6. •• A∪B A B, B A •• (min_sup) • min_sup itemset• itemset 1. item 100 2^100-1 2. 9-itemset {a1, a2, .., a9} min_sup {a1} {a2} {a1,a2} {a1, a9} {a1, a2, a3} ... min_sup • itemset 7
  • 7. ••••• ••• • •• 8
  • 8. Apriori: Overview-1• min_sup itemset • Agrawal & Srikant 1994•• min_sup = 2 TID item 1. D T100 I1, I2, I5 1-itemset T200 I2, I4 C1 T300 I2, I4 Itemset Sup. count T400 I1, I2, I4 {I1} 6 T500 I1, I3 {I2} 7 T600 I2, I3 T700 I1, I3 {I3} 6 T800 I1, I2, I3, I5 {I4} 2 T900 I1, I2, I3 {I5} 2 9
  • 9. Apriori: Overview-2 2. min_sup Itemset 3. k-itemset (k+1)-itemsetC1 L1 C2Itemset Sup. count Itemset Sup. count Itemset {I1} 6 {I1} 6 {I1,I2} {I2} 7 {I2} 7 {I1,I3} {I3} 6 {I3} 6 {I1, I4} {I4} 2 {I4} 2 {I1, I5} {I5} 2 {I5} 2 {I2, I3} {I2, I4} {I2, I5} {I3, I4} {I3, I5} {I4, I5} 10
  • 10. Apriori: Overview-3 4. Itemset • DB HDD • 5. min_sup itemsetC2 L2 Itemset Itemset Sup. Count Itemset Sup. Count {I1,I2} {I1,I2} 4 {I1,I2} 4 {I1,I3} {I1,I3} 4 {I1,I3} 4 {I1, I4} {I1, I4} 1 {I1, I5} 2 {I1, I5} {I1, I5} 2 {I2, I3} 4 {I2, I3} {I2, I3} 4 {I2, I4} 2 {I2, I4} {I2, I4} 2 {I2, I5} 2 {I2, I5} {I2, I5} 2 {I3, I4} {I3, I4} 0 {I3, I5} {I3, I5} 1 {I4, I5} {I4, I5} 0 11
  • 11. Apriori: Overview-4 • L 1, L 2, L 3 min_sup itemset • L1 C2, L2 C3, L3 C4L2 C3 L3 Itemset Sup. Count Itemset Itemset Sup. Count{I1,I2} 4 {I1,I2, I3} {I1,I2, I3} 2{I1,I3} 4 {I1,I2, I5} {I1,I2, I5} 2{I1, I5} 2{I2, I3} 4 C4{I2, I4} 2 Itemset{I2, I5} 2 12
  • 12. Apriori {I1, I2, I3, I5} 2 2 {I1, I2, I3} {I1, I2, I5} {I1, I3, I5} {I2, I3, I4} {I2, I3, I5} 4 4 1 2 4 2 2 0 1 0{I1, I2} {I1,I3} {I1, I4} {I1, I5} {I2, I3} {I2, I4} {I2,I5} {I3,I4} {I3,I5} {I4,I5} 6 7 6 2 2 {I1} {I2} {I3} {I4} {I5} {} • × DB min_sup itemset • × k-itemset itemset itemset 13
  • 13. Apriori: Pruning Phase • k-itemset (k+1)-itemset itemset k-1 item 2 (k+1)-itemset • {I1, I2}, {I1,I3} I1 {I1, I2, I3} • {I1, I2, I3}, {I1, I2, I5} I1,I2 {I1, I2, I3, I5} • (k+1)-itemset k-itemset({I1, I2, I3} {I1, I2}, {I1, I3}, {I2, I3}) k-itemset • • {I1, I3, I5} {I3, I5} {I1, I3, I5} min_sup {I1, I2, I3} {I1, I2, I5} {I1, I3, I5} {I2, I3, I4} {I2, I3, I5}{I1, I2} {I1,I3} {I1, I4} {I1, I5} {I2, I3} {I2, I4} {I2,I5} {I3,I4} {I3,I5} {I4,I5} 4 4 1 2 4 2 2 0 1 0 14
  • 14. • 1-itemset DIC 2-itemsetS. Brin, R. Motowani, J. Ullman, and S. Tsur. 1997 • {I2} {I4} min_sup {12,14} . {12,I4} min_sup • DB TID item Apriori DIC T100 I1, I2, I5 T200 I2, I4 1-itemset 2-itemset 3-itemset T300 I2, I4 1-itemset 2-itemset 3-itemset T400 I1, I2, I4 T500 I1, I3 T600 I2, I3 T700 I1, I3 T800 I1, I2, I3, I5 T900 I1, I2, I3 15
  • 15. • • Hash • (k+1)-itemset k-itemset • • PC • Heap itemset • FP-tree (J.Han, J. Pei and Y.Yin. 2000)• • • • S.Brin, R. Motwani and C. Silverstein. 1997 • S. Morishita and J. Sese. 2000 16