Datamining 9th association_rule.key
Transcript

  • 1. (Association Rule) • PC • HDD • 30 500 • 30 500 2 • ALDH • A B A A=C∧D
  • 2. Market Basket Analysis • Frequent Pattern Mining • 1 2 3 4
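  • 3. A ⇒ B [support = 2%, confidence = 60%]
    • A: antecedent, B: consequent
    • support: the fraction of all transactions that contain both A and B
    • confidence: among the transactions that contain A, the fraction that also contain B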
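  • 4. (1/2)
    • I = {I1, I2, ..., Im}: the set of items
    • D: the database, a set of transactions
    • T: a transaction, T ⊆ I, identified by its TID
    • A transaction T contains A when A ⊆ T
    • Example: itemset I = {I1, I2, I3, I4, I5}; T100: {I1, I2, I5}
    • An itemset with k items is a k-itemset

    TID    item
    T100   I1, I2, I5
    T200   I2, I4
    T300   I2, I3
    T400   I1, I2, I4
    T500   I1, I3
    T600   I2, I3
    T700   I1, I3
    T800   I1, I2, I3, I5
    T900   I1, I2, I3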
  • 5. (2/2)
    • Rule A ⇒ B, where A ⊂ I, B ⊂ I, A ∩ B = ∅
    • For A ⇒ B:
        support(A ⇒ B) = P(A ∪ B)
        confidence(A ⇒ B) = P(B | A)
    • Example: A = {I1}, B = {I2}, A ∪ B = {I1, I2}
        P(A ∪ B) = 4/9
        P(B | A) = 4/6
    • confidence(A ⇒ B) = P(B | A) = support(A ∪ B) / support(A) = support_count(A ∪ B) / support_count(A)

    TID    item
    T100   I1, I2, I5
    T200   I2, I4
    T300   I2, I3
    T400   I1, I2, I4
    T500   I1, I3
    T600   I2, I3
    T700   I1, I3
    T800   I1, I2, I3, I5
    T900   I1, I2, I3
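The support and confidence definitions above can be checked directly on the example database. The following is a minimal sketch; the helper names (`support_count`, `support`, `confidence`) are illustrative and not from the slides. One assumption: T300 is taken as {I2, I3}, the value that the support counts listed on the later slides imply.

```python
# Example database from the slides, keyed by TID.
# Assumption: T300 = {I2, I3}, as implied by the slides' support counts.
D = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}


def support_count(itemset, db):
    """Number of transactions T in db with itemset ⊆ T."""
    return sum(1 for t in db.values() if itemset <= t)


def support(a, b, db):
    """support(A ⇒ B) = P(A ∪ B)."""
    return support_count(a | b, db) / len(db)


def confidence(a, b, db):
    """confidence(A ⇒ B) = support_count(A ∪ B) / support_count(A)."""
    return support_count(a | b, db) / support_count(a, db)


A, B = {"I1"}, {"I2"}
print(support(A, B, D))     # 4/9
print(confidence(A, B, D))  # 4/6
```

Here {I1, I2} appears in T100, T400, T800 and T900, and {I1} appears in six transactions, which gives the 4/9 and 4/6 of the example.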
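  • 6.
    • From A ∪ B: A ⇒ B, B ⇒ A
    • Minimum support (min_sup)
    • Itemsets that satisfy min_sup
    • Number of itemsets
        1. With 100 items there are 2^100 − 1 itemsets
        2. If a 9-itemset {a1, a2, ..., a9} satisfies min_sup, then {a1}, {a2}, ..., {a1, a2}, ..., {a1, a9}, ..., {a1, a2, a3}, ... also satisfy min_sup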
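The two counts on this slide are easy to reproduce. The sketch below is illustrative only; `nonempty_subsets` is a hypothetical helper name. It prints the number of non-empty itemsets over 100 items and enumerates the non-empty subsets of a 9-itemset (the 9-itemset itself included), each of which must meet min_sup whenever the 9-itemset does.

```python
from itertools import combinations


def nonempty_subsets(itemset):
    """Yield every non-empty subset of itemset as a frozenset."""
    items = sorted(itemset)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


# Non-empty itemsets over 100 items.
print(2**100 - 1)  # 1267650600228229401496703205375

# Non-empty subsets of the 9-itemset {a1, ..., a9}.
nine = {f"a{i}" for i in range(1, 10)}
subsets = list(nonempty_subsets(nine))
print(len(subsets))  # 511
```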
  • 8. Apriori: Overview-1
    • Finds the itemsets that satisfy min_sup (Agrawal & Srikant 1994)
    • Example: min_sup = 2
    • 1. Scan D and count every 1-itemset to obtain C1

    TID    item
    T100   I1, I2, I5
    T200   I2, I4
    T300   I2, I3
    T400   I1, I2, I4
    T500   I1, I3
    T600   I2, I3
    T700   I1, I3
    T800   I1, I2, I3, I5
    T900   I1, I2, I3

    C1
    Itemset   Sup. count
    {I1}      6
    {I2}      7
    {I3}      6
    {I4}      2
    {I5}      2
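Step 1 amounts to a single pass over D that tallies each item. A minimal sketch follows, assuming the transactions fit in memory as Python sets and that T300 is {I2, I3} (the value the C1 counts imply); the variable names are illustrative.

```python
from collections import Counter

# Example database; assumption: T300 = {I2, I3}.
transactions = [
    {"I1", "I2", "I5"},        # T100
    {"I2", "I4"},              # T200
    {"I2", "I3"},              # T300
    {"I1", "I2", "I4"},        # T400
    {"I1", "I3"},              # T500
    {"I2", "I3"},              # T600
    {"I1", "I3"},              # T700
    {"I1", "I2", "I3", "I5"},  # T800
    {"I1", "I2", "I3"},        # T900
]

# One scan of D: count how many transactions contain each item.
c1 = Counter(item for t in transactions for item in t)

for item, count in sorted(c1.items()):
    print(item, count)
```

The printed counts are 6, 7, 6, 2 and 2 for I1 through I5.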
  • 9. Apriori: Overview-2
    • 2. Keep the itemsets that satisfy min_sup (C1 → L1)
    • 3. Generate (k+1)-itemsets from the k-itemsets (L1 → C2)

    C1                       L1
    Itemset   Sup. count     Itemset   Sup. count
    {I1}      6              {I1}      6
    {I2}      7              {I2}      7
    {I3}      6              {I3}      6
    {I4}      2              {I4}      2
    {I5}      2              {I5}      2

    C2
    Itemset
    {I1, I2}
    {I1, I3}
    {I1, I4}
    {I1, I5}
    {I2, I3}
    {I2, I4}
    {I2, I5}
    {I3, I4}
    {I3, I5}
    {I4, I5}
  • 10. Apriori: Overview-3
    • 4. Count each itemset (scan of the DB on the HDD)
    • 5. Keep the itemsets that satisfy min_sup (C2 → L2)

    C2                        L2
    Itemset    Sup. count     Itemset    Sup. count
    {I1, I2}   4              {I1, I2}   4
    {I1, I3}   4              {I1, I3}   4
    {I1, I4}   1              {I1, I5}   2
    {I1, I5}   2              {I2, I3}   4
    {I2, I3}   4              {I2, I4}   2
    {I2, I4}   2              {I2, I5}   2
    {I2, I5}   2
    {I3, I4}   0
    {I3, I5}   1
    {I4, I5}   0
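Steps 3 to 5 for k = 1 can be sketched as follows: pair up the items of L1 to get C2, count each pair with one more scan, and keep those meeting min_sup. This is an illustration under the assumption that T300 is {I2, I3} (as the counts above imply); the names `c2`, `counts` and `l2` are illustrative.

```python
from itertools import combinations

# Example database; assumption: T300 = {I2, I3}.
transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
min_sup = 2
l1 = ["I1", "I2", "I3", "I4", "I5"]  # every 1-itemset meets min_sup = 2

# Step 3: generate the candidate 2-itemsets C2 from L1.
c2 = [frozenset(pair) for pair in combinations(l1, 2)]

# Step 4: one scan of the database to count each candidate.
counts = {c: sum(1 for t in transactions if c <= t) for c in c2}

# Step 5: keep the candidates that satisfy min_sup.
l2 = {c: n for c, n in counts.items() if n >= min_sup}

for c in c2:
    print(sorted(c), counts[c], "kept" if c in l2 else "dropped")
```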
  • 11. Apriori: Overview-4
    • L1, L2, L3: the itemsets that satisfy min_sup
    • L1 → C2, L2 → C3, L3 → C4

    L2
    Itemset    Sup. count
    {I1, I2}   4
    {I1, I3}   4
    {I1, I5}   2
    {I2, I3}   4
    {I2, I4}   2
    {I2, I5}   2

    C3                L3
    Itemset           Itemset           Sup. count
    {I1, I2, I3}      {I1, I2, I3}      2
    {I1, I2, I5}      {I1, I2, I5}      2

    C4
    Itemset
    (empty)
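Putting the five steps together gives the level-wise loop L1 → C2 → L2 → C3 → L3 → C4. The sketch below is one compact way to write it, not the original authors' implementation; the function name `apriori` and the in-memory list of sets are assumptions, and T300 is taken as {I2, I3} to match the listed counts. Candidates with an infrequent k-subset are dropped before counting.

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {frozenset: support count} for all itemsets meeting min_sup."""
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]  # C1
    frequent = {}
    k = 1
    while candidates:
        # Scan the database once per level to count the candidates Ck.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        # Lk: candidates that satisfy min_sup.
        level = {c: n for c, n in counts.items() if n >= min_sup}
        frequent.update(level)
        # C(k+1): unions of two k-itemsets sharing k-1 items,
        # kept only if every k-subset is in Lk.
        nxt = set()
        for a, b in combinations(level, 2):
            u = a | b
            if len(u) == k + 1 and all(
                frozenset(s) in level for s in combinations(u, k)
            ):
                nxt.add(u)
        candidates = list(nxt)
        k += 1
    return frequent


# Example database; assumption: T300 = {I2, I3}.
transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
frequent = apriori(transactions, min_sup=2)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

On the example data the loop stops after the third level because C4 is empty, as on the slide.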
  • 12. Apriori
    Itemset lattice with support counts:
    4-itemset:  {I1, I2, I3, I5}
    3-itemsets: {I1, I2, I3} 2, {I1, I2, I5} 2, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}
    2-itemsets: {I1, I2} 4, {I1, I3} 4, {I1, I4} 1, {I1, I5} 2, {I2, I3} 4, {I2, I4} 2, {I2, I5} 2, {I3, I4} 0, {I3, I5} 1, {I4, I5} 0
    1-itemsets: {I1} 6, {I2} 7, {I3} 6, {I4} 2, {I5} 2
    {}
    • × DB min_sup itemset
    • × k-itemset itemset itemset
  • 13. Apriori: Pruning Phase
    • To generate (k+1)-itemsets from k-itemsets, combine 2 itemsets that share k−1 items into a (k+1)-itemset
        • {I1, I2} and {I1, I3} share I1 → {I1, I2, I3}
        • {I1, I2, I3} and {I1, I2, I5} share I1, I2 → {I1, I2, I3, I5}
    • Check the k-itemsets contained in each (k+1)-itemset (for {I1, I2, I3}: {I1, I2}, {I1, I3}, {I2, I3})
        • {I1, I3, I5}: {I3, I5} does not satisfy min_sup, so {I1, I3, I5} cannot satisfy min_sup
    3-itemsets: {I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}
    2-itemsets: {I1, I2} 4, {I1, I3} 4, {I1, I4} 1, {I1, I5} 2, {I2, I3} 4, {I2, I4} 2, {I2, I5} 2, {I3, I4} 0, {I3, I5} 1, {I4, I5} 0
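The join and prune steps can be separated into two small functions. The sketch below is illustrative (the names `join` and `prune` are not from the slides): it joins every pair of k-itemsets whose union has k+1 items, then removes any candidate that has a k-subset outside Lk. L2 is copied from the overview slides.

```python
from itertools import combinations


def join(level_k):
    """Unions of two k-itemsets that share k-1 items."""
    k = len(next(iter(level_k)))
    return {a | b for a, b in combinations(level_k, 2) if len(a | b) == k + 1}


def prune(candidates, level_k):
    """Drop candidates that contain a k-itemset not in Lk."""
    k = len(next(iter(level_k)))
    return {
        c for c in candidates
        if all(frozenset(s) in level_k for s in combinations(c, k))
    }


# L2 from the overview slides (min_sup = 2).
l2 = {
    frozenset({"I1", "I2"}), frozenset({"I1", "I3"}), frozenset({"I1", "I5"}),
    frozenset({"I2", "I3"}), frozenset({"I2", "I4"}), frozenset({"I2", "I5"}),
}

joined = join(l2)
c3 = prune(joined, l2)
for c in sorted(joined, key=sorted):
    print(sorted(c), "kept" if c in c3 else "pruned")
```

For example, {I1, I3, I5} is produced by the join but pruned because {I3, I5} is not in L2, so it never needs to be counted against the database.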
  • 14. DIC (S. Brin, R. Motwani, J. Ullman, and S. Tsur, 1997)
    • DIC starts counting 2-itemsets while the 1-itemsets are still being counted
    • Once {I2} and {I4} satisfy min_sup, counting of {I2, I4} starts, so whether {I2, I4} satisfies min_sup is known sooner
    • DB

    TID    item
    T100   I1, I2, I5
    T200   I2, I4
    T300   I2, I3
    T400   I1, I2, I4
    T500   I1, I3
    T600   I2, I3
    T700   I1, I3
    T800   I1, I2, I3, I5
    T900   I1, I2, I3

    Apriori: 1-itemset, 2-itemset, 3-itemset
    DIC:     1-itemset, 2-itemset, 3-itemset
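The idea of starting larger itemsets before a pass has finished can be sketched as below. This is a simplified illustration, not the published DIC algorithm: the function name `dic`, the `block` parameter (how many transactions are read between checks) and the bookkeeping are assumptions. Each itemset is counted over exactly one full cycle of the database from the point where it was started, and a new (k+1)-itemset is started as soon as all of its k-subsets have already reached min_sup. T300 is again taken as {I2, I3}.

```python
from itertools import combinations


def dic(transactions, min_sup, block=3):
    """Simplified dynamic itemset counting; returns {frozenset: count}."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    count = {frozenset([i]): 0 for i in items}  # support counted so far
    seen = {s: 0 for s in count}                # transactions read per itemset
    pos = 0
    while any(seen[s] < n for s in count):
        t = transactions[pos % n]               # read the database cyclically
        for s in count:
            if seen[s] < n:                     # itemset still being counted
                seen[s] += 1
                if s <= t:
                    count[s] += 1
        pos += 1
        finished = all(seen[s] == n for s in count)
        if pos % block == 0 or finished:
            # Start counting any superset whose subsets already meet min_sup.
            for s in [s for s in count if count[s] >= min_sup]:
                for i in items:
                    c = s | {i}
                    if c in count:
                        continue
                    if all(count.get(frozenset(sub), 0) >= min_sup
                           for sub in combinations(c, len(s))):
                        count[c] = 0
                        seen[c] = 0
    return {s: k for s, k in count.items() if k >= min_sup}


# Example database; assumption: T300 = {I2, I3}.
transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
result = dic(transactions, min_sup=2)
for itemset, k in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), k)
```

Because every started itemset still sees each transaction exactly once, the counts it reports are exact, and the result is the same set of itemsets that a level-wise Apriori run finds on this data.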
  • 15.
    • Hash
        • (k+1)-itemset, k-itemset
    • PC
    • Heap, itemset
    • FP-tree (J. Han, J. Pei, and Y. Yin, 2000)
    • S. Brin, R. Motwani, and C. Silverstein, 1997
    • S. Morishita and J. Sese, 2000