Datamining 9th Association Rule

  1. (Association Rule) • PC • HDD • 30 500 • 30 500 2 • ALDH • A B A A=C∧D
  2. • Market Basket Analysis
       • Frequent Pattern Mining
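  3. • [support = 2%, confidence = 60%]
     • A ⇒ B
       • A: antecedent, B: consequent
     • support
       • fraction of transactions that contain both A and B
     • confidence
       • fraction of transactions containing A that also contain B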
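  4. (1/2)
     • I = {I1, I2, ..., Im}: the set of all items
     • D: the database, a set of transactions T
     • Each transaction T is a set of items with T ⊆ I
     • T contains a set of items A when A ⊆ T
     • Example: I = {I1, I2, I3, I4, I5}
     • itemset: a set of items, e.g. T100: {I1, I2, I5}
     • An itemset with k items is a k-itemset

     TID    item
     T100   I1, I2, I5
     T200   I2, I4
     T300   I2, I3
     T400   I1, I2, I4
     T500   I1, I3
     T600   I2, I3
     T700   I1, I3
     T800   I1, I2, I3, I5
     T900   I1, I2, I3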
  5. (2/2)
     • Association rule: A ⇒ B, where A ⊂ I, B ⊂ I, A ∩ B = ∅
     • Support and confidence of A ⇒ B:
         support(A ⇒ B) = P(A ∪ B)
         confidence(A ⇒ B) = P(B | A)
     • Example on the transaction table of slide 4:
         A = {I1}, B = {I2}, A ∪ B = {I1, I2}
         P(A ∪ B) = 4/9
         P(B | A) = 4/6
     • confidence(A ⇒ B) = P(B | A)
                         = support(A ∪ B) / support(A)
                         = support_count(A ∪ B) / support_count(A)
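The support and confidence formulas on this slide can be checked with a few lines of Python. This is only a sketch: the names `D`, `support_count`, `support`, and `confidence` are illustrative and do not come from the slides, and T300 is taken to be I2, I3 (the value that the support counts on the later slides imply). The rule I1 ⇒ I2 used here does not depend on T300.

```python
from fractions import Fraction

# Transaction table D (T300 assumed to be I2, I3; see the note above).
D = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for items in transactions.values() if itemset <= items)


def support(a, b, transactions):
    """support(A => B) = P(A ∪ B)."""
    return Fraction(support_count(a | b, transactions), len(transactions))


def confidence(a, b, transactions):
    """confidence(A => B) = support_count(A ∪ B) / support_count(A)."""
    return Fraction(support_count(a | b, transactions),
                    support_count(a, transactions))


A, B = {"I1"}, {"I2"}
sup = support(A, B, D)
conf = confidence(A, B, D)
print(sup, conf)  # 4/9 2/3  (2/3 is 4/6 in lowest terms)
```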
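  6. • From an itemset A∪B, the rules A ⇒ B and B ⇒ A can be generated
     • Minimum support (min_sup)
       • Find the itemsets that satisfy min_sup
     • The number of itemsets is very large
       1. With 100 items there are 2^100-1 possible itemsets
       2. If the 9-itemset {a1, a2, .., a9} satisfies min_sup, then {a1}, {a2}, {a1, a2}, {a1, a9}, {a1, a2, a3}, ... all satisfy min_sup as well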
  8. Apriori: Overview-1
     • Finds the itemsets that satisfy min_sup
       • Agrawal & Srikant 1994
     • Example: min_sup = 2, on the transaction table D of slide 4
     1. Scan D and count every 1-itemset to obtain C1

     C1
     Itemset   Sup. count
     {I1}      6
     {I2}      7
     {I3}      6
     {I4}      2
     {I5}      2
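Step 1 amounts to a single pass over the transactions. A minimal sketch follows, assuming the transactions are held in memory as Python sets. The variable names are illustrative, and T300 is taken to be I2, I3 so that the result agrees with the C1 counts on the slide.

```python
from collections import Counter

# Transaction table D (T300 assumed to be I2, I3).
D = [
    {"I1", "I2", "I5"},        # T100
    {"I2", "I4"},              # T200
    {"I2", "I3"},              # T300
    {"I1", "I2", "I4"},        # T400
    {"I1", "I3"},              # T500
    {"I2", "I3"},              # T600
    {"I1", "I3"},              # T700
    {"I1", "I2", "I3", "I5"},  # T800
    {"I1", "I2", "I3"},        # T900
]

# One pass over D: count how many transactions contain each single item.
c1 = Counter(item for items in D for item in items)

for item in sorted(c1):
    print(f"{{{item}}}: {c1[item]}")
```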
  9. Apriori: Overview-2
     2. Keep the itemsets that satisfy min_sup (C1 → L1)
     3. Generate the candidate (k+1)-itemsets from the k-itemsets (L1 → C2)

     C1                       L1                       C2
     Itemset   Sup. count     Itemset   Sup. count     Itemset
     {I1}      6              {I1}      6              {I1, I2}
     {I2}      7              {I2}      7              {I1, I3}
     {I3}      6              {I3}      6              {I1, I4}
     {I4}      2              {I4}      2              {I1, I5}
     {I5}      2              {I5}      2              {I2, I3}
                                                       {I2, I4}
                                                       {I2, I5}
                                                       {I3, I4}
                                                       {I3, I5}
                                                       {I4, I5}
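Steps 2 and 3 can be sketched as a filter followed by pair generation. The dictionary `c1` below copies the C1 table from the slide. The names `MIN_SUP`, `l1`, and `c2` are illustrative, and using `itertools.combinations` for the pairs is one convenient choice rather than something the slides prescribe.

```python
from itertools import combinations

MIN_SUP = 2

# C1 as given on the slide: support count of every 1-itemset.
c1 = {"I1": 6, "I2": 7, "I3": 6, "I4": 2, "I5": 2}

# Step 2: L1 keeps the items whose count reaches min_sup.
l1 = sorted(item for item, count in c1.items() if count >= MIN_SUP)

# Step 3: C2 is every pair of items drawn from L1.
c2 = list(combinations(l1, 2))

print(l1)
print(c2)
```

All five items reach min_sup = 2 here, so C2 contains all 10 pairs, matching the C2 column above.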
  10. Apriori: Overview-3
      4. Count the support of each candidate itemset
         • DB scan (HDD access)
      5. Keep the itemsets that satisfy min_sup (C2 → L2)

      C2                        L2
      Itemset    Sup. count     Itemset    Sup. count
      {I1, I2}   4              {I1, I2}   4
      {I1, I3}   4              {I1, I3}   4
      {I1, I4}   1              {I1, I5}   2
      {I1, I5}   2              {I2, I3}   4
      {I2, I3}   4              {I2, I4}   2
      {I2, I4}   2              {I2, I5}   2
      {I2, I5}   2
      {I3, I4}   0
      {I3, I5}   1
      {I4, I5}   0
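Steps 4 and 5 can be sketched as one pass over the database that updates a counter per candidate, followed by the min_sup filter. The names are illustrative, the data is held in memory rather than read from disk, and T300 is taken to be I2, I3, which is the value the counts on this slide require.

```python
from itertools import combinations

MIN_SUP = 2

# Transaction table D (T300 assumed to be I2, I3).
D = [
    {"I1", "I2", "I5"},        # T100
    {"I2", "I4"},              # T200
    {"I2", "I3"},              # T300
    {"I1", "I2", "I4"},        # T400
    {"I1", "I3"},              # T500
    {"I2", "I3"},              # T600
    {"I1", "I3"},              # T700
    {"I1", "I2", "I3", "I5"},  # T800
    {"I1", "I2", "I3"},        # T900
]

# C2: all pairs of the five frequent items.
c2 = list(combinations(["I1", "I2", "I3", "I4", "I5"], 2))

# Step 4: a single scan of D, incrementing every candidate it contains.
counts = {pair: 0 for pair in c2}
for items in D:
    for pair in c2:
        if set(pair) <= items:
            counts[pair] += 1

# Step 5: L2 keeps the candidates whose count reaches min_sup.
l2 = {pair: n for pair, n in counts.items() if n >= MIN_SUP}

for pair, n in l2.items():
    print(pair, n)
```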
  11. Apriori: Overview-4
      • L1, L2, L3: the itemsets that satisfy min_sup
      • L1 → C2, L2 → C3, L3 → C4

      L2                        C3               L3
      Itemset    Sup. count     Itemset          Itemset        Sup. count
      {I1, I2}   4              {I1, I2, I3}     {I1, I2, I3}   2
      {I1, I3}   4              {I1, I2, I5}     {I1, I2, I5}   2
      {I1, I5}   2
      {I2, I3}   4
      {I2, I4}   2
      {I2, I5}   2

      C4: no itemsets
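The whole level-wise loop (L1 → C2 → L2 → C3 → L3 → C4) can be sketched compactly. This is a simplified in-memory illustration, not the implementation from Agrawal & Srikant: the function names `frequent` and `apriori` are made up here, candidates are formed by taking unions of two frequent k-itemsets, and T300 is taken to be I2, I3.

```python
from itertools import combinations

MIN_SUP = 2

# Transaction table D (T300 assumed to be I2, I3).
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]


def frequent(candidates, transactions, min_sup):
    """Count each candidate over the transactions and keep those >= min_sup."""
    counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_sup}


def apriori(transactions, min_sup):
    """Return [L1, L2, ...], each a dict mapping frozenset -> support count."""
    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items}, transactions, min_sup)
    levels = []
    k = 1
    while level:
        levels.append(level)
        # C(k+1): unions of two frequent k-itemsets that have k+1 items and
        # whose k-item subsets are all frequent.
        candidates = {
            a | b
            for a in level
            for b in level
            if len(a | b) == k + 1
            and all(frozenset(s) in level for s in combinations(a | b, k))
        }
        level = frequent(candidates, transactions, min_sup)
        k += 1
    return levels


levels = apriori(D, MIN_SUP)
for k, level in enumerate(levels, start=1):
    print(f"L{k}:", sorted((sorted(s), n) for s, n in level.items()))
```

On this data the loop stops after L3, because the only possible 4-itemset {I1, I2, I3, I5} has infrequent 3-item subsets and C4 is empty.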
  12. Apriori
      Itemset lattice, with support counts where given:

      4-itemsets:  {I1, I2, I3, I5}
      3-itemsets:  {I1, I2, I3}: 2   {I1, I2, I5}: 2   {I1, I3, I5}   {I2, I3, I4}   {I2, I3, I5}
      2-itemsets:  {I1, I2}: 4   {I1, I3}: 4   {I1, I4}: 1   {I1, I5}: 2   {I2, I3}: 4
                   {I2, I4}: 2   {I2, I5}: 2   {I3, I4}: 0   {I3, I5}: 1   {I4, I5}: 0
      1-itemsets:  {I1}: 6   {I2}: 7   {I3}: 6   {I4}: 2   {I5}: 2
      0-itemsets:  {}

      • × DB min_sup itemset
      • × k-itemset itemset itemset
  13. Apriori: Pruning Phase
      • Generating (k+1)-itemsets from k-itemsets: two itemsets that share k-1 items are merged into one (k+1)-itemset
        • {I1, I2}, {I1, I3} share I1 → {I1, I2, I3}
        • {I1, I2, I3}, {I1, I2, I5} share I1, I2 → {I1, I2, I3, I5}
      • For each (k+1)-itemset, check its k-itemsets (for {I1, I2, I3}: {I1, I2}, {I1, I3}, {I2, I3}); drop it if any of these k-itemsets does not satisfy min_sup
        • {I1, I3, I5}: {I3, I5} does not satisfy min_sup, so {I1, I3, I5} cannot satisfy min_sup

      3-itemsets:  {I1, I2, I3}   {I1, I2, I5}   {I1, I3, I5}   {I2, I3, I4}   {I2, I3, I5}
      2-itemsets:  {I1, I2}: 4   {I1, I3}: 4   {I1, I4}: 1   {I1, I5}: 2   {I2, I3}: 4
                   {I2, I4}: 2   {I2, I5}: 2   {I3, I4}: 0   {I3, I5}: 1   {I4, I5}: 0
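The join and prune steps can be sketched as two small functions. One assumption to note: the slide only says that the two itemsets share k-1 items, while this sketch joins sorted itemsets that agree on their first k-1 items, which is the usual way the join is written for Apriori. The function names `join` and `prune` are illustrative.

```python
from itertools import combinations


def join(frequent_k):
    """Merge pairs of sorted k-itemsets that agree on their first k-1 items."""
    ordered = sorted(frequent_k)
    joined = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if a[:-1] == b[:-1]:
                joined.append(a + (b[-1],))
    return joined


def prune(candidates, frequent_k):
    """Drop candidates that have a k-item subset outside frequent_k."""
    known = set(frequent_k)
    return [
        c for c in candidates
        if all(s in known for s in combinations(c, len(c) - 1))
    ]


# L2 and L3 from the slides, written as sorted tuples.
L2 = [("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
      ("I2", "I3"), ("I2", "I4"), ("I2", "I5")]
L3 = [("I1", "I2", "I3"), ("I1", "I2", "I5")]

c3 = prune(join(L2), L2)
c4 = prune(join(L3), L3)
print(c3)  # [('I1', 'I2', 'I3'), ('I1', 'I2', 'I5')]
print(c4)  # []
```

{I1, I3, I5} is produced by the join but removed by the prune because {I3, I5} is not in L2, which is the case the slide walks through.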
  14. • DIC: counting of 2-itemsets starts before the counting of 1-itemsets has finished
        (S. Brin, R. Motwani, J. Ullman, and S. Tsur. 1997)
      • Once {I2} and {I4} are found to satisfy min_sup, counting of {I2, I4} starts, to check whether {I2, I4} satisfies min_sup
      • DB scans over the transaction table of slide 4:
          Apriori:  1-itemset, 2-itemset, 3-itemset (one after another)
          DIC:      1-itemset, 2-itemset, 3-itemset (overlapping)
  15. • Hash
        • (k+1)-itemset, k-itemset
      • PC
      • Heap, itemset
      • FP-tree (J. Han, J. Pei, and Y. Yin. 2000)
      • S. Brin, R. Motwani, and C. Silverstein. 1997
      • S. Morishita and J. Sese. 2000
