Datamining 9th Association Rule

  1. (Association Rule) • PC • HDD • 30 500 • 30 500 2 • ALDH • A B A A=C∧D
  2. • Market Basket Analysis
         • Frequent Pattern Mining
     • (figure; only the labels 1, 2, 3, 4 survive)
  3. • … [support = 2%, confidence = 60%]
     • A ⇒ B
         • A: antecedent, B: consequent
     • support: how often A and B occur together
     • confidence: how often B occurs when A occurs
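  4. (1/2)
     • I = {I1, I2, ..., Im}: the set of all items
     • D: the database, a set of transactions T
     • Each transaction T is a set of items, T ⊆ I
     • A transaction T contains an itemset A when A ⊆ T
     • itemset: a set of items, e.g. T100: {I1, I2, I5}
     • An itemset with k items is a k-itemset

     Example with I = {I1, I2, I3, I4, I5}:

     TID    item
     T100   I1, I2, I5
     T200   I2, I4
     T300   I2, I3
     T400   I1, I2, I4
     T500   I1, I3
     T600   I2, I3
     T700   I1, I3
     T800   I1, I2, I3, I5
     T900   I1, I2, I3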
  5. (2/2)
     • Association rule A ⇒ B, where A ⊂ I, B ⊂ I, A ∩ B = ∅
     • Measures of A ⇒ B (TID/item table as on slide 4):
         $\mathrm{support}(A \Rightarrow B) = P(A \cup B)$
         $\mathrm{confidence}(A \Rightarrow B) = P(B \mid A)$
     • Example: A = {I1}, B = {I2}, A ∪ B = {I1, I2}, so P(A ∪ B) = 4/9 and P(B | A) = 4/6
     • In terms of counts:
         $\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)} = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)}$
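The worked example on this slide can be checked numerically. The following is a minimal Python sketch, not part of the original deck: the names `D`, `support_count`, `support`, and `confidence` are illustrative, and the transaction table is typed in by hand. T300 is entered as {I2, I3}, an assumption chosen to agree with the support counts shown on the later Apriori slides; it does not affect the values computed here, because T300 does not contain I1.

```python
from fractions import Fraction

# Transaction database D from the slides (T300 entered as {I2, I3}: assumption).
D = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}

def support_count(itemset):
    """Number of transactions in D that contain every item of `itemset`."""
    return sum(1 for items in D.values() if itemset <= items)

def support(itemset):
    """Fraction of transactions in D that contain `itemset`."""
    return Fraction(support_count(itemset), len(D))

def confidence(A, B):
    """confidence(A => B) = support_count(A ∪ B) / support_count(A)."""
    return Fraction(support_count(A | B), support_count(A))

A, B = {"I1"}, {"I2"}
print(support(A | B))    # 4/9
print(confidence(A, B))  # 2/3, i.e. 4/6 in lowest terms
```

`Fraction` is used so that the results can be compared exactly with the 4/9 and 4/6 on the slide.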
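  6. • From a frequent itemset A ∪ B, the rules A ⇒ B and B ⇒ A can be formed
     • Minimum support (min_sup)
         • Find the itemsets that satisfy min_sup
     • The number of itemsets is very large:
         1. With 100 items there are 2^100 - 1 possible itemsets
         2. If the 9-itemset {a1, a2, ..., a9} satisfies min_sup, then {a1}, {a2}, {a1, a2}, {a1, a9}, {a1, a2, a3}, ... all satisfy min_sup as well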
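The two counting claims on this slide can be illustrated with a short Python sketch. It is only an illustration under the usual reading of the slide (every non-empty subset of a frequent itemset is itself frequent); the helper name `nonempty_subsets` is hypothetical.

```python
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

# Number of non-empty itemsets over 100 distinct items.
n_itemsets_100 = 2**100 - 1
print(n_itemsets_100)

# A frequent 9-itemset {a1, ..., a9} implies that all of its
# non-empty subsets are frequent too; there are 2^9 - 1 of them.
nine = {f"a{i}" for i in range(1, 10)}
subsets = list(nonempty_subsets(nine))
print(len(subsets))  # 511
```

Enumerating the subsets is feasible for 9 items, but the same enumeration over 100 items is not, which is why a level-wise search with pruning is used instead.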
  8. Apriori: Overview-1
     • Finds the itemsets that satisfy min_sup
         • Agrawal & Srikant 1994
     • Example with min_sup = 2
     1. Scan D (TID/item table as on slide 4) and count every 1-itemset, giving C1

     C1
     Itemset   Sup. count
     {I1}      6
     {I2}      7
     {I3}      6
     {I4}      2
     {I5}      2
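Step 1 amounts to a single pass over D with one counter per item. A minimal Python sketch follows; the variable names are illustrative, and T300 is entered as {I2, I3}, an assumption made because that is the reading under which the C1 counts on this slide ({I3}: 6, {I4}: 2) are reproduced.

```python
from collections import Counter

# Transactions T100..T900 (T300 entered as {I2, I3}: assumption).
D = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]

# One scan of D: every item occurrence increments that item's counter.
C1 = Counter(item for transaction in D for item in transaction)

for item in sorted(C1):
    print(item, C1[item])
# I1 6
# I2 7
# I3 6
# I4 2
# I5 2
```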
  9. Apriori: Overview-2
     2. Keep the itemsets that satisfy min_sup (C1 → L1)
     3. Generate candidate (k+1)-itemsets from the frequent k-itemsets (L1 → C2)

     C1 and L1 (identical here, since every count is at least min_sup = 2)
     Itemset   Sup. count
     {I1}      6
     {I2}      7
     {I3}      6
     {I4}      2
     {I5}      2

     C2
     {I1, I2}, {I1, I3}, {I1, I4}, {I1, I5}, {I2, I3},
     {I2, I4}, {I2, I5}, {I3, I4}, {I3, I5}, {I4, I5}
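Steps 2 and 3 for k = 1 can be sketched in a few lines of Python. This is an illustration only: the counts are copied from the C1 table on the slide, and for 1-itemsets the candidate generation reduces to taking all pairs of frequent items.

```python
from itertools import combinations

MIN_SUP = 2

# Support counts of the 1-itemsets, as listed in C1 on the slide.
C1 = {"I1": 6, "I2": 7, "I3": 6, "I4": 2, "I5": 2}

# Step 2: keep the items whose count reaches min_sup.
L1 = sorted(item for item, count in C1.items() if count >= MIN_SUP)

# Step 3: for k = 1, every pair of frequent items is a candidate 2-itemset.
C2 = [frozenset(pair) for pair in combinations(L1, 2)]

print(L1)       # ['I1', 'I2', 'I3', 'I4', 'I5']
print(len(C2))  # 10
```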
  10. Apriori: Overview-3
      4. Count each candidate itemset
          • DB … HDD
      5. Keep the itemsets that satisfy min_sup

      C2 with support counts, and the resulting L2
      Itemset    Sup. count   In L2
      {I1, I2}   4            yes
      {I1, I3}   4            yes
      {I1, I4}   1            no
      {I1, I5}   2            yes
      {I2, I3}   4            yes
      {I2, I4}   2            yes
      {I2, I5}   2            yes
      {I3, I4}   0            no
      {I3, I5}   1            no
      {I4, I5}   0            no
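Steps 4 and 5 can be reproduced with another pass over D. The sketch below is illustrative Python with hypothetical variable names; T300 is entered as {I2, I3}, an assumption under which the counts in the table above (for example {I2, I3}: 4 and {I2, I4}: 2) come out as shown.

```python
from itertools import combinations

MIN_SUP = 2

# Transactions T100..T900 (T300 entered as {I2, I3}: assumption).
D = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]

items = ["I1", "I2", "I3", "I4", "I5"]
C2 = list(combinations(items, 2))

# Step 4: count, for each candidate pair, the transactions containing it.
counts = {pair: sum(1 for t in D if set(pair) <= t) for pair in C2}

# Step 5: keep the pairs whose count reaches min_sup.
L2 = {pair: n for pair, n in counts.items() if n >= MIN_SUP}

for pair, n in L2.items():
    print(pair, n)
```

Every candidate is tested against every transaction here, so the cost of this step grows with both the size of D and the number of candidates.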
  11. Apriori: Overview-4
      • L1, L2, L3: the itemsets that satisfy min_sup
      • L1 → C2, L2 → C3, L3 → C4

      L2
      Itemset    Sup. count
      {I1, I2}   4
      {I1, I3}   4
      {I1, I5}   2
      {I2, I3}   4
      {I2, I4}   2
      {I2, I5}   2

      C3
      {I1, I2, I3}, {I1, I2, I5}

      L3
      Itemset        Sup. count
      {I1, I2, I3}   2
      {I1, I2, I5}   2

      C4
      (no itemsets)
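The whole level-wise loop (L1 → C2 → L2 → C3 → L3 → C4) can be put together as follows. This is a compact Python sketch under stated assumptions rather than the authors' implementation: the function name `apriori` and its return format are hypothetical, candidates are formed by taking unions of pairs of frequent k-itemsets (a simple stand-in for a join on shared items), and T300 is entered as {I2, I3} so that the counts match the tables on the slides.

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Return [L1, L2, ...], each a dict mapping frozenset -> support count."""
    def count(candidates):
        # One pass over the transactions per level.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    singletons = {frozenset([item]) for t in transactions for item in t}
    Lk = {c: n for c, n in count(singletons).items() if n >= min_sup}
    levels, k = [], 1
    while Lk:
        levels.append(Lk)
        frequent = set(Lk)
        # Candidate generation: unions of two frequent k-itemsets of size k+1.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k + 1}
        # Pruning: every k-item subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k))}
        Lk = {c: n for c, n in count(candidates).items() if n >= min_sup}
        k += 1
    return levels

# Transactions T100..T900 (T300 entered as {I2, I3}: assumption).
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]

levels = apriori(D, min_sup=2)
for k, Lk in enumerate(levels, start=1):
    print(f"L{k}:", sorted((sorted(s), n) for s, n in Lk.items()))
```

On this data the loop stops after L3, because the only possible 4-itemset is pruned before it is ever counted.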
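  12. Apriori (itemset lattice with support counts)
      4-itemsets: {I1, I2, I3, I5}
      3-itemsets: {I1, I2, I3}: 2, {I1, I2, I5}: 2, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}
      2-itemsets: {I1, I2}: 4, {I1, I3}: 4, {I1, I4}: 1, {I1, I5}: 2, {I2, I3}: 4, {I2, I4}: 2, {I2, I5}: 2, {I3, I4}: 0, {I3, I5}: 1, {I4, I5}: 0
      1-itemsets: {I1}: 6, {I2}: 7, {I3}: 6, {I4}: 2, {I5}: 2
      0-itemsets: {}
      • ×: itemset that a DB count shows does not satisfy min_sup
      • ×: itemset discarded because a k-itemset contained in it does not satisfy min_sup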
  13. Apriori: Pruning Phase
      • To build (k+1)-itemsets from k-itemsets, join two itemsets that share k-1 items into one (k+1)-itemset
          • {I1, I2}, {I1, I3} share I1 → {I1, I2, I3}
          • {I1, I2, I3}, {I1, I2, I5} share I1, I2 → {I1, I2, I3, I5}
      • Check the k-itemsets contained in each (k+1)-itemset (for {I1, I2, I3}: {I1, I2}, {I1, I3}, {I2, I3}); all of them must be frequent k-itemsets
          • {I1, I3, I5} contains {I3, I5}, which is not frequent, so {I1, I3, I5} cannot satisfy min_sup
      (figure: the 3-itemset and 2-itemset levels of the lattice on slide 12, with the 2-itemset counts 4, 4, 1, 2, 4, 2, 2, 0, 1, 0)
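The join and the subset check described on this slide can be written as two small functions. The Python sketch below is illustrative: the names `join` and `prune` are hypothetical, itemsets are represented as sorted tuples, and the join uses the common formulation in which two k-itemsets are merged when their first k-1 items agree, which is one way to realise "share k-1 items".

```python
from itertools import combinations

def join(Lk):
    """Merge pairs of sorted k-itemsets that agree on their first k-1 items."""
    Lk = sorted(Lk)
    return [a + (b[-1],) for a, b in combinations(Lk, 2) if a[:-1] == b[:-1]]

def prune(candidates, Lk):
    """Split candidates into kept ones and (candidate, infrequent subsets)."""
    frequent = set(Lk)
    kept, dropped = [], []
    for c in candidates:
        missing = [s for s in combinations(c, len(c) - 1) if s not in frequent]
        if missing:
            dropped.append((c, missing))
        else:
            kept.append(c)
    return kept, dropped

# L2 as listed on the slides.
L2 = [("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
      ("I2", "I3"), ("I2", "I4"), ("I2", "I5")]

C3, dropped3 = prune(join(L2), L2)
print(C3)  # [('I1', 'I2', 'I3'), ('I1', 'I2', 'I5')]
for candidate, missing in dropped3:
    print("pruned", candidate, "because of", missing)

# Both itemsets in C3 have count 2 on the slides, so L3 = C3 here.
C4, dropped4 = prune(join(C3), C3)
print(C4)  # []
```

Pruning happens before any DB scan, so {I1, I3, I5} and {I1, I2, I3, I5} are never counted against the transactions.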
  14. • DIC starts counting 2-itemsets while the 1-itemsets are still being counted (S. Brin, R. Motwani, J. Ullman, and S. Tsur, 1997)
          • Once {I2} and {I4} are seen to satisfy min_sup, counting of {I2, I4} begins; {I2, I4} … min_sup
      • DB scans over the TID/item table of slide 4 (figure): Apriori counts 1-itemsets, 2-itemsets and 3-itemsets in successive passes, while DIC overlaps the counting of 1-itemsets, 2-itemsets and 3-itemsets
  15. • Hash
          • (k+1)-itemset … k-itemset
      • … PC
      • Heap … itemset
      • FP-tree (J. Han, J. Pei and Y. Yin, 2000)
      • S. Brin, R. Motwani and C. Silverstein, 1997
      • S. Morishita and J. Sese, 2000
