Introduction to Machine Learning
Lecture 14
Advanced Topics in Association Rules Mining

Albert Orriols i Puig
aorriols@salle.url.edu

Artificial Intelligence – Machine Learning
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull

Recap of Lecture 13
        Ideas come from market basket analysis (MBA)
                Let’s go shopping!

                Customer1: milk, eggs, sugar, bread
                Customer2: milk, eggs, cereal, bread
                Customer3: eggs, sugar

                What do my customers buy? Which products are bought together?
                Aim: Find associations and correlations between the different
                items that customers place in their shopping basket

Recap of Lecture 13
        Database TDB
                Tid    Items
                10     A, C, D
                20     B, C, E
                30     A, B, C, E
                40     B, E

        1st scan: C1 -> L1
                C1: Itemset    sup          L1: Itemset    sup
                    {A}        2                {A}        2
                    {B}        3                {B}        3
                    {C}        3                {C}        3
                    {D}        1                {E}        3
                    {E}        3

        2nd scan: C2 -> L2
                C2: Itemset    sup          L2: Itemset    sup
                    {A, B}     1                {A, C}     2
                    {A, C}     2                {B, C}     2
                    {A, E}     1                {B, E}     3
                    {B, C}     2                {C, E}     2
                    {B, E}     3
                    {C, E}     2

        3rd scan: C3 -> L3
                C3: Itemset                 L3: Itemset    sup
                    {B, C, E}                   {B, C, E}  2

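The level-wise procedure in the tables above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the minimum support count of 2 is inferred from the tables (itemsets with sup 1 are dropped), and the names `count_support` and `apriori` are hypothetical.

```python
from itertools import combinations

# Transaction database TDB from the slide (Tid -> items).
TDB = {
    10: {"A", "C", "D"},
    20: {"B", "C", "E"},
    30: {"A", "B", "C", "E"},
    40: {"B", "E"},
}
MIN_SUP = 2  # assumption: absolute support count inferred from the tables


def count_support(candidates, transactions):
    """One database scan: count the transactions that contain each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def apriori(transactions, min_sup):
    """Return {frequent itemset (frozenset): support count}."""
    items = {i for t in transactions for i in t}
    candidates = {frozenset([i]) for i in items}  # C1
    frequent = {}
    k = 1
    while candidates:
        counts = count_support(candidates, transactions)
        level = {c: s for c, s in counts.items() if s >= min_sup}  # Lk
        frequent.update(level)
        k += 1
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


result = apriori(list(TDB.values()), MIN_SUP)
for itemset, sup in sorted(result.items(),
                           key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

Each pass of the `while` loop corresponds to one scan of the database, so this run performs three scans, one per level L1, L2, and L3.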
Recap of Lecture 13
        Challenges
                Apriori scans the database multiple times
                Most often, there is a high number of candidates
                Support counting for candidates can be time-consuming

        Several methods try to improve these points by
                Reducing the number of scans of the database
                Shrinking the number of candidates
                Counting the support of candidates more efficiently

Today’s Agenda
        Starting a journey through some advanced topics in ARM
                Mining frequent patterns without candidate generation
                Multiple-level AR
                Sequential pattern mining
                Quantitative association rules
                Mining class association rules
                Beyond support & confidence
                Applications

Revisiting Candidate Generation
        Remember Apriori?
                Use the frequent (k-1)-itemsets to generate the candidate
                k-itemsets
                Count the support of the itemsets by scanning the database

        Bottleneck in the process: Candidate generation
                Suppose 100 items
                First level of the tree: 100 nodes
                Second level of the tree: $\binom{100}{2}$ nodes
                In general, number of k-itemsets: $\binom{100}{k}$

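To get a feel for how quickly the number of possible k-itemsets grows with 100 items, the binomial coefficients can be evaluated directly. This is a small sketch; the helper name `n_itemsets` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

N_ITEMS = 100  # number of distinct items, as on the slide


def n_itemsets(k, n=N_ITEMS):
    """Number of distinct k-itemsets over n items: the binomial C(n, k)."""
    return comb(n, k)


for k in (1, 2, 3, 10):
    print(f"level {k}: {n_itemsets(k)} possible itemsets")

# Summed over all levels, every non-empty subset of the items is counted once.
total = sum(n_itemsets(k) for k in range(1, N_ITEMS + 1))
print(total == 2 ** N_ITEMS - 1)  # True
```

The second level already has 4950 possible itemsets, and the total over all levels is 2^100 - 1, which is why pruning candidates (or avoiding them altogether) matters.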
Can We Avoid Generation?
        Build an auxiliary structure to get statistics about the
        itemsets in order to avoid candidate generation
                Use an FP-tree
                Avoid multiple scans of the data

        Divide-and-conquer methodology
        Avoid candidate generation
        Outline of the process:
                Generate an FP-tree
                Mine the FP-tree

Building the FP-Tree
        TID    Items                  Sorted FIS
        1      {F,A,C,D,G,I,M,P}      {F,C,A,M,P}
        2      {A,B,C,F,L,M,O}        {F,C,A,B,M}
        3      {B,F,H,J,O}            {F,B}
        4      {B,C,K,S,P}            {C,B,P}
        5      {A,F,C,E,L,P,M,N}      {F,C,A,M,P}

        Scan the DB for the first time and identify the frequent items. They
        are: <(f:4), (c:4), (a:3), (b:3), (m:3), (p:3)>
        In the last column (Sorted FIS), the frequent items of each
        transaction are sorted by decreasing frequency

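The first scan can be sketched in Python as shown below. The minimum support count of 3 is an assumption: the slide does not state it, but it is consistent with the listed frequent items (L and O, with count 2, are excluded). Ties in frequency (F/C and A/B/M/P) can be broken in any fixed way, so the sketch simply reuses the order from the slide and checks that it is a valid descending-frequency order.

```python
from collections import Counter

# The five transactions from the slide.
transactions = [
    list("FACDGIMP"),
    list("ABCFLMO"),
    list("BFHJO"),
    list("BCKSP"),
    list("AFCELPMN"),
]
MIN_SUP = 3  # assumption: not stated on the slide, inferred from its result

# First scan: count every item and keep the frequent ones.
counts = Counter(item for t in transactions for item in t)
frequent = {i: c for i, c in counts.items() if c >= MIN_SUP}

# Item order taken from the slide; ties are broken arbitrarily there.
ORDER = ["F", "C", "A", "B", "M", "P"]
assert set(ORDER) == set(frequent)
assert all(frequent[a] >= frequent[b] for a, b in zip(ORDER, ORDER[1:]))
rank = {item: r for r, item in enumerate(ORDER)}

# Drop infrequent items and sort the rest by the global order.
sorted_fis = [
    sorted((i for i in t if i in rank), key=rank.get) for t in transactions
]
for tid, fis in enumerate(sorted_fis, start=1):
    print(tid, fis)
```

Using one global item order for every transaction is what lets transactions with common frequent items share a prefix in the tree.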
Building the FP-Tree
        Scan the DB again to build the tree

        After reading TID1 {F,C,A,M,P}:
                root
                  F:1
                    C:1
                      A:1
                        M:1
                          P:1

Building the FP-Tree
        After reading TID2 {F,C,A,B,M}:
                root
                  F:2
                    C:2
                      A:2
                        M:1
                          P:1
                        B:1
                          M:1

Building the FP-Tree
        After reading TID3 {F,B}:
                root
                  F:3
                    C:2
                      A:2
                        M:1
                          P:1
                        B:1
                          M:1
                    B:1

Building the FP-Tree
        After reading TID4 {C,B,P}:
                root
                  F:3
                    C:2
                      A:2
                        M:1
                          P:1
                        B:1
                          M:1
                    B:1
                  C:1
                    B:1
                      P:1

Building the FP-Tree
        After reading TID5 {F,C,A,M,P}:
                root
                  F:4
                    C:3
                      A:3
                        M:2
                          P:2
                        B:1
                          M:1
                    B:1
                  C:1
                    B:1
                      P:1

Building the FP-Tree
        Header table (Item): F, C, A, B, M, P

        FP-tree:
                root
                  F:4
                    C:3
                      A:3
                        M:2
                          P:2
                        B:1
                          M:1
                    B:1
                  C:1
                    B:1
                      P:1

        Build an index to quickly access the nodes and traverse the tree

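The second scan, which inserts each sorted transaction into the tree and maintains the index of node-links, might look like the following sketch. The class and function names (`Node`, `build_fp_tree`, `dump`) are hypothetical, and the header table is modelled as a plain dictionary from item to the list of its nodes.

```python
class Node:
    """One FP-tree node: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> child Node


def build_fp_tree(sorted_transactions):
    """Insert each sorted transaction as a path starting at the root."""
    root = Node(None, None)
    header = {}  # item -> list of nodes holding that item (node-links)
    for transaction in sorted_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:  # no shared prefix left: open a new branch
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def dump(node, depth=0):
    """Return the tree as indented 'item:count' lines, depth first."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(dump(child, depth + 1))
    return lines


# Sorted frequent items of the five transactions, as in the slides.
SORTED_FIS = ["FCAMP", "FCABM", "FB", "CBP", "FCAMP"]
root, header = build_fp_tree(SORTED_FIS)
print("\n".join(dump(root)))
```

Shared prefixes only increment counts, so the five transactions end up in a tree with eleven nodes below the root.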
Mining the FP-Tree
        Properties to mine the FP-tree
                Node-link prop.: All possible itemsets in which the frequent item
                a is included can be found by following a’s node-links

        Header table (Item): F, C, A, B, M, P

        FP-tree:
                root
                  F:4
                    C:3
                      A:3
                        M:2
                          P:2
                        B:1
                          M:1
                    B:1
                  C:1
                    B:1
                      P:1

        P has a support of 3
        Two paths in the FP-tree for node P:
                1. {F,C,A,M,P}
                2. {C,B,P}

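The node-link property can be illustrated with a short sketch that follows the links of item P and walks each node back to the root. The tree is rebuilt here so the example is self-contained; `Node`, `build`, and `path_to_root` are hypothetical names.

```python
class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}


def build(transactions):
    """Build the FP-tree; return the root and the node-links per item."""
    root, links = Node(None, None), {}
    for t in transactions:
        node = root
        for item in t:
            if item not in node.children:
                node.children[item] = Node(item, node)
                links.setdefault(item, []).append(node.children[item])
            node = node.children[item]
            node.count += 1
    return root, links


def path_to_root(node):
    """Items on the path from the root down to (and including) the node."""
    path = []
    while node.parent is not None:
        path.append(node.item)
        node = node.parent
    return path[::-1]


# Sorted frequent items of the five transactions, as in the slides.
root, links = build(["FCAMP", "FCABM", "FB", "CBP", "FCAMP"])

p_nodes = links["P"]                        # follow P's node-links
support_p = sum(n.count for n in p_nodes)   # 2 + 1
paths = [(path_to_root(n), n.count) for n in p_nodes]
print(support_p)
for path, count in paths:
    print(path, count)
```

Only the nodes reachable through P's links are visited, so the rest of the tree never has to be traversed to find the itemsets that contain P.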
Mining the FP-Tree
                Prefix path prop.: To calculate the frequent patterns for a node
                a in path P, only the prefix subpath of node a in P
                needs to be accumulated, and the frequency count of every
                node in the prefix path should carry the same count as node a

        Example:
                Node P is involved in the path:
                        (F:4,C:3,A:3,M:2,P:2)
                Take the prefix of the path until M
                        (F:4,C:3,A:3)
                Adjust counts to 2 (the count of M)
                        (F:2,C:2,A:2)
                So, F, C, and A co-occur with M

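Collecting the prefix paths of every node of an item, each carrying that node's count, gives the item's conditional pattern base. The sketch below does this for M and P; the tree is rebuilt so the example runs on its own, and `conditional_pattern_base` is a hypothetical helper name.

```python
class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}


def build(transactions):
    """Build the FP-tree; return the node-links per item."""
    root, links = Node(None, None), {}
    for t in transactions:
        node = root
        for item in t:
            if item not in node.children:
                node.children[item] = Node(item, node)
                links.setdefault(item, []).append(node.children[item])
            node = node.children[item]
            node.count += 1
    return links


def conditional_pattern_base(item, links):
    """Prefix path of each node of `item`, weighted by that node's count."""
    base = []
    for node in links[item]:
        prefix, parent = [], node.parent
        while parent.parent is not None:  # stop at the root
            prefix.append(parent.item)
            parent = parent.parent
        if prefix:
            # The whole prefix carries the count of the node itself.
            base.append((prefix[::-1], node.count))
    return base


links = build(["FCAMP", "FCABM", "FB", "CBP", "FCAMP"])
print(conditional_pattern_base("M", links))
print(conditional_pattern_base("P", links))
```

For M this yields the prefix F, C, A with count 2 and the prefix F, C, A, B with count 1: the counts come from the two M nodes, not from the larger counts stored on F, C, and A.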
Mining the FP-Tree
                Fragment growth: Let α be an itemset in DB, B be α’s
                conditional pattern base, and β be an itemset in B. Then, the
                support of α ∪ β in DB is equal to the support of β in B.

        For M, we had the prefix paths
                (F:2,C:2,A:2)
                (F:1,C:1,A:1,B:1)

        Merging both paths gives M’s conditional tree:
                root
                  F:3
                    C:3
                      A:3
                        B:1

        Therefore,
                {(F,C,A,M):3}, {(F,C,M):3}, …

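Putting the three properties together, a compact recursive FP-growth could be sketched as below. It is an illustration under stated assumptions rather than a reference implementation: the minimum support count of 3 is inferred from the example, the names `build_tree` and `fp_growth` are hypothetical, and ties in item frequency are broken alphabetically, so the tree shape may differ from the slides while the mined patterns stay the same.

```python
from collections import Counter


class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}


def build_tree(weighted, min_sup):
    """Build an FP-tree from (items, count) pairs; return its node-links."""
    counts = Counter()
    for items, c in weighted:
        for i in items:
            counts[i] += c
    keep = {i for i, c in counts.items() if c >= min_sup}
    root, header = Node(None, None), {}
    for items, c in weighted:
        ordered = sorted((i for i in items if i in keep),
                         key=lambda i: (-counts[i], i))
        node = root
        for i in ordered:
            if i not in node.children:
                node.children[i] = Node(i, node)
                header.setdefault(i, []).append(node.children[i])
            node = node.children[i]
            node.count += c
    return header


def fp_growth(weighted, min_sup, suffix=frozenset()):
    """Return {frequent itemset: support} without generating candidates."""
    patterns = {}
    for item, nodes in build_tree(weighted, min_sup).items():
        pattern = suffix | {item}
        patterns[pattern] = sum(n.count for n in nodes)
        # Conditional pattern base: prefix paths carrying the node's count.
        base = []
        for n in nodes:
            prefix, p = [], n.parent
            while p.parent is not None:
                prefix.append(p.item)
                p = p.parent
            if prefix:
                base.append((prefix, n.count))
        # Fragment growth: mine the base and grow the current pattern.
        patterns.update(fp_growth(base, min_sup, pattern))
    return patterns


DB = ["FACDGIMP", "ABCFLMO", "BFHJO", "BCKSP", "AFCELPMN"]
MIN_SUP = 3  # assumption: inferred from the example, not stated on the slide
patterns = fp_growth([(list(t), 1) for t in DB], MIN_SUP)
for itemset, sup in sorted(patterns.items(),
                           key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

In M's conditional pattern base, the counts of the two prefix paths add up to 3 for F, C, and A, so patterns such as {F,C,A,M} come out with support 3, which matches a direct count over the five transactions.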
Is FP-growth Faster than Apriori?
                As the support threshold goes down, the number of itemsets
                increases dramatically. FP-growth does not need to generate
                candidates and test them.

Is FP-growth Faster than Apriori?
                Both FP-growth and Apriori scale linearly with the number of
                transactions, but FP-growth is more efficient.

Next Class
        Advanced topics in association rule mining
 

Lecture14 - Advanced topics in association rules

  • 1. Introduction to Machine Learning. Lecture 14: Advanced Topics in Association Rules Mining. Albert Orriols i Puig, aorriols@salle.url.edu. Artificial Intelligence – Machine Learning, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull
  • 2. Recap of Lecture 13. The ideas come from market basket analysis (MBA). Let's go shopping! Customer 1: milk, eggs, sugar, bread. Customer 2: milk, eggs, cereal, bread. Customer 3: eggs, sugar. What do my customers buy? Which products are bought together? Aim: find associations and correlations between the different items that customers place in their shopping basket.
  • 3. Recap of Lecture 13
      Database TDB (Tid: items): 10: A, C, D; 20: B, C, E; 30: A, B, C, E; 40: B, E
      1st scan, C1 (itemset: sup): {A}: 2, {B}: 3, {C}: 3, {D}: 1, {E}: 3
      L1: {A}: 2, {B}: 3, {C}: 3, {E}: 3
      C2 (candidates): {A,B}, {A,C}, {A,E}, {B,C}, {B,E}, {C,E}
      2nd scan, C2 (itemset: sup): {A,B}: 1, {A,C}: 2, {A,E}: 1, {B,C}: 2, {B,E}: 3, {C,E}: 2
      L2: {A,C}: 2, {B,C}: 2, {B,E}: 3, {C,E}: 2
      C3 (candidates): {B,C,E}
      3rd scan, L3: {B,C,E}: 2
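The trace on this slide can be reproduced with a short Apriori sketch. This is a minimal illustration, not the lecture's own code: the function names are hypothetical, and the absolute support threshold of 2 is an assumption inferred from which itemsets the slide prunes ({D}, {A,B} and {A,E}, each with sup 1).

```python
from itertools import combinations

# Transactions from the recap slide (Tid -> items).
TDB = {
    10: {"A", "C", "D"},
    20: {"B", "C", "E"},
    30: {"A", "B", "C", "E"},
    40: {"B", "E"},
}
MIN_SUP = 2  # assumed absolute threshold, inferred from the pruning on the slide


def support_counts(candidates, transactions):
    """One database scan: count the transactions that contain each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def apriori(transactions, min_sup):
    """Return [L1, L2, ...], each a dict mapping frozenset -> support."""
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    levels = []
    k = 1
    while candidates:
        counts = support_counts(candidates, transactions)
        frequent = {c: s for c, s in counts.items() if s >= min_sup}
        if not frequent:
            break
        levels.append(frequent)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        ]
    return levels


levels = apriori(list(TDB.values()), MIN_SUP)
for k, level in enumerate(levels, start=1):
    print(f"L{k}:", {"".join(sorted(c)): s for c, s in level.items()})
```

Each pass through the loop corresponds to one scan of the database, which is the cost the rest of the lecture tries to avoid.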
  • 4. Recap of Lecture 13. Challenges: Apriori scans the database multiple times. Most often, there is a high number of candidates. Support counting for candidates can be time-consuming. Several methods try to improve on these points by: reducing the number of scans of the database; shrinking the number of candidates; counting the support of candidates more efficiently.
  • 5. Today’s Agenda. Starting a journey through some advanced topics in ARM: mining frequent patterns without candidate generation; multiple-level AR; sequential pattern mining; quantitative association rules; mining class association rules; beyond support & confidence; applications.
  • 6. Revisiting Candidate Generation. Remember Apriori? Use the previous frequent (k-1)-itemsets to generate the k-itemsets. Count itemset support by scanning the database. Bottleneck in the process: candidate generation. Suppose 100 items. First level of the tree: 100 nodes. Second level of the tree: $\binom{100}{2}$ nodes. In general, the number of k-itemsets is $\binom{100}{k}$.
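To get a feel for how quickly these binomial coefficients grow, the counts can be evaluated directly. The sketch below only uses the 100-item figure from the slide; the particular values of k are chosen for illustration.

```python
from math import comb

N_ITEMS = 100  # number of distinct items, as on the slide

# Number of possible k-itemsets for a few illustrative values of k.
for k in (1, 2, 3, 5, 10):
    print(f"k={k:2d}: {comb(N_ITEMS, k):,} possible {k}-itemsets")

# Summing over all k >= 1 counts every non-empty itemset: 2**100 - 1.
total = sum(comb(N_ITEMS, k) for k in range(1, N_ITEMS + 1))
print(f"all non-empty itemsets: {total:,}")
```

Apriori never enumerates all of these, since it only extends frequent itemsets, but the upper bound shows why candidate generation becomes the bottleneck at low support thresholds.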
  • 7. Can We Avoid Generation? Build an auxiliary structure that gathers statistics about the itemsets in order to avoid candidate generation. Use an FP-tree: avoid multiple scans of the data; divide-and-conquer methodology; avoid candidate generation. Outline of the process: generate an FP-tree; mine the FP-tree.
  • 8. Building the FP-Tree
      TID  Items               Sorted FIS
      1    {F,A,C,D,G,I,M,P}   {F,C,A,M,P}
      2    {A,B,C,F,L,M,O}     {F,C,A,B,M}
      3    {B,F,H,J,O}         {F,B}
      4    {B,C,K,S,P}         {C,B,P}
      5    {A,F,C,E,L,P,M,N}   {F,C,A,M,P}
      Scan the DB for the first time and identify the frequent items. They are: <(f:4), (c:4), (a:3), (b:3), (m:3), (p:3)>. The last column lists the frequent items of each transaction, sorted by descending frequency.
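The first scan can be sketched as follows. Two things here are assumptions rather than statements from the slide: the minimum support count of 3 (inferred from the items the slide keeps), and the tie-breaking rule, which simply reuses the slide's order F, C, A, B, M, P because several items share the same frequency.

```python
from collections import Counter

# Transactions from the slide, one string of single-letter items per TID.
DB = {
    1: "FACDGIMP",
    2: "ABCFLMO",
    3: "BFHJO",
    4: "BCKSP",
    5: "AFCELPMN",
}
MIN_SUP = 3  # assumed threshold, inferred from the frequent items on the slide

# First scan: count how many transactions contain each item.
counts = Counter(item for items in DB.values() for item in items)
frequent = {item: c for item, c in counts.items() if c >= MIN_SUP}

# Ties (F/C at 4; A/B/M/P at 3) are broken using the slide's order (assumption).
SLIDE_ORDER = "FCABMP"
rank = {item: pos for pos, item in enumerate(SLIDE_ORDER)}
f_list = sorted(frequent, key=lambda i: (-frequent[i], rank.get(i, len(rank))))

# "Sorted FIS" column: keep only frequent items, in f_list order.
sorted_fis = {tid: [i for i in f_list if i in items] for tid, items in DB.items()}

print("frequent items:", [(i, frequent[i]) for i in f_list])
for tid, items in sorted_fis.items():
    print(tid, "".join(items))
```

Any consistent tie-breaking rule yields a valid FP-tree; a different rule would only change the shape of the tree, not the patterns mined from it.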
  • 9. Building the FP-Tree. Scan the DB again to build the tree. After reading TID 1 (sorted FIS {F,C,A,M,P}): root → F:1 → C:1 → A:1 → M:1 → P:1
  • 10. Building the FP-Tree. After reading TID 2 (sorted FIS {F,C,A,B,M}): root → F:2 → C:2 → A:2, where A:2 has two branches: M:1 → P:1 and B:1 → M:1
  • 11. Building the FP-Tree. After reading TID 3 (sorted FIS {F,B}): root → F:3, where F:3 has two children: C:2 → A:2 (with branches M:1 → P:1 and B:1 → M:1) and B:1
  • 12. Building the FP-Tree. After reading TID 4 (sorted FIS {C,B,P}): root has two children, F:3 and C:1. Under F:3: C:2 → A:2 (with branches M:1 → P:1 and B:1 → M:1), plus B:1. Under C:1: B:1 → P:1
  • 13. Building the FP-Tree. After reading TID 5 (sorted FIS {F,C,A,M,P}): root has two children, F:4 and C:1. Under F:4: C:3 → A:3 (with branches M:2 → P:2 and B:1 → M:1), plus B:1. Under C:1: B:1 → P:1
  • 14. Building the FP-Tree. Final tree: root has two children, F:4 and C:1. Under F:4: C:3 → A:3 (with branches M:2 → P:2 and B:1 → M:1), plus B:1. Under C:1: B:1 → P:1. Item header table: F, C, A, B, M, P, each entry linked to the nodes that carry that item. Build an index to access the nodes quickly and traverse the tree.
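The construction walked through on the previous slides can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Node` class, `build_fp_tree` and `render` are hypothetical names, and the header table is modelled as a plain dict from item to the list of nodes carrying it, standing in for the node-links of the slide.

```python
class Node:
    """One FP-tree node: an item, its count, a parent link and child nodes."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> Node


def build_fp_tree(sorted_transactions):
    """Second scan: insert each sorted transaction as a path from the root."""
    root = Node(None, None)
    header = {}  # item -> nodes carrying that item (stand-in for node-links)
    for transaction in sorted_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                # No shared prefix from here on: open a new branch.
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def render(node, depth=0):
    """Return the tree as indented 'item:count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(render(child, depth + 1))
    return lines


# "Sorted FIS" column of the slides, TID 1 to 5.
SORTED_FIS = ["FCAMP", "FCABM", "FB", "CBP", "FCAMP"]
root, header = build_fp_tree(SORTED_FIS)
print("\n".join(render(root)))
for item in "FCABMP":
    print(item, [n.count for n in header[item]])
```

Summing the counts in each header entry gives back the item's support, which is a quick sanity check on the tree.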
  • 15. Mining the FP-Tree. Properties to mine the FP-tree. Node-link property: all possible itemsets in which the frequent item a is included can be found by following a's node-links. Example: item P has a support of 3, and there are two paths in the FP-tree for node P: 1. {F,C,A,M,P}; 2. {C,B,P}.
  • 16. Mining the FP-Tree. Prefix path property: to calculate the frequent patterns for a node a in path P, only the prefix subpath of node a in P needs to be accumulated, and the frequency count of every node in the prefix path should carry the same count as node a. Example: node P is involved in the path (F:4,C:3,A:3,M:2,P:2). Take the prefix of the path up to M: (F:4,C:3,A:3). Adjust the counts to 2, the count of M on this path: (F:2,C:2,A:2). So F, C, and A co-occur with M.
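The prefix paths with adjusted counts form what the next slide calls a conditional pattern base. As a sketch, they can be read straight off the sorted transactions instead of the tree: two transactions share a tree node exactly when they share the same prefix, so grouping identical prefixes gives the same counts as following the node-links. The function name is hypothetical.

```python
from collections import Counter

# "Sorted FIS" column of the slides, TID 1 to 5.
SORTED_FIS = ["FCAMP", "FCABM", "FB", "CBP", "FCAMP"]


def conditional_pattern_base(item, sorted_transactions):
    """Map each prefix path of `item` to the count of `item` on that path."""
    base = Counter()
    for transaction in sorted_transactions:
        if item in transaction:
            prefix = transaction[: transaction.index(item)]
            if prefix:  # an item directly under the root has no prefix path
                base[prefix] += 1
    return dict(base)


for item in "MP":
    print(item, conditional_pattern_base(item, SORTED_FIS))
```

For M this yields FCA with count 2 and FCAB with count 1, i.e. the adjusted path (F:2,C:2,A:2) from the slide plus the second path through B.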
  • 17. Mining the FP-Tree. Fragment growth: let α be an itemset in DB, B be α’s conditional pattern base, and β be an itemset in B. Then the support of α ∪ β in DB is equivalent to the support of β in B. For M, we had the prefix paths (F:2,C:2,A:2) and (F:1,C:1,A:1,B:1). Therefore, {(F,C,A,M):2}, {(F,C,M):2}, … (Figure: conditional tree for M with nodes F:2, C:2, A:2, B:1.)
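The fragment-growth property can be checked numerically on the running example. The sketch below is an illustration, not the FP-growth algorithm itself: it enumerates every itemset β in M's conditional pattern base by brute force, and the threshold of 3 is an assumption. Note that summing both prefix paths gives F:3, C:3, A:3, so {F,C,A,M} has support 3 over the whole database (TIDs 1, 2 and 5); the count of 2 on the slide matches the first prefix path on its own.

```python
from itertools import combinations

# Original transactions (TID 1 to 5) and M's conditional pattern base.
DB = ["FACDGIMP", "ABCFLMO", "BFHJO", "BCKSP", "AFCELPMN"]
BASE_M = {"FCA": 2, "FCAB": 1}  # prefix path -> count, from the slides
MIN_SUP = 3  # assumed threshold


def support_in_base(beta, base):
    """Support of itemset beta inside a conditional pattern base."""
    return sum(count for prefix, count in base.items() if set(beta) <= set(prefix))


def support_in_db(itemset, db):
    """Support of an itemset in the original database."""
    return sum(1 for transaction in db if set(itemset) <= set(transaction))


items = sorted({item for prefix in BASE_M for item in prefix})
patterns = {}
for size in range(1, len(items) + 1):
    for beta in combinations(items, size):
        s = support_in_base(beta, BASE_M)
        # Fragment growth: support of beta in B equals support of beta ∪ {M} in DB.
        assert s == support_in_db(beta + ("M",), DB)
        if s >= MIN_SUP:
            patterns["".join(beta) + "M"] = s

print(patterns)
```

B appears only once in the base, so every pattern containing both B and M falls below the assumed threshold and is dropped.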
  • 18. Is FP-growth Faster than Apriori? As the support threshold goes down, the number of itemsets increases dramatically. FP-growth does not need to generate candidates and test them.
  • 19. Is FP-growth Faster than Apriori? Both FP-growth and Apriori scale linearly with the number of transactions, but FP-growth is more efficient.
  • 20. Next Class: Advanced topics in association rule mining
  • 21. Introduction to Machine Learning. Lecture 14: Advanced Topics in Association Rules Mining. Albert Orriols i Puig, aorriols@salle.url.edu. Artificial Intelligence – Machine Learning, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull