Association Rule Mining
Ayesha Ali
Association Analysis
• Discovery of Association Rules
– showing attribute-value conditions that occur
frequently together in a set of data, e.g. market
basket
– Given a set of data, find rules that will predict the
occurrence of a data item based on the
occurrences of other items in the data
• A rule has the form body ⇒ head
– buys(Omar, “milk”) ⇒ buys(Omar, “sugar”)
Association Analysis
Location   Business Type
1          Barber, Bakery, Convenience Store, Meat Shop, Fast Food
2          Bakery, Bookstore, Petrol Pump, Convenience Store, Library, Fast Food
3          Carpenter, Electrician, Barber, Hardware Store
4          Bakery, Vegetable Market, Flower Shop, Sweets Shop, Meat Shop
5          Convenience Store, Hospital, Pharmacy, Sports Shop, Gym, Fast Food
6          Internet Café, Gym, Games Shop, Sports Shop, Fast Food, Bakery
Association Rule: X ⇒ Y; e.g. (Fast Food, Bakery) ⇒ (Convenience Store)
Support S: fraction of locations that contain both X and Y = P(X ∪ Y), where X ∪ Y is the itemset holding every item of X and of Y
S(Fast Food, Bakery, Convenience Store) = 2/6 = 0.33
Confidence C: how often the items in Y appear in locations that contain X = P(Y | X) = P(X ∪ Y) / P(X)
C[(Fast Food, Bakery) ⇒ (Convenience Store)] = P(X ∪ Y) / P(X)
= 0.33/0.50 = 0.66
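The two measures can be checked directly against the location table. The following is a minimal sketch, not the author's code: the names LOCATIONS, support and confidence are illustrative, and the item spellings are normalized by assumption.

```python
# Six locations from the table; each set plays the role of one transaction.
# Item spellings are normalized (an assumption of this sketch).
LOCATIONS = [
    {"Barber", "Bakery", "Convenience Store", "Meat Shop", "Fast Food"},
    {"Bakery", "Bookstore", "Petrol Pump", "Convenience Store", "Library", "Fast Food"},
    {"Carpenter", "Electrician", "Barber", "Hardware Store"},
    {"Bakery", "Vegetable Market", "Flower Shop", "Sweets Shop", "Meat Shop"},
    {"Convenience Store", "Hospital", "Pharmacy", "Sports Shop", "Gym", "Fast Food"},
    {"Internet Café", "Gym", "Games Shop", "Sports Shop", "Fast Food", "Bakery"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(body, head, transactions):
    """support(body ∪ head) / support(body), an estimate of P(head | body)."""
    both = set(body) | set(head)
    return support(both, transactions) / support(body, transactions)


x = {"Fast Food", "Bakery"}
y = {"Convenience Store"}
s = support(x | y, LOCATIONS)
c = confidence(x, y, LOCATIONS)
print(f"support = {s:.2f}, confidence = {c:.2f}")
```

Working with exact fractions (2/6 and 2/3) rather than the rounded 0.33 and 0.50 avoids compounding rounding error in the confidence.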
Association Analysis
• Given a set of transactions T, the goal of
association rule mining is to find all rules having
– support ≥ minsup threshold
– confidence ≥ minconf threshold
• Brute-force approach:
– List all possible association rules
– Compute the support and confidence for each rule
– Prune rules that fail the minsup and minconf
thresholds
⇒ Computationally prohibitive!
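To see why listing every rule is prohibitive, count the rules that d items admit. Each item goes in the body, in the head, or in neither, and both sides must be non-empty, which gives 3^d − 2^(d+1) + 1 rules. The sketch below checks that closed form against direct enumeration for a small d; the symbol d and both function names are introduced here for illustration only.

```python
from itertools import combinations


def count_rules_by_enumeration(items):
    """Count rules body => head with disjoint, non-empty body and head."""
    items = list(items)
    total = 0
    for body_size in range(1, len(items)):
        for body in combinations(items, body_size):
            rest = [i for i in items if i not in body]
            # Every non-empty subset of the remaining items is a valid head.
            for head_size in range(1, len(rest) + 1):
                total += sum(1 for _ in combinations(rest, head_size))
    return total


def count_rules_closed_form(d):
    """3^d assignments, minus those with an empty body or an empty head."""
    return 3 ** d - 2 ** (d + 1) + 1


print(count_rules_by_enumeration("abc"), count_rules_closed_form(3))
print(f"{count_rules_closed_form(20):,}")
```

For the 20 business types used in these slides the closed form gives roughly 3.5 billion candidate rules, each of which would need its own support and confidence computation.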
Association Analysis
Location   Business Type
1          Barber, Bakery, Convenience Store, Meat Shop, Fast Food
2          Bakery, Bookstore, Petrol Pump, Convenience Store, Library, Fast Food
3          Carpenter, Electrician, Barber, Hardware Store, Meat Shop
4          Bakery, Vegetable Market, Flower Shop, Sweets Shop, Meat Shop
5          Convenience Store, Hospital, Pharmacy, Sports Shop, Gym, Fast Food
6          Internet Café, Gym, Sweets Shop, Sports Shop, Fast Food, Bakery
Association Rules:
(Fast Food, Bakery) ⇒ (Convenience Store)   Support S: .33   Confidence C: .66
(Convenience Store, Bakery) ⇒ (Fast Food)   Support S: .33   Confidence C: 1
(Fast Food, Convenience Store) ⇒ (Bakery)   Support S: .33   Confidence C: .66
(Convenience Store) ⇒ (Fast Food, Bakery)   Support S: .33   Confidence C: .66
(Fast Food) ⇒ (Convenience Store, Bakery)   Support S: .33   Confidence C: .50
(Bakery) ⇒ (Fast Food, Convenience Store)   Support S: .33   Confidence C: .50
Observations
• The above rules are binary partitions of the given item set
• Identical Support but different Confidence
• Support and Confidence thresholds may be different
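The binary partitions of one itemset can be enumerated mechanically. This is a hedged sketch rather than the author's procedure: count and rules_from are hypothetical helper names, item spellings are normalized by assumption, and the confidences it prints are computed directly from the six-location table.

```python
from itertools import combinations

LOCATIONS = [
    {"Barber", "Bakery", "Convenience Store", "Meat Shop", "Fast Food"},
    {"Bakery", "Bookstore", "Petrol Pump", "Convenience Store", "Library", "Fast Food"},
    {"Carpenter", "Electrician", "Barber", "Hardware Store"},
    {"Bakery", "Vegetable Market", "Flower Shop", "Sweets Shop", "Meat Shop"},
    {"Convenience Store", "Hospital", "Pharmacy", "Sports Shop", "Gym", "Fast Food"},
    {"Internet Café", "Gym", "Games Shop", "Sports Shop", "Fast Food", "Bakery"},
]


def count(itemset):
    """Number of locations that contain every item in `itemset`."""
    return sum(1 for t in LOCATIONS if itemset <= t)


def rules_from(itemset):
    """Yield (body, head, support, confidence) for each binary partition."""
    itemset = frozenset(itemset)
    n = len(LOCATIONS)
    for k in range(1, len(itemset)):
        for body in combinations(sorted(itemset), k):
            body = frozenset(body)
            head = itemset - body
            # Support depends only on the whole itemset, so it is the same
            # for every partition; confidence depends on the body.
            yield body, head, count(itemset) / n, count(itemset) / count(body)


RULES = list(rules_from({"Fast Food", "Bakery", "Convenience Store"}))
for body, head, s, c in RULES:
    print(f"{sorted(body)} => {sorted(head)}  S={s:.2f}  C={c:.2f}")
```

Because the support is computed once per itemset, only the body counts differ between rules, which is why the supports are identical while the confidences vary.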
Mining Association Rules
• Two-step approach:
Step 1. Frequent Itemset Generation
Generate all itemsets whose support ≥ minsup
Step 2. Rule Generation
Generate high confidence rules from each frequent itemset,
where each rule is a binary partitioning of a frequent itemset
Note: Frequent itemset generation is still computationally expensive
Mining Association Rules
• Frequent Itemset Generation
[Figure: lattice graph of possible itemsets]
Mining Association Rules
• Brute-force approach:
– Each node in the lattice graphs is a candidate frequent itemset
– Count the support of each candidate by scanning the database
– N = 6 (number of locations, i.e. transactions)
– w = (Barber, Bakery, Convenience Store, Meat Shop, Fast Food, Bookstore, Petrol Pump, Library,
Carpenter, Electrician, Hardware Store, Vegetable Market, Flower Shop, Sweets Shop, Games Shop,
Hospital, Pharmacy, Sports Shop, Gym, Internet Café) = 20 unique items
– M = 2^20 = 1,048,576 candidate itemsets
– Complexity ~ O(NMw)
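The size of the brute-force search can be worked out in a few lines. The sketch below is illustrative only: lattice is a hypothetical helper that lists every subset of a small item list to confirm the 2^k node count, and N, w, M follow the symbols used on this slide.

```python
from itertools import combinations


def lattice(items):
    """All subsets of `items`, one per lattice node (the empty set included)."""
    items = list(items)
    return [
        frozenset(combo)
        for k in range(len(items) + 1)
        for combo in combinations(items, k)
    ]


# A four-item lattice is small enough to list in full.
small = ["Bakery", "Convenience Store", "Fast Food", "Barber"]
nodes = lattice(small)
print(len(nodes), 2 ** len(small))

N = 6        # locations (transactions)
w = 20       # unique business types
M = 2 ** w   # lattice nodes, i.e. candidate itemsets
print(f"M = {M:,}")
print(f"N * M * w = {N * M * w:,}")
```

Each additional item doubles M, so the cost grows exponentially in w even though it is only linear in the number of locations.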
Mining Association Rules
w = number of unique items in the item set
Mining Association Rules
• Frequent Itemset Generation
– Reduce the number of candidates (M)
– Reduce the number of transactions/locations (N)
– Reduce the number of comparisons (NM)
• Use efficient data structures to store the candidates
• No need to match every candidate against every
transaction/location
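One way to avoid matching every candidate against every location is an inverted index from each item to the ids of the locations that contain it. The slides do not name a specific data structure, so this is an assumed choice (the vertical layout used by Eclat-style miners), and build_index and support_count are hypothetical names.

```python
from collections import defaultdict

# Location id -> business types (spellings normalized by assumption).
LOCATIONS = {
    1: {"Barber", "Bakery", "Convenience Store", "Meat Shop", "Fast Food"},
    2: {"Bakery", "Bookstore", "Petrol Pump", "Convenience Store", "Library", "Fast Food"},
    3: {"Carpenter", "Electrician", "Barber", "Hardware Store"},
    4: {"Bakery", "Vegetable Market", "Flower Shop", "Sweets Shop", "Meat Shop"},
    5: {"Convenience Store", "Hospital", "Pharmacy", "Sports Shop", "Gym", "Fast Food"},
    6: {"Internet Café", "Gym", "Games Shop", "Sports Shop", "Fast Food", "Bakery"},
}


def build_index(transactions):
    """Map each item to the set of location ids that contain it."""
    index = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            index[item].add(tid)
    return index


def support_count(candidate, index):
    """Count locations holding every item of `candidate` by intersecting id sets."""
    tid_sets = [index.get(item, set()) for item in candidate]
    if not tid_sets:
        return 0
    return len(set.intersection(*tid_sets))


INDEX = build_index(LOCATIONS)
print(sorted(INDEX["Bakery"]))
print(support_count({"Bakery", "Fast Food"}, INDEX))
```

The database is scanned once to build the index; after that, a candidate's count touches only the id sets of its own items.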
Reducing the number of candidates
• Apriori principle:
– If an itemset is frequent, then all of its subsets must
also be frequent
• Important Support property:
– Support of an itemset never exceeds the support of its
subsets
– This is known as the anti-monotone property of
support
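The anti-monotone property can be checked empirically on the location data. This is a small sketch under stated assumptions: count and anti_monotone_holds are illustrative names, and only four of the business types are examined to keep the check short.

```python
from itertools import combinations

LOCATIONS = [
    {"Barber", "Bakery", "Convenience Store", "Meat Shop", "Fast Food"},
    {"Bakery", "Bookstore", "Petrol Pump", "Convenience Store", "Library", "Fast Food"},
    {"Carpenter", "Electrician", "Barber", "Hardware Store"},
    {"Bakery", "Vegetable Market", "Flower Shop", "Sweets Shop", "Meat Shop"},
    {"Convenience Store", "Hospital", "Pharmacy", "Sports Shop", "Gym", "Fast Food"},
    {"Internet Café", "Gym", "Games Shop", "Sports Shop", "Fast Food", "Bakery"},
]


def count(itemset):
    """Number of locations that contain every item in `itemset`."""
    return sum(1 for t in LOCATIONS if set(itemset) <= t)


def anti_monotone_holds(items):
    """True if no itemset occurs more often than a subset one item smaller."""
    for k in range(2, len(items) + 1):
        for superset in combinations(items, k):
            for subset in combinations(superset, k - 1):
                if count(superset) > count(subset):
                    return False
    return True


ITEMS = ["Barber", "Bakery", "Convenience Store", "Fast Food"]
print(anti_monotone_holds(ITEMS))
# An infrequent pair bounds every superset that contains it.
print(count(["Barber", "Bakery"]), count(["Barber", "Bakery", "Fast Food"]))
```

Checking only subsets that are one item smaller is enough, because the inequality chains through every intermediate size. The practical consequence is the contrapositive: once a set falls below the threshold, none of its supersets needs to be counted.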
Reducing the number of candidates
[Figure: applying the Apriori principle]
Reducing the number of candidates
• N = 20
• All Possible candidate sets;
– NC1 + NC2 + NC3 + … + NCN
• Minimum Occurrence Based Filtering
Set m= 2 and L = 1
While (L < N){
Scan DB:
List = Create Occurrence Frequency Table of candidate sets of Length L
If no candidate in List then Break;
Filter all candidate sets with Occurrence Frequency < m
Create new candidate set of Length (L=L+1) from List
}
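The loop above can be sketched in runnable form as follows. This is one possible reading of the pseudocode, not a reference implementation: the function name apriori and the parameter min_count (the slide's m) are illustrative, item spellings are normalized by assumption, and candidates are built by the simplest correct rule, keeping a set of length L only if all of its subsets of length L − 1 survived the filter.

```python
from itertools import combinations

LOCATIONS = [
    {"Barber", "Bakery", "Convenience Store", "Meat Shop", "Fast Food"},
    {"Bakery", "Bookstore", "Petrol Pump", "Convenience Store", "Library", "Fast Food"},
    {"Carpenter", "Electrician", "Barber", "Hardware Store"},
    {"Bakery", "Vegetable Market", "Flower Shop", "Sweets Shop", "Meat Shop"},
    {"Convenience Store", "Hospital", "Pharmacy", "Sports Shop", "Gym", "Fast Food"},
    {"Internet Café", "Gym", "Games Shop", "Sports Shop", "Fast Food", "Bakery"},
]


def apriori(transactions, min_count=2):
    """Return {itemset: count} for every itemset occurring >= min_count times."""
    # Length-1 candidates: every item that appears anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    length = 1
    while candidates:
        # Scan DB: occurrence frequency table for this length.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        # Filter: keep only the sets that meet the minimum occurrence.
        level = {c: n for c, n in counts.items() if n >= min_count}
        if not level:
            break
        frequent.update(level)
        length += 1
        # New candidates: sets of the next length whose every subset one
        # item smaller survived the filter (Apriori principle).
        items = sorted({i for c in level for i in c})
        candidates = {
            frozenset(combo)
            for combo in combinations(items, length)
            if all(frozenset(sub) in level for sub in combinations(combo, length - 1))
        }
    return frequent


FREQUENT = apriori(LOCATIONS, min_count=2)
for itemset, n in sorted(FREQUENT.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Generating candidates from all combinations of surviving items is easy to verify but not the fastest option; production implementations usually join pairs of frequent sets that share a common prefix instead.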
Reducing the number of candidates
Scan 1: occurrence count of each business type (candidate sets of length 1)
Business Type        Count
Barber               2
Bakery               4
Bookstore            1
Carpenter            1
Convenience Store    3
Electrician          1
Fast Food            4
Flower Shop          1
Gym                  1
Games Shop           1
Hardware Store       1
Hospital             1
Internet Café        1
Library              1
Meat Shop            1
Petrol Pump          1
Pharmacy             1
Sports Shop          1
Sweets Shop          1
Vegetable Market     1
Filter minimum occurrences (remove counts < 2) to obtain L1:
Business Type        Count
Barber               2
Bakery               4
Convenience Store    3
Fast Food            4
Pairs of two items from L1: C(4,2) = 6
Business Type                    Count
(Barber, Bakery)                 1
(Barber, Convenience Store)      1
(Barber, Fast Food)              1
(Bakery, Convenience Store)      2
(Bakery, Fast Food)              3
(Convenience Store, Fast Food)   3
Filter minimum occurrences (remove counts < 2) to obtain L2:
Business Type                    Count
(Bakery, Convenience Store)      2
(Bakery, Fast Food)              3
(Convenience Store, Fast Food)   3
