SlideShare a Scribd company logo
Mining frequent patterns and associations
What is Concept Description?
• As we know that data mining functionalities can be classified into two categories:
1. Descriptive
2. Predictive
▪ Descriptive
• This task presents the general properties of data stored in a database.
• The descriptive tasks are used to find out patterns in data.
• E.g.: Cluster, Trends, etc.
▪ Predictive
• These tasks predict the value of one attribute on the basis of values of other attributes.
• E.g.: Festival Customer/Product Sell prediction at store
What is Concept Description? Cont..
• Descriptive data mining describes the data set in a concise and
summative manner and presents interesting general properties of the
data.
• Predictive data mining analyzes the data in order to construct one or a
set of models, and attempts to predict the behavior of new data sets.
• Database is usually storing the large amounts of data in great detail.
However users often like to view sets of summarized data in concise,
descriptive terms.
• A concept usually refers to a collection of concise or summarized data
such as grade wise students, products selling on festival etc.
What is Concept Description? Cont..
• Concept description generates descriptions for characterization and
comparison of the data, it is also called class description.
• Characterization provides a concise and brief summarization of the data.
• While concept or class comparison (also known as discrimination)
provides discriminations (inequity) comparing two or more collections of
data.
• Example
• Given the ABC Company database, for example, examining individual customer
transactions.
• Sales managers may prefer to view the data generalized to higher levels, such as
summarized by customer groups according to geographic regions, frequency of
purchases per group and customer income.
The Problem
When we go grocery shopping, we often have a standard list of things to buy. Each shopper
has a distinctive list, depending on one’s needs and preferences. A housewife might buy
healthy ingredients for a family dinner, while a bachelor might buy beer and chips.
Understanding these buying patterns can help to increase sales in several ways. If there is a
pair of items, X and Y, that are frequently bought together:
Both X and Y can be placed on the same shelf, so that buyers of one item would be
prompted to buy the other.
Promotional discounts could be applied to just one out of the two items.
Advertisements on X could be targeted at buyers who purchase Y.
X and Y could be combined into a new product, such as having Y in flavors of X.
While we may know that certain items are frequently bought together, the question
is, how do we uncover these associations?
Besides increasing sales profits, association rules can also be used in other fields.
In medical diagnosis for instance, understanding which symptoms tend to co-
morbid can help to improve patient care and medicine prescription.
Association rules analysis is a technique to uncover how items are associated to
each other. There are three common ways to measure association.
Market Basket Analysis
• Market Basket Analysis is a modelling technique to find frequent itemset.
• It is based on, if you buy a certain group of items, you are more (or less) likely to buy
another group of items.
• For example, if you are in a store and you buy a car then you are more likely to buy
insurance at the same time than somebody who don't buy insurance also.
• The set of items that a customer buys it referred as an itemset.
• Market basket analysis seeks to find relationships between purchases (Items).
• E.g. IF {Car, Accessories} THEN {Insurance}
{Car, Accessories} 🡪 {Insurance}
The probability that a customer will buy
car without an accessories is referred as
the support for rule.
The conditional probability that a
customer will purchase Insurance is
referred to as the confidence.
Association Rule Mining
• Given a set of transactions, we need rules that will predict the
occurrence of an item based on the occurrences of other items in the
transaction.
• Market-Basket transactions
TI
D
Items
1 Bread, Milk
2 Bread, Chocolate, Pepsi, Eggs
3 Milk, Chocolate, Pepsi, Coke
4 Bread, Milk, Chocolate, Pepsi
5 Bread, Milk, Chocolate, Coke
Example of Association
Rules
{Chocolate} → {Pepsi},
{Milk, Bread} → {Eggs, Coke},
{Pepsi, Bread} → {Milk}
Association Rule Mining Cont..
Itemset
• A collection of one or more items
o E.g. : {Milk, Bread, Chocolate}
• k-itemset
An itemset that contains k items
Support count (σ)
• Frequency of occurrence of an itemset
o E.g. σ({Milk, Bread, Chocolate}) = 2
Support
• Fraction of transactions that contain an itemset
o E.g. s({Milk, Bread, Chocolate}) = 2/5
Frequent Itemset
• An itemset whose support is greater than or equal to a
minimum support threshold
• Lift(l) –
The lift of the rule X=>Y is the confidence of the rule divided by the expected confidence,
assuming that the itemsets X and Y are independent of each other.The expected confidence is
the confidence divided by the frequency of {Y}.
TI
D
Items
1 Bread, Milk
2 Bread, Chocolate, Pepsi, Eggs
3 Milk, Chocolate, Pepsi, Coke
4 Bread, Milk, Chocolate, Pepsi
5 Bread, Milk, Chocolate, Coke
Association Rule Mining Cont..
• Association Rule
• An implication expression of the form X → Y, where X and Y are
item sets
• E.g.: {Milk, Chocolate} → {Pepsi}
• Rule Evaluation
• Support (s)
• Fraction of transactions that contain both X and Y
• Confidence (c)
• Measures how often items in Y appear in transactions that contain X
TI
D
Items
1 Bread, Milk
2 Bread, Chocolate, Pepsi, Eggs
3 Milk, Chocolate, Pepsi, Coke
4 Bread, Milk, Chocolate, Pepsi
5 Bread, Milk, Chocolate, Coke
Example:
Find support & confidence for {Milk, Chocolate} ⇒ Pepsi
Association Rule Mining Cont..
TI
D
Items
1 Bread, Milk
2 Bread, Chocolate, Pepsi, Eggs
3 Milk, Chocolate, Pepsi, Coke
4 Bread, Milk, Chocolate, Pepsi
5 Bread, Milk, Chocolate, Coke
Calculate Support &
Confidence
Support (s) : 0.4
1. {Milk, Chocolate} → {Pepsi} c = 0.67
2. {Milk, Pepsi} → {Chocolate} c = 1.0
3. {Chocolate, Pepsi} → {Milk} c = 0.67
4. {Pepsi} → {Milk, Chocolate} c = 0.67
5. {Chocolate} → {Milk, Pepsi} c = 0.5
6. {Milk} → {Chocolate, Pepsi} c = 0.5
A common strategy adopted by many association rule mining algorithms is to decompose the problem into
two major subtasks:
1. Frequent Itemset Generation
• The objective is to find all the item-sets that satisfy the minimum support threshold.
• These item sets are called frequent item sets.
2. Rule Generation
• The objective is to extract all the high-confidence rules from the frequent item sets found in the
previous step.
• These rules are called strong rules.
Apriori Algorithm - Example
Scan
D
C
1
L
1
C
2
Scan
D
C
2
L
2
Minimum Support =
2
ItemS
et
Min.
Sup
{1} 2
{2} 3
{3} 3
{4} 1
{5} 3
ItemS
et
Min.
Sup
{1} 2
{2} 3
{3} 3
{5} 3
ItemS
et
{1 2}
{1 3}
{1 5}
{2 3}
{2 5}
{3 5}
ItemS
et
Min.
Sup
{1 2} 1
{1 3} 2
{1 5} 1
{2 3} 2
{2 5} 3
{3 5} 2
ItemS
et
Min.
Sup
{1 3} 2
{2 3} 2
{2 5} 3
{3 5} 2
Scan
D
TID Items
100 1 3 4
200 2 3 5
300 1 2 3 5
400 2 5
ItemS
et
Min.
Sup
{1 2 3} 1
{1 2 5} 1
{1 3 5} 1
{2 3 5} 2
Apriori Algorithm - Example Cont..
Rules Generation
Minimum Support =
2
Association
Rule
Support Confidenc
e
Confidence (%)
2 ^ 3 🡪 5 2 2/2 = 1 100 %
3 ^ 5 🡪 2 2 2/2 = 1 100 %
2 ^ 5 🡪 3 2 2/3 = 0.66 66%
2 = 3 ^ 5 2 2/3 = 0.66 66%
3 = 2 ^ 5 2 2/3 = 0.66 66%
5 = 2 ^ 3 2 2/3 = 0.66 66%
Apriori Algorithm
• Purpose: The Apriori Algorithm is an influential algorithm for mining
frequent itemsets for Boolean association rules.
• Key Concepts:
• Frequent Itemsets:
• The sets of item which has minimum support (denoted by Li for ith-Itemset).
• Apriori Property:
• Any subset of frequent itemset must be frequent.
• Join Operation:
• To find Lk, a set of candidate k-itemsets is generated by joining Lk-1 itself.
Apriori Algorithm Cont..
Find the frequent itemsets
• The sets of items that have minimum support and a subset of a frequent itemset must also be a
frequent itemset (Apriori Property).
• E.g. if {AB} is a frequent itemset, both {A} and {B} should be a frequent itemset.
• Use the frequent item sets to generate association rules.
The Apriori Algorithm : Pseudo code
• Join Step: Ck is generated by joining Lk-1with itself
• Prune Step: Any (k-1) itemset that is not frequent cannot be a subset of a frequent k-itemset
Ck: Candidate itemset of size k
Lk: Frequent itemset of size k
L1= {frequent items};
for (k = 1; Lk != ∅; k++) do begin
Ck+1 = candidates generated from Lk;
for each transaction t in database do
Increment the count of all candidates in Ck+1
That are contained in t
Lk+1 = candidates in Ck+1 with min_support
end
return ∪k Lk;
Apriori Algorithm Steps
Step 1:
• Start with itemsets containing just a single item (Individual items).
Step 2:
• Determine the support for itemsets.
• Keep the itemsets that meet your minimum support threshold and remove
itemsets that do not support minimum support.
Step 3:
• Using the itemsets you have kept from Step 1, generate all the possible itemset
combinations.
Step 4:
• Repeat steps 1 & 2 until there are no more new itemsets.
Apriori Algorithm (Try Yourself!)
A database has 4 transactions. Let Min_sup = 50% and Min_conf =
75%
Sr. Association Rule Support Confidence Confidence (%)
Rule 1 Butter^Milk 🡪 Bread 2 2/2 = 1 100%
Rule 2 Milk^Bread 🡪 Butter 2 2/2 = 1 100%
Rule 3 Butter^Bread 🡪 Milk 2 2/3 = 0.66 66%
Rule 4 Butter🡪Milk^Bread 2 2/3 = 0.66 66%
Rule 5 Milk🡪Butter^Bread 2 2/3 = 0.66 66%
Rule 6 Bread🡪Butter^Milk 2 2/3 = 0.66 66%
TID Items
1000 Cheese, Milk, Cookies
2000 Butter, Milk, Bread
3000 Cheese, Butter, Milk, Bread
4000 Butter, Bread
Frequent Itemset Sup
Butter,Milk,Bread 2
Min_sup = 50%
How to convert support in
integer?
Given % X Total Records
100
So here, 50 X 4
100
=
2
FP-Growth Algorithm
The FP-Growth Algorithm is proposed by Han.
It is an efficient and scalable method for mining the complete set of
frequent patterns.
Using prefix-tree structure for storing information about frequent
patterns named frequent-pattern tree (FP-tree).
Once an FP-tree has been constructed, it uses a recursive divide-and-
conquer approach to mine the frequent item sets.
Arranged
Order
A : 8
C : 8
E : 8
G : 5
B : 2
D : 2
F : 2
Remaining
all
O,I,J,L,P,M
& N is with
min_sup =
1
FP-Growth Algorithm - Example
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
Step:1
Freq. 1-
Itemsets.
Min_Sup ≥ 2
Step:2
Transactions with items sorted
based on frequencies, and ignoring
the infrequent items.
FP-Tree Generation
Building the FP-Tree
✔ Scan data to determine the
support count of each item.
✔ Infrequent items are
discarded, while the frequent
items are sorted in
decreasing support counts.
✔ Make a second pass over the
data to construct the FP-tree.
✔ As the transactions are read,
before being processed, their
items are sorted according to
the above order.
TID Items
1 A B C E F
O
2 A C G
3 E I
4 A C D E G
5 A C E G L
6 E J
7 A B C E F
P
8 A C D
9 A C E G M
10 A C E G N
Minimum Support =
2
FP-Tree after reading 1st transaction
A:8
C:8
E:8
G:5
B:2
D:2
F:2
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
nul
l
A:1
C:1
E:1
B:1
F:1
Header
Prof. Naimish R Vadodariya 24
#3160714 (DM) ⬥ Unit 3 – Concept Description, Mining Frequent Patterns, Associations & Correlations
FP-Tree after reading 2nd transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:2
C:2
E:1
B:1
F:1
Header
FP-Tree after reading 3rd transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:2
C:2
E:1
B:1
F:1
Header
E:1
FP-Tree after reading 4th transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:3
C:3
E:2
B:1
F:1
Header
E:1
G:1
D:1
FP-Tree after reading 5th transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:4
C:4
E:3
B:1
F:1
Header
E:1
G:2
D:1
FP-Tree after reading 6th transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:4
C:4
E:3
B:1
F:1
Header
E:2
G:2
D:1
FP-Tree after reading 7th transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:5
C:5
E:4
B:2
F:2
Header
E:2
G:2
D:1
FP-Tree after reading 8th transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:6
C:6
E:4
B:2
F:2
Header
E:2
G:2
D:1
D:1
FP-Tree after reading 9th transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:7
C:7
E:5
B:2
F:2
Header
E:2
G:3
D:1
D:1
FP-Tree after reading 10th transaction
A C E B F
A C G
E
A C E G D
A C E G
E
A C E B F
A C D
A C E G
A C E G
G:1
A:8
C:8
E:8
G:5
B:2
D:2
F:2
nul
l
A:8
C:8
E:6
B:2
F:2
Header
E:2
G:4
D:1
D:1
Apriori Algorithm – Example
Min-support = 60% and Confidence
= 80%
TID Items
100 1,3,4,6
200 2,3,5,7
300 1,2,3,5,8
400 2,5,9,10
500 1,4
Itemse
t
Support
2,5 3
Rule Support Confidence Conf. (%)
2 🡪 5 3 3/3 = 1 100%
5 🡪 2 3 3/3 = 1 100%
TID Items
1000 A,B,C
2000 A,C
3000 A,D
5000 B,E,F
Itemse
t
Support
A,C 2
Rule Support Confidence Conf. (%)
A 🡪 C 2 2/3 = 0.66 66%
C 🡪 A 2 2/2 = 1 100%
Min-support =
50%
FP-Growth Algorithm – Example
B:
8
A:
5
nul
l
C:
3
D:
1
A:
2
C:
1
D:
1
E:1
D:
1
E:1
C:
3
D:
1
D:
1
E:1
Header
FP-Tree
Minimum Support =
3
TID Items
1 A B
2 B C D
3 A C D E
4 A D E
5 A B C
6 A B C D
7 B C
8 A B C
9 A B D
10 B C E
Item Support
B 8
A 7
C 7
D 5
E 3

More Related Content

What's hot

Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
Simplilearn
 
Searching algorithms
Searching algorithmsSearching algorithms
Searching algorithms
Trupti Agrawal
 
Association rule mining
Association rule miningAssociation rule mining
Association rule mining
Utkarsh Sharma
 
Market Basket Analysis of bakery Shop
Market Basket Analysis of bakery ShopMarket Basket Analysis of bakery Shop
Market Basket Analysis of bakery Shop
VarunSahdev2
 
Apriori algorithm
Apriori algorithmApriori algorithm
Apriori algorithm
Mainul Hassan
 
Unit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptxUnit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptx
Dr.Shweta
 
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
Edureka!
 
Searching
SearchingSearching
Searching
Ashim Lamichhane
 
Market basket analysis using apriori algorithm on
Market basket analysis using apriori algorithm onMarket basket analysis using apriori algorithm on
Market basket analysis using apriori algorithm on
Madhu Kiran
 
Presentation on unsupervised learning
Presentation on unsupervised learning Presentation on unsupervised learning
Presentation on unsupervised learning
ANKUSH PAL
 
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
Simplilearn
 
Apriori
AprioriApriori
Binary Search - Design & Analysis of Algorithms
Binary Search - Design & Analysis of AlgorithmsBinary Search - Design & Analysis of Algorithms
Binary Search - Design & Analysis of Algorithms
Drishti Bhalla
 
data generalization and summarization
data generalization and summarization data generalization and summarization
data generalization and summarization
janani thirupathi
 
Data mining: Classification and prediction
Data mining: Classification and predictionData mining: Classification and prediction
Data mining: Classification and prediction
DataminingTools Inc
 
Linear and Binary search
Linear and Binary searchLinear and Binary search
Linear and Binary search
Nisha Soms
 
CS 402 DATAMINING AND WAREHOUSING -MODULE 3
CS 402 DATAMINING AND WAREHOUSING -MODULE 3CS 402 DATAMINING AND WAREHOUSING -MODULE 3
CS 402 DATAMINING AND WAREHOUSING -MODULE 3
NIMMYRAJU
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
Anuj Gupta
 
Decision tree
Decision treeDecision tree
Decision tree
Ami_Surati
 

What's hot (20)

kmean clustering
kmean clusteringkmean clustering
kmean clustering
 
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
 
Searching algorithms
Searching algorithmsSearching algorithms
Searching algorithms
 
Association rule mining
Association rule miningAssociation rule mining
Association rule mining
 
Market Basket Analysis of bakery Shop
Market Basket Analysis of bakery ShopMarket Basket Analysis of bakery Shop
Market Basket Analysis of bakery Shop
 
Apriori algorithm
Apriori algorithmApriori algorithm
Apriori algorithm
 
Unit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptxUnit 2 unsupervised learning.pptx
Unit 2 unsupervised learning.pptx
 
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
 
Searching
SearchingSearching
Searching
 
Market basket analysis using apriori algorithm on
Market basket analysis using apriori algorithm onMarket basket analysis using apriori algorithm on
Market basket analysis using apriori algorithm on
 
Presentation on unsupervised learning
Presentation on unsupervised learning Presentation on unsupervised learning
Presentation on unsupervised learning
 
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
 
Apriori
AprioriApriori
Apriori
 
Binary Search - Design & Analysis of Algorithms
Binary Search - Design & Analysis of AlgorithmsBinary Search - Design & Analysis of Algorithms
Binary Search - Design & Analysis of Algorithms
 
data generalization and summarization
data generalization and summarization data generalization and summarization
data generalization and summarization
 
Data mining: Classification and prediction
Data mining: Classification and predictionData mining: Classification and prediction
Data mining: Classification and prediction
 
Linear and Binary search
Linear and Binary searchLinear and Binary search
Linear and Binary search
 
CS 402 DATAMINING AND WAREHOUSING -MODULE 3
CS 402 DATAMINING AND WAREHOUSING -MODULE 3CS 402 DATAMINING AND WAREHOUSING -MODULE 3
CS 402 DATAMINING AND WAREHOUSING -MODULE 3
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
 
Decision tree
Decision treeDecision tree
Decision tree
 

Similar to MODULE 5 _ Mining frequent patterns and associations.pptx

apriori.pptx
apriori.pptxapriori.pptx
apriori.pptx
selvifitria1
 
Lect6 Association rule & Apriori algorithm
Lect6 Association rule & Apriori algorithmLect6 Association rule & Apriori algorithm
Lect6 Association rule & Apriori algorithm
hktripathy
 
2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules
FEG
 
AssociationRule.pdf
AssociationRule.pdfAssociationRule.pdf
AssociationRule.pdf
WailaBaba
 
Unit 4_ML.pptx
Unit 4_ML.pptxUnit 4_ML.pptx
Unit 4_ML.pptx
SmithaRaj16
 
6. Association Rule.pdf
6. Association Rule.pdf6. Association Rule.pdf
6. Association Rule.pdf
Jyoti Yadav
 
Association Analysis in Data Mining
Association Analysis in Data MiningAssociation Analysis in Data Mining
Association Analysis in Data Mining
Kamal Acharya
 
Rules of data mining
Rules of data miningRules of data mining
Rules of data mining
Sulman Ahmed
 
BIM Data Mining Unit4 by Tekendra Nath Yogi
 BIM Data Mining Unit4 by Tekendra Nath Yogi BIM Data Mining Unit4 by Tekendra Nath Yogi
BIM Data Mining Unit4 by Tekendra Nath Yogi
Tekendra Nath Yogi
 
Association 04.03.14
Association   04.03.14Association   04.03.14
Association 04.03.14
rahulmath80
 
Mining Frequent Patterns And Association Rules
Mining Frequent Patterns And Association RulesMining Frequent Patterns And Association Rules
Mining Frequent Patterns And Association Rules
Rashmi Bhat
 
big data seminar.pptx
big data seminar.pptxbig data seminar.pptx
big data seminar.pptx
AmenahAbbood
 
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
Smarten Augmented Analytics
 
Data mining techniques unit III
Data mining techniques unit IIIData mining techniques unit III
Data mining techniques unit III
malathieswaran29
 
Association rule mining and Apriori algorithm
Association rule mining and Apriori algorithmAssociation rule mining and Apriori algorithm
Association rule mining and Apriori algorithm
hina firdaus
 
Association Rule Mining
Association Rule MiningAssociation Rule Mining
Association Rule Mining
PALLAB DAS
 
Market Basket Analysis
Market Basket AnalysisMarket Basket Analysis
Market Basket Analysis
Sandeep Prasad
 
DM -Unit 2-PPT.ppt
DM -Unit 2-PPT.pptDM -Unit 2-PPT.ppt
DM -Unit 2-PPT.ppt
raju980973
 
Data Mining Concepts 15061
Data Mining Concepts 15061Data Mining Concepts 15061
Data Mining Concepts 15061badirh
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
dataminers.ir
 

Similar to MODULE 5 _ Mining frequent patterns and associations.pptx (20)

apriori.pptx
apriori.pptxapriori.pptx
apriori.pptx
 
Lect6 Association rule & Apriori algorithm
Lect6 Association rule & Apriori algorithmLect6 Association rule & Apriori algorithm
Lect6 Association rule & Apriori algorithm
 
2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules
 
AssociationRule.pdf
AssociationRule.pdfAssociationRule.pdf
AssociationRule.pdf
 
Unit 4_ML.pptx
Unit 4_ML.pptxUnit 4_ML.pptx
Unit 4_ML.pptx
 
6. Association Rule.pdf
6. Association Rule.pdf6. Association Rule.pdf
6. Association Rule.pdf
 
Association Analysis in Data Mining
Association Analysis in Data MiningAssociation Analysis in Data Mining
Association Analysis in Data Mining
 
Rules of data mining
Rules of data miningRules of data mining
Rules of data mining
 
BIM Data Mining Unit4 by Tekendra Nath Yogi
 BIM Data Mining Unit4 by Tekendra Nath Yogi BIM Data Mining Unit4 by Tekendra Nath Yogi
BIM Data Mining Unit4 by Tekendra Nath Yogi
 
Association 04.03.14
Association   04.03.14Association   04.03.14
Association 04.03.14
 
Mining Frequent Patterns And Association Rules
Mining Frequent Patterns And Association RulesMining Frequent Patterns And Association Rules
Mining Frequent Patterns And Association Rules
 
big data seminar.pptx
big data seminar.pptxbig data seminar.pptx
big data seminar.pptx
 
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
 
Data mining techniques unit III
Data mining techniques unit IIIData mining techniques unit III
Data mining techniques unit III
 
Association rule mining and Apriori algorithm
Association rule mining and Apriori algorithmAssociation rule mining and Apriori algorithm
Association rule mining and Apriori algorithm
 
Association Rule Mining
Association Rule MiningAssociation Rule Mining
Association Rule Mining
 
Market Basket Analysis
Market Basket AnalysisMarket Basket Analysis
Market Basket Analysis
 
DM -Unit 2-PPT.ppt
DM -Unit 2-PPT.pptDM -Unit 2-PPT.ppt
DM -Unit 2-PPT.ppt
 
Data Mining Concepts 15061
Data Mining Concepts 15061Data Mining Concepts 15061
Data Mining Concepts 15061
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 

More from nikshaikh786

Module 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptxModule 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptx
nikshaikh786
 
Module 1_ Introduction.pptx
Module 1_ Introduction.pptxModule 1_ Introduction.pptx
Module 1_ Introduction.pptx
nikshaikh786
 
Module 1_ Introduction to Mobile Computing.pptx
Module 1_  Introduction to Mobile Computing.pptxModule 1_  Introduction to Mobile Computing.pptx
Module 1_ Introduction to Mobile Computing.pptx
nikshaikh786
 
Module 2_ GSM Mobile services.pptx
Module 2_  GSM Mobile services.pptxModule 2_  GSM Mobile services.pptx
Module 2_ GSM Mobile services.pptx
nikshaikh786
 
Module 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptxModule 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptx
nikshaikh786
 
Module 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptxModule 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptx
nikshaikh786
 
MODULE 5- EDA.pptx
MODULE 5- EDA.pptxMODULE 5- EDA.pptx
MODULE 5- EDA.pptx
nikshaikh786
 
MODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptxMODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptx
nikshaikh786
 
Module 3 - Time Series.pptx
Module 3 - Time Series.pptxModule 3 - Time Series.pptx
Module 3 - Time Series.pptx
nikshaikh786
 
Module 2_ Regression Models..pptx
Module 2_ Regression Models..pptxModule 2_ Regression Models..pptx
Module 2_ Regression Models..pptx
nikshaikh786
 
MODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptxMODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptx
nikshaikh786
 
IOE MODULE 6.pptx
IOE MODULE 6.pptxIOE MODULE 6.pptx
IOE MODULE 6.pptx
nikshaikh786
 
MAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdfMAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdf
nikshaikh786
 
VIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdfVIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdf
nikshaikh786
 
IOE MODULE 5.pptx
IOE MODULE 5.pptxIOE MODULE 5.pptx
IOE MODULE 5.pptx
nikshaikh786
 
Mad&pwa practical no. 1
Mad&pwa practical no. 1Mad&pwa practical no. 1
Mad&pwa practical no. 1
nikshaikh786
 
Ioe module 4
Ioe module 4Ioe module 4
Ioe module 4
nikshaikh786
 
Ioe module 3
Ioe module 3Ioe module 3
Ioe module 3
nikshaikh786
 
Ioe module 2
Ioe module 2Ioe module 2
Ioe module 2
nikshaikh786
 
Ioe module 1
Ioe module 1Ioe module 1
Ioe module 1
nikshaikh786
 

More from nikshaikh786 (20)

Module 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptxModule 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptx
 
Module 1_ Introduction.pptx
Module 1_ Introduction.pptxModule 1_ Introduction.pptx
Module 1_ Introduction.pptx
 
Module 1_ Introduction to Mobile Computing.pptx
Module 1_  Introduction to Mobile Computing.pptxModule 1_  Introduction to Mobile Computing.pptx
Module 1_ Introduction to Mobile Computing.pptx
 
Module 2_ GSM Mobile services.pptx
Module 2_  GSM Mobile services.pptxModule 2_  GSM Mobile services.pptx
Module 2_ GSM Mobile services.pptx
 
Module 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptxModule 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptx
 
Module 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptxModule 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptx
 
MODULE 5- EDA.pptx
MODULE 5- EDA.pptxMODULE 5- EDA.pptx
MODULE 5- EDA.pptx
 
MODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptxMODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptx
 
Module 3 - Time Series.pptx
Module 3 - Time Series.pptxModule 3 - Time Series.pptx
Module 3 - Time Series.pptx
 
Module 2_ Regression Models..pptx
Module 2_ Regression Models..pptxModule 2_ Regression Models..pptx
Module 2_ Regression Models..pptx
 
MODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptxMODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptx
 
IOE MODULE 6.pptx
IOE MODULE 6.pptxIOE MODULE 6.pptx
IOE MODULE 6.pptx
 
MAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdfMAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdf
 
VIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdfVIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdf
 
IOE MODULE 5.pptx
IOE MODULE 5.pptxIOE MODULE 5.pptx
IOE MODULE 5.pptx
 
Mad&pwa practical no. 1
Mad&pwa practical no. 1Mad&pwa practical no. 1
Mad&pwa practical no. 1
 
Ioe module 4
Ioe module 4Ioe module 4
Ioe module 4
 
Ioe module 3
Ioe module 3Ioe module 3
Ioe module 3
 
Ioe module 2
Ioe module 2Ioe module 2
Ioe module 2
 
Ioe module 1
Ioe module 1Ioe module 1
Ioe module 1
 

Recently uploaded

Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
MuhammadTufail242431
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
AafreenAbuthahir2
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
Pratik Pawar
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
Pipe Restoration Solutions
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
addressing modes in computer architecture
addressing modes  in computer architectureaddressing modes  in computer architecture
addressing modes in computer architecture
ShahidSultan24
 
Automobile Management System Project Report.pdf
Automobile Management System Project Report.pdfAutomobile Management System Project Report.pdf
Automobile Management System Project Report.pdf
Kamal Acharya
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
Vaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdfVaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdf
Kamal Acharya
 

Recently uploaded (20)

Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
addressing modes in computer architecture
addressing modes  in computer architectureaddressing modes  in computer architecture
addressing modes in computer architecture
 
Automobile Management System Project Report.pdf
Automobile Management System Project Report.pdfAutomobile Management System Project Report.pdf
Automobile Management System Project Report.pdf
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
Vaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdfVaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdf
 

MODULE 5 _ Mining frequent patterns and associations.pptx

  • 1. Mining frequent patterns and associations
  • 2. What is Concept Description? • As we know that data mining functionalities can be classified into two categories: 1. Descriptive 2. Predictive ▪ Descriptive • This task presents the general properties of data stored in a database. • The descriptive tasks are used to find out patterns in data. • E.g.: Cluster, Trends, etc. ▪ Predictive • These tasks predict the value of one attribute on the basis of values of other attributes. • E.g.: Festival Customer/Product Sell prediction at store
  • 3. What is Concept Description? Cont.. • Descriptive data mining describes the data set in a concise and summative manner and presents interesting general properties of the data. • Predictive data mining analyzes the data in order to construct one or a set of models, and attempts to predict the behavior of new data sets. • Database is usually storing the large amounts of data in great detail. However users often like to view sets of summarized data in concise, descriptive terms. • A concept usually refers to a collection of concise or summarized data such as grade wise students, products selling on festival etc.
  • 4. What is Concept Description? Cont.. • Concept description generates descriptions for characterization and comparison of the data, it is also called class description. • Characterization provides a concise and brief summarization of the data. • While concept or class comparison (also known as discrimination) provides discriminations (inequity) comparing two or more collections of data. • Example • Given the ABC Company database, for example, examining individual customer transactions. • Sales managers may prefer to view the data generalized to higher levels, such as summarized by customer groups according to geographic regions, frequency of purchases per group and customer income.
  • 5. The Problem When we go grocery shopping, we often have a standard list of things to buy. Each shopper has a distinctive list, depending on one’s needs and preferences. A housewife might buy healthy ingredients for a family dinner, while a bachelor might buy beer and chips. Understanding these buying patterns can help to increase sales in several ways. If there is a pair of items, X and Y, that are frequently bought together: Both X and Y can be placed on the same shelf, so that buyers of one item would be prompted to buy the other. Promotional discounts could be applied to just one out of the two items. Advertisements on X could be targeted at buyers who purchase Y. X and Y could be combined into a new product, such as having Y in flavors of X.
  • 6. While we may know that certain items are frequently bought together, the question is, how do we uncover these associations? Besides increasing sales profits, association rules can also be used in other fields. In medical diagnosis for instance, understanding which symptoms tend to co- morbid can help to improve patient care and medicine prescription. Association rules analysis is a technique to uncover how items are associated to each other. There are three common ways to measure association.
  • 7.
  • 8.
  • 9.
  • 10. Market Basket Analysis • Market Basket Analysis is a modelling technique to find frequent itemset. • It is based on, if you buy a certain group of items, you are more (or less) likely to buy another group of items. • For example, if you are in a store and you buy a car then you are more likely to buy insurance at the same time than somebody who don't buy insurance also. • The set of items that a customer buys it referred as an itemset. • Market basket analysis seeks to find relationships between purchases (Items). • E.g. IF {Car, Accessories} THEN {Insurance} {Car, Accessories} 🡪 {Insurance} The probability that a customer will buy car without an accessories is referred as the support for rule. The conditional probability that a customer will purchase Insurance is referred to as the confidence.
  • 11. Association Rule Mining • Given a set of transactions, we need rules that will predict the occurrence of an item based on the occurrences of other items in the transaction. • Market-Basket transactions TI D Items 1 Bread, Milk 2 Bread, Chocolate, Pepsi, Eggs 3 Milk, Chocolate, Pepsi, Coke 4 Bread, Milk, Chocolate, Pepsi 5 Bread, Milk, Chocolate, Coke Example of Association Rules {Chocolate} → {Pepsi}, {Milk, Bread} → {Eggs, Coke}, {Pepsi, Bread} → {Milk}
  • 12. Association Rule Mining Cont.. Itemset • A collection of one or more items o E.g. : {Milk, Bread, Chocolate} • k-itemset An itemset that contains k items Support count (σ) • Frequency of occurrence of an itemset o E.g. σ({Milk, Bread, Chocolate}) = 2 Support • Fraction of transactions that contain an itemset o E.g. s({Milk, Bread, Chocolate}) = 2/5 Frequent Itemset • An itemset whose support is greater than or equal to a minimum support threshold • Lift(l) – The lift of the rule X=>Y is the confidence of the rule divided by the expected confidence, assuming that the itemsets X and Y are independent of each other.The expected confidence is the confidence divided by the frequency of {Y}. TI D Items 1 Bread, Milk 2 Bread, Chocolate, Pepsi, Eggs 3 Milk, Chocolate, Pepsi, Coke 4 Bread, Milk, Chocolate, Pepsi 5 Bread, Milk, Chocolate, Coke
  • 13. Association Rule Mining Cont.. • Association Rule • An implication expression of the form X → Y, where X and Y are item sets • E.g.: {Milk, Chocolate} → {Pepsi} • Rule Evaluation • Support (s) • Fraction of transactions that contain both X and Y • Confidence (c) • Measures how often items in Y appear in transactions that contain X TI D Items 1 Bread, Milk 2 Bread, Chocolate, Pepsi, Eggs 3 Milk, Chocolate, Pepsi, Coke 4 Bread, Milk, Chocolate, Pepsi 5 Bread, Milk, Chocolate, Coke Example: Find support & confidence for {Milk, Chocolate} ⇒ Pepsi
  • 14. Association Rule Mining Cont.. TI D Items 1 Bread, Milk 2 Bread, Chocolate, Pepsi, Eggs 3 Milk, Chocolate, Pepsi, Coke 4 Bread, Milk, Chocolate, Pepsi 5 Bread, Milk, Chocolate, Coke Calculate Support & Confidence Support (s) : 0.4 1. {Milk, Chocolate} → {Pepsi} c = 0.67 2. {Milk, Pepsi} → {Chocolate} c = 1.0 3. {Chocolate, Pepsi} → {Milk} c = 0.67 4. {Pepsi} → {Milk, Chocolate} c = 0.67 5. {Chocolate} → {Milk, Pepsi} c = 0.5 6. {Milk} → {Chocolate, Pepsi} c = 0.5 A common strategy adopted by many association rule mining algorithms is to decompose the problem into two major subtasks: 1. Frequent Itemset Generation • The objective is to find all the item-sets that satisfy the minimum support threshold. • These item sets are called frequent item sets. 2. Rule Generation • The objective is to extract all the high-confidence rules from the frequent item sets found in the previous step. • These rules are called strong rules.
  • 15. Apriori Algorithm - Example Scan D C 1 L 1 C 2 Scan D C 2 L 2 Minimum Support = 2 ItemS et Min. Sup {1} 2 {2} 3 {3} 3 {4} 1 {5} 3 ItemS et Min. Sup {1} 2 {2} 3 {3} 3 {5} 3 ItemS et {1 2} {1 3} {1 5} {2 3} {2 5} {3 5} ItemS et Min. Sup {1 2} 1 {1 3} 2 {1 5} 1 {2 3} 2 {2 5} 3 {3 5} 2 ItemS et Min. Sup {1 3} 2 {2 3} 2 {2 5} 3 {3 5} 2 Scan D TID Items 100 1 3 4 200 2 3 5 300 1 2 3 5 400 2 5 ItemS et Min. Sup {1 2 3} 1 {1 2 5} 1 {1 3 5} 1 {2 3 5} 2
  • 16. Apriori Algorithm - Example Cont.. Rules Generation Minimum Support = 2 Association Rule Support Confidenc e Confidence (%) 2 ^ 3 🡪 5 2 2/2 = 1 100 % 3 ^ 5 🡪 2 2 2/2 = 1 100 % 2 ^ 5 🡪 3 2 2/3 = 0.66 66% 2 = 3 ^ 5 2 2/3 = 0.66 66% 3 = 2 ^ 5 2 2/3 = 0.66 66% 5 = 2 ^ 3 2 2/3 = 0.66 66%
  • 17. Apriori Algorithm • Purpose: The Apriori Algorithm is an influential algorithm for mining frequent itemsets for Boolean association rules. • Key Concepts: • Frequent Itemsets: • The sets of item which has minimum support (denoted by Li for ith-Itemset). • Apriori Property: • Any subset of frequent itemset must be frequent. • Join Operation: • To find Lk, a set of candidate k-itemsets is generated by joining Lk-1 itself.
  • 18. Apriori Algorithm Cont.. Find the frequent itemsets • The sets of items that have minimum support and a subset of a frequent itemset must also be a frequent itemset (Apriori Property). • E.g. if {AB} is a frequent itemset, both {A} and {B} should be a frequent itemset. • Use the frequent item sets to generate association rules. The Apriori Algorithm : Pseudo code • Join Step: Ck is generated by joining Lk-1with itself • Prune Step: Any (k-1) itemset that is not frequent cannot be a subset of a frequent k-itemset Ck: Candidate itemset of size k Lk: Frequent itemset of size k L1= {frequent items}; for (k = 1; Lk != ∅; k++) do begin Ck+1 = candidates generated from Lk; for each transaction t in database do Increment the count of all candidates in Ck+1 That are contained in t Lk+1 = candidates in Ck+1 with min_support end return ∪k Lk;
  • 19. Apriori Algorithm Steps Step 1: • Start with itemsets containing just a single item (Individual items). Step 2: • Determine the support for itemsets. • Keep the itemsets that meet your minimum support threshold and remove itemsets that do not support minimum support. Step 3: • Using the itemsets you have kept from Step 1, generate all the possible itemset combinations. Step 4: • Repeat steps 1 & 2 until there are no more new itemsets.
  • 20. Apriori Algorithm (Try Yourself!) A database has 4 transactions. Let Min_sup = 50% and Min_conf = 75% Sr. Association Rule Support Confidence Confidence (%) Rule 1 Butter^Milk 🡪 Bread 2 2/2 = 1 100% Rule 2 Milk^Bread 🡪 Butter 2 2/2 = 1 100% Rule 3 Butter^Bread 🡪 Milk 2 2/3 = 0.66 66% Rule 4 Butter🡪Milk^Bread 2 2/3 = 0.66 66% Rule 5 Milk🡪Butter^Bread 2 2/3 = 0.66 66% Rule 6 Bread🡪Butter^Milk 2 2/3 = 0.66 66% TID Items 1000 Cheese, Milk, Cookies 2000 Butter, Milk, Bread 3000 Cheese, Butter, Milk, Bread 4000 Butter, Bread Frequent Itemset Sup Butter,Milk,Bread 2 Min_sup = 50% How to convert support in integer? Given % X Total Records 100 So here, 50 X 4 100 = 2
  • 21. FP-Growth Algorithm The FP-Growth Algorithm is proposed by Han. It is an efficient and scalable method for mining the complete set of frequent patterns. Using prefix-tree structure for storing information about frequent patterns named frequent-pattern tree (FP-tree). Once an FP-tree has been constructed, it uses a recursive divide-and- conquer approach to mine the frequent item sets.
  • 22. Arranged Order A : 8 C : 8 E : 8 G : 5 B : 2 D : 2 F : 2 Remaining all O,I,J,L,P,M & N is with min_sup = 1 FP-Growth Algorithm - Example A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G Step:1 Freq. 1- Itemsets. Min_Sup ≥ 2 Step:2 Transactions with items sorted based on frequencies, and ignoring the infrequent items. FP-Tree Generation Building the FP-Tree ✔ Scan data to determine the support count of each item. ✔ Infrequent items are discarded, while the frequent items are sorted in decreasing support counts. ✔ Make a second pass over the data to construct the FP-tree. ✔ As the transactions are read, before being processed, their items are sorted according to the above order. TID Items 1 A B C E F O 2 A C G 3 E I 4 A C D E G 5 A C E G L 6 E J 7 A B C E F P 8 A C D 9 A C E G M 10 A C E G N Minimum Support = 2
  • 23. FP-Tree after reading 1st transaction A:8 C:8 E:8 G:5 B:2 D:2 F:2 A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G nul l A:1 C:1 E:1 B:1 F:1 Header
  • 24. Prof. Naimish R Vadodariya 24 #3160714 (DM) ⬥ Unit 3 – Concept Description, Mining Frequent Patterns, Associations & Correlations FP-Tree after reading 2nd transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:2 C:2 E:1 B:1 F:1 Header
  • 25. FP-Tree after reading 3rd transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:2 C:2 E:1 B:1 F:1 Header E:1
  • 26. FP-Tree after reading 4th transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:3 C:3 E:2 B:1 F:1 Header E:1 G:1 D:1
  • 27. FP-Tree after reading 5th transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:4 C:4 E:3 B:1 F:1 Header E:1 G:2 D:1
  • 28. FP-Tree after reading 6th transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:4 C:4 E:3 B:1 F:1 Header E:2 G:2 D:1
  • 29. FP-Tree after reading 7th transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:5 C:5 E:4 B:2 F:2 Header E:2 G:2 D:1
  • 30. FP-Tree after reading 8th transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:6 C:6 E:4 B:2 F:2 Header E:2 G:2 D:1 D:1
  • 31. FP-Tree after reading 9th transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:7 C:7 E:5 B:2 F:2 Header E:2 G:3 D:1 D:1
  • 32. FP-Tree after reading 10th transaction A C E B F A C G E A C E G D A C E G E A C E B F A C D A C E G A C E G G:1 A:8 C:8 E:8 G:5 B:2 D:2 F:2 nul l A:8 C:8 E:6 B:2 F:2 Header E:2 G:4 D:1 D:1
  • 33. Apriori Algorithm – Example Min-support = 60% and Confidence = 80% TID Items 100 1,3,4,6 200 2,3,5,7 300 1,2,3,5,8 400 2,5,9,10 500 1,4 Itemse t Support 2,5 3 Rule Support Confidence Conf. (%) 2 🡪 5 3 3/3 = 1 100% 5 🡪 2 3 3/3 = 1 100% TID Items 1000 A,B,C 2000 A,C 3000 A,D 5000 B,E,F Itemse t Support A,C 2 Rule Support Confidence Conf. (%) A 🡪 C 2 2/3 = 0.66 66% C 🡪 A 2 2/2 = 1 100% Min-support = 50%
  • 34. FP-Growth Algorithm – Example B: 8 A: 5 nul l C: 3 D: 1 A: 2 C: 1 D: 1 E:1 D: 1 E:1 C: 3 D: 1 D: 1 E:1 Header FP-Tree Minimum Support = 3 TID Items 1 A B 2 B C D 3 A C D E 4 A D E 5 A B C 6 A B C D 7 B C 8 A B C 9 A B D 10 B C E Item Support B 8 A 7 C 7 D 5 E 3