1. Association Rules and Frequent Pattern Analysis
Dr. Iqbal H. Sarker
Dept of CSE, CUET
Research Lab Web: Sarker DataLAB (http://sarkerdatalab.com/)
2. Today’s Agenda
Introduction to Association Rules
Motivation with Examples
Algorithms
How It Works
Real-life Application Areas
Summary
3. Introduction to AR
Ideas come from the market basket analysis (MBA)
◼ Let’s go shopping!
Customer 1: milk, eggs, sugar, bread
Customer 2: eggs, sugar
Customer 3: milk, eggs, cereal, bread
◼ What do my customers buy? Which products are bought together?
◼ Aim: Find associations and correlations between the different
items that customers place in their shopping basket
4. Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases.
6. Introduction to AR
Formalizing the problem a little bit
◼ Transaction database T: a set of transactions T = {t1, t2, …, tn}
◼ Each transaction t contains a set of items (an itemset)
◼ An itemset is a collection of items I = {i1, i2, …, im}
General aim:
◼ Find frequent/interesting patterns, associations, correlations, or
causal structures among sets of items or elements in
databases or other information repositories.
◼ Express these relationships as association rules
➢ X ⇒ Y
7. What’s an Interesting Rule?
An association rule is an implication between two itemsets
◼ X ⇒ Y

TID   Items
T1    bread, jelly, peanut-butter
T2    bread, peanut-butter
T3    bread, milk, peanut-butter
T4    beer, bread
T5    beer, milk

There are many measures of interest. The two most used are:
◼ Support (s)
➢ The occurring frequency of the rule, i.e., the fraction of transactions that contain both X and Y:
s = σ(X ∪ Y) / (no. of transactions)
◼ Confidence (c)
➢ The strength of the association, i.e., how often items in Y appear in transactions that contain X:
c = σ(X ∪ Y) / σ(X)
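A minimal sketch of these two measures in Python, using the five transactions above (the helper functions and names are mine, not from the slides):

# Transactions from the table above (one set of items per TID).
transactions = [
    {"bread", "jelly", "peanut-butter"},   # T1
    {"bread", "peanut-butter"},            # T2
    {"bread", "milk", "peanut-butter"},    # T3
    {"beer", "bread"},                     # T4
    {"beer", "milk"},                      # T5
]

def support(itemset, transactions):
    # Fraction of transactions that contain every item in `itemset`.
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(x, y, transactions):
    # support(X ∪ Y) / support(X): how often Y appears when X does.
    return support(x | y, transactions) / support(x, transactions)

# Example rule: {bread} ⇒ {peanut-butter}
print(support({"bread", "peanut-butter"}, transactions))       # 0.6 (3 of 5)
print(confidence({"bread"}, {"peanut-butter"}, transactions))  # ≈ 0.75 (3 of the 4 baskets with bread)

So {bread} ⇒ {peanut-butter} has support 60% and confidence 75%: peanut-butter appears in three of the four transactions that contain bread.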
8. Mining Association Rules: An Example
Min. support 50%, min. confidence 50%

Transaction-id   Items bought
10               A, B, C
20               A, C
30               A, D
40               B, E, F

Frequent pattern   Support
{A}                75%
{B}                50%
{C}                50%
{A, C}             50%

For the rule A ⇒ C:
support = support({A} ∪ {C}) = 2/4 = 50%
confidence = support({A} ∪ {C}) / support({A}) = 50% / 75% = 66.6%
9. The Apriori Algorithm: Basics
The name, Apriori, is based on the fact that the algorithm
uses prior knowledge of frequent itemset properties
It consists of two steps
1. Generate all frequent itemsets whose support ≥
minsup
2. Use frequent itemsets to generate association rules
So, let’s pay attention to the first step
10. Apriori
(Figure: the itemset lattice over items {A, B, C, D, E}, from the empty set at the top, through all 1-, 2-, 3-, and 4-itemsets, down to ABCDE at the bottom.)
Given n items, we have 2^n possible itemsets.
◼ Do we have to generate them all?
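To see the blow-up concretely, here is a quick sketch that enumerates every non-empty itemset over five items (illustrative only):

from itertools import combinations

items = ["A", "B", "C", "D", "E"]

# All non-empty itemsets: C(n,1) + C(n,2) + ... + C(n,n) = 2^n - 1.
itemsets = [set(c) for k in range(1, len(items) + 1)
                   for c in combinations(items, k)]
print(len(itemsets))  # 31 for n = 5; the count doubles with each added item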
11. Apriori
Let’s avoid expanding the whole lattice
Key idea:
◼ Use the Apriori property: every subset of a frequent itemset is also frequent, so any superset of an infrequent itemset must be infrequent
Therefore, the algorithm iterates:
◼ Create candidate itemsets
◼ Only continue expanding those whose support ≥ minsup
12. Apriori: Pseudo-code
Join Step: Ck is generated by joining Lk-1 with itself
Prune Step: Any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset
Pseudo-code:
Ck: candidate itemsets of size k
Lk: frequent itemsets of size k
L1 = {frequent 1-itemsets};
for (k = 1; Lk ≠ ∅; k++) do begin
  Ck+1 = candidates generated from Lk;
  for each transaction t in database do
    increment the count of all candidates in Ck+1 that are contained in t;
  Lk+1 = candidates in Ck+1 with support ≥ min_support;
end
return ∪k Lk;
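A runnable Python sketch of this loop, assuming transactions are given as sets of item labels (the function and variable names are mine, and the counting is deliberately naive rather than optimized):

from itertools import combinations

def apriori(transactions, minsup):
    # Return all frequent itemsets (as frozensets) with support >= minsup.
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def keep_frequent(candidates):
        # Count each candidate and keep those that meet minsup.
        return {c for c in candidates
                if sum(1 for t in transactions if c <= t) / n >= minsup}

    # L1: frequent 1-itemsets.
    L = keep_frequent({frozenset([i]) for t in transactions for i in t})
    result = set(L)
    k = 2
    while L:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in L for b in L if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in L for s in combinations(c, k - 1))}
        L = keep_frequent(candidates)
        result |= L
        k += 1
    return result

# The four-transaction example from the earlier slide, minsup = 50%:
T = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
print(sorted(sorted(s) for s in apriori(T, 0.5)))
# [['A'], ['A', 'C'], ['B'], ['C']]  -- matches the table: {A}, {B}, {C}, {A, C}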
13. Illustration of the Apriori Principle
(Figure: the itemset lattice over {A, B, C, D, E}, drawn twice. In the first drawing one itemset is found to be infrequent; in the second, that itemset and all of its supersets are pruned as infrequent supersets.)
14. Another Example
(Figure: the same itemset lattice over {A, B, C, D, E}, with an infrequent itemset and its supersets marked for pruning.)
16. Apriori
Remember that Apriori consists of two steps
1. Generate all frequent itemsets whose support ≥ minsup
2. Use frequent itemsets to generate association rules
We have accomplished step 1, so we now have all frequent itemsets
Let’s turn to the second step
17. Rule Generation in Apriori
Given a frequent itemset L
◼ Find all non-empty proper subsets F of L such that the association rule F ⇒ (L − F) satisfies the minimum confidence
◼ Create the rule F ⇒ (L − F)
If L = {A, B, C}
◼ The candidate rules are: AB ⇒ C, AC ⇒ B, BC ⇒ A, A ⇒ BC, B ⇒ AC, C ⇒ AB
◼ In general, there are 2^k − 2 candidate rules, where k is the size of the itemset L
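A small sketch of this enumeration in Python, reusing the four-transaction example from earlier (names are illustrative; it simply tests all 2^k − 2 candidate rules rather than pruning them):

from itertools import combinations

def rules_from_itemset(L, transactions, minconf):
    # Yield every rule F ⇒ L − F whose confidence meets minconf.
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    L = frozenset(L)
    for r in range(1, len(L)):                      # all proper subset sizes
        for F in map(frozenset, combinations(L, r)):
            conf = support(L) / support(F)
            if conf >= minconf:
                yield F, L - F, conf

T = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
for lhs, rhs, conf in rules_from_itemset({"A", "C"}, T, 0.5):
    print(sorted(lhs), "=>", sorted(rhs), f"({conf:.0%})")
# Prints, in some order: ['A'] => ['C'] (67%) and ['C'] => ['A'] (100%)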