LAB1:
APRIORI ALGORITHM
Prepared by: TA. Amany Adel Ali
a.adel@fci-cu.edu.eg
Definition Of Apriori Algorithm:
 Apriori is a seminal algorithm for mining frequent itemsets for Boolean association
rules.
 Apriori uses a bottom-up approach: frequent subsets are extended one item
at a time (a step known as candidate generation), and the groups of candidates are tested
against the data.
Key Concepts:
 Frequent itemset: a set of items that occur together in a dataset with support at least
the minimum support threshold.
 Frequent k-itemsets: the set of all itemsets of size k that satisfy minimum support
(denoted Lk for the kth level).
 Apriori property: all nonempty subsets of a frequent itemset must also be frequent.
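The Apriori property gives a concrete pruning test: a candidate k-itemset can be discarded as soon as any of its (k−1)-subsets is missing from Lk−1. A minimal Python sketch (the L2 below is hypothetical, for illustration only):

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_prev):
    """Apriori property: if any (k-1)-subset of a k-itemset is not
    frequent, the k-itemset itself cannot be frequent."""
    k = len(candidate)
    return any(frozenset(s) not in frequent_prev
               for s in combinations(candidate, k - 1))

# Hypothetical set of frequent 2-itemsets, for illustration
L2 = {frozenset(p) for p in [("I1", "I2"), ("I1", "I3"),
                             ("I2", "I3"), ("I2", "I5")]}

print(has_infrequent_subset(("I1", "I2", "I3"), L2))  # False -> keep the candidate
print(has_infrequent_subset(("I1", "I3", "I5"), L2))  # True  -> prune: {I1, I5} is missing
```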
Apriori Algorithm Example
 A typical example of frequent itemset mining is market basket analysis, which finds
sets of items that customers frequently buy together.
Steps To Perform Apriori Algorithm
Apriori Algorithm: An Example
 |D| = 9
 min_sup = 2
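The level-wise search can be sketched end to end in Python. The nine transactions below are an assumption, since the slide's table is not reproduced in this text; they are the classic AllElectronics-style example that matches |D| = 9 and min_sup = 2:

```python
from itertools import combinations

# Assumed nine-transaction database (the slide's table is not in this text)
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
MIN_SUP = 2

def support(itemset, transactions):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def apriori(transactions, min_sup):
    """Level-wise search: L1, L2, ... until no frequent itemsets remain."""
    items = {i for t in transactions for i in t}
    Lk = {frozenset([i]) for i in items
          if support(frozenset([i]), transactions) >= min_sup}
    frequent = {}
    k = 1
    while Lk:
        frequent.update({s: support(s, transactions) for s in Lk})
        k += 1
        # Join step: merge pairs from L(k-1) into k-itemset candidates
        candidates = {a | b for a in Lk for b in Lk if len(a | b) == k}
        # Prune step (Apriori property), then check support against the data
        Lk = {c for c in candidates
              if all(frozenset(s) in frequent for s in combinations(c, k - 1))
              and support(c, transactions) >= min_sup}
    return frequent

L = apriori(D, MIN_SUP)
for itemset, count in sorted(L.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```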
Apriori Algorithm: An Example Cont.
 The algorithm uses L3 ⋈ L3 (join) to generate a candidate set of 4-itemsets, C4.
 The join results in {I1, I2, I3, I5}.
 The itemset {I1, I2, I3, I5} is pruned because its subset {I2, I3, I5} is not frequent.
 Therefore C4 = Ø and the algorithm terminates.
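These join and prune steps can be checked mechanically. A short sketch using the L3 from this example:

```python
from itertools import combinations

# L3 from the worked example
L3 = [frozenset(s) for s in ({"I1", "I2", "I3"}, {"I1", "I2", "I5"})]
frequent3 = set(L3)

# Join step: L3 ⋈ L3 yields the single 4-itemset {I1, I2, I3, I5}
joined = {a | b for a in L3 for b in L3 if len(a | b) == 4}

# Prune step: the 3-subset {I2, I3, I5} is not in L3, so the candidate
# is dropped and C4 ends up empty -- the algorithm terminates
C4 = {c for c in joined
      if all(frozenset(s) in frequent3 for s in combinations(c, 3))}
print(len(joined), len(C4))  # 1 0
```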
Generating Association Rules from Frequent Itemsets
 Once the frequent itemsets from the transactions in a database D have been found, it is
straightforward to generate strong association rules from them.
 Strong association rules satisfy both minimum support and minimum confidence.
 For each frequent itemset l, generate all nonempty proper subsets of l.
 For each subset s, output the rule s → (l − s) if its confidence ≥ the minimum
confidence threshold.
 Because the rules are generated from frequent itemsets, each one automatically satisfies
the minimum support.
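The procedure above can be sketched in Python. The support counts used here are taken from the worked example in this lab (an assumption insofar as the original transaction table is not reproduced in this text):

```python
from itertools import combinations

def generate_rules(frequent, min_conf):
    """For each frequent itemset l, emit A -> (l - A) for every nonempty
    proper subset A with confidence = sup(l) / sup(A) >= min_conf."""
    rules = []
    for l, sup_l in frequent.items():
        if len(l) < 2:
            continue
        for r in range(1, len(l)):
            for a in combinations(sorted(l), r):
                a = frozenset(a)
                conf = sup_l / frequent[a]
                if conf >= min_conf:
                    rules.append((a, l - a, conf))
    return rules

# Support counts from the worked example
frequent = {
    frozenset({"I1"}): 6, frozenset({"I2"}): 7, frozenset({"I5"}): 2,
    frozenset({"I1", "I2"}): 4, frozenset({"I1", "I5"}): 2,
    frozenset({"I2", "I5"}): 2, frozenset({"I1", "I2", "I5"}): 2,
}

for a, b, conf in generate_rules(frequent, 0.70):
    print(sorted(a), "->", sorted(b), f"{conf:.0%}")
```

Note that rules are generated from every frequent itemset of size ≥ 2, so strong rules such as {I5} → {I1} also appear, not only the ones derived from the 3-itemset.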
Generating Association Rules from Frequent Itemsets
 The minimum confidence threshold = 70%.
 The frequent itemset X = {I1, I2, I5}.
 The nonempty subsets of X are {I1, I2}, {I1, I5}, {I2, I5}, {I1}, {I2}, and {I5}.
{I1, I2} → I5, confidence = 2/4 = 50%
{I1, I5} → I2, confidence = 2/2 = 100% Strong
{I2, I5} → I1, confidence = 2/2 = 100% Strong
{I1} → {I2, I5}, confidence = 2/6 = 33%
{I2} → {I1, I5}, confidence = 2/7 = 29%
{I5} → {I1, I2}, confidence = 2/2 = 100% Strong
Generating Association Rules from Frequent Itemsets
 The minimum confidence threshold = 70%.
 The frequent itemset Y = {I1, I2, I3}.
 The nonempty subsets of Y are {I1, I2}, {I1, I3}, {I2, I3}, {I1}, {I2}, and {I3}.
{I1, I2} → I3, confidence = 2/4 = 50%
{I1, I3} → I2, confidence = 2/4 = 50%
{I2, I3} → I1, confidence = 2/4 = 50%
{I1} → {I2, I3}, confidence = 2/6 = 33%
{I2} → {I1, I3}, confidence = 2/7 = 29%
{I3} → {I1, I2}, confidence = 2/6 = 33%
None of these rules meets the 70% threshold, so Y yields no strong rule.
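The arithmetic for Y can be verified directly from the support counts (taken from the worked example):

```python
# Support counts from the worked example
sup = {("I1",): 6, ("I2",): 7, ("I3",): 6,
       ("I1", "I2"): 4, ("I1", "I3"): 4, ("I2", "I3"): 4,
       ("I1", "I2", "I3"): 2}

Y = ("I1", "I2", "I3")
# confidence(A -> Y \ A) = sup(Y) / sup(A)
confs = {a: sup[Y] / sup[a]
         for a in (("I1", "I2"), ("I1", "I3"), ("I2", "I3"),
                   ("I1",), ("I2",), ("I3",))}
for a, c in confs.items():
    print(a, "->", f"{c:.0%}")
# every confidence is below 70%, so Y produces no strong rule
```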
Apriori Algorithm: Another Example
 Part (a): Apply the Apriori algorithm to the following data set:
Step-1: Index the Data
Step-2: Calculate the Support of Each Itemset
Step-3: Continue to Calculate the Support
Generating Association Rules from Frequent Itemsets
 Part (b): Generate the strong association rules from the frequent itemsets.
 The minimum confidence threshold = 70%.
 The frequent itemset X = {Milk, Bread, Eggs} = {1, 2, 3}.
 The nonempty subsets of X are {1, 2}, {1, 3}, {2, 3}, {1}, {2}, and {3}.
Shopping Basket Analysis using SQL Server and Visual Studio
 https://www.youtube.com/watch?v=GyakcKKGwGA