Product recommendation

Frequent Itemset Mining
Association Rules
A-priori algorithm
Joe Duimstra
Aug 20, 2015

Frequent Itemsets
Try to identify the items that are frequently bought together
Example: people who buy a, b, c tend to buy d, e
Amazon:
– Keeps a log of what you've bought
– Uses the logs of all users to find items that are frequently bought together

Typical Problem
● A large set of items
● A large set of baskets
● Each basket has a small subset of items
● Define 'frequent' itemsets as those that appear in at least s baskets, where s is the 'support threshold'
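
The definition of 'frequent' above can be made concrete with a small sketch. The function names (`support`, `is_frequent`) and the sample baskets are made up for illustration and are not from the original slides.

```python
def support(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    wanted = set(itemset)
    return sum(1 for basket in baskets if wanted <= set(basket))


def is_frequent(itemset, baskets, s):
    """True if `itemset` appears in at least `s` baskets (the support threshold)."""
    return support(itemset, baskets) >= s


# Hypothetical toy data: each basket is a small subset of the items.
baskets = [
    {"milk", "bread", "beer"},
    {"milk", "bread"},
    {"bread", "beer"},
    {"milk", "juice"},
]

print(support({"milk", "bread"}, baskets))         # 2
print(is_frequent({"milk", "bread"}, baskets, 2))  # True
print(is_frequent({"milk", "bread"}, baskets, 3))  # False
```

This version rescans every basket for each itemset, which is fine for a toy example but not for large data.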

Small example
From: Jure Leskovec, Stanford CS246

Association Rules
If-then rules about basket contents: {i1, i2, ..., ik} → j means that a basket containing i1, ..., ik is likely to also contain j

Computing Association Rules
1. Read data from disk. Data is typically stored basket-by-basket
2. Generate pairs, triples, quadruples, etc. of items as each basket is read
3. Count the number of occurrences of each itemset
4. Calculate confidence based on the support for the itemsets
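
The four steps above can be sketched as follows. This is a minimal in-memory illustration, not the author's implementation: the baskets are a plain list rather than a file on disk, the names `count_itemsets` and `confidence` are hypothetical, and confidence uses the usual definition conf(I → j) = support(I ∪ {j}) / support(I).

```python
from collections import Counter
from itertools import combinations


def count_itemsets(baskets, max_size=3):
    """Steps 1-3: read each basket once and count every itemset up to `max_size`."""
    counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # sorted so each itemset has one canonical tuple
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def confidence(counts, antecedent, consequent):
    """Step 4: conf(I -> j) = support(I plus j) / support(I)."""
    lhs = tuple(sorted(antecedent))
    both = tuple(sorted(set(antecedent) | {consequent}))
    if counts[lhs] == 0:
        return 0.0
    return counts[both] / counts[lhs]


# Hypothetical toy data
baskets = [
    {"milk", "bread", "beer"},
    {"milk", "bread"},
    {"bread", "beer"},
    {"milk", "juice"},
]

counts = count_itemsets(baskets)
# 2 of the 3 baskets with milk also contain bread
print(confidence(counts, {"milk"}, "bread"))
# every basket with beer also contains bread
print(confidence(counts, {"beer"}, "bread"))
```

Every itemset of every basket gets its own counter here, which is exactly what stops working when the data is large.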
BUT...

...If the data is large
1. Disk I/O will slow processing: the fastest way is to read the entire data set sequentially, rather than randomly accessing different baskets
2. Itemset counting is limited by storing counts in memory; spilling counts to disk adds I/O and further slows computation
   1. For single items (itemsets of size 1), memory is O(n), where n is the number of items
   2. For pairs (itemsets of size 2), memory is O(n^2)
   3. Memory quickly runs out for large n
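
A rough back-of-the-envelope sketch of the memory growth. The helper names, the 4 bytes per counter, and the item count of 100,000 are assumptions chosen for illustration only.

```python
from math import comb


def candidate_count(n_items, k):
    """Number of distinct itemsets of size k that n_items can form."""
    return comb(n_items, k)


def count_memory_bytes(n_items, k, bytes_per_count=4):
    """Memory needed if every possible itemset of size k gets its own counter."""
    return candidate_count(n_items, k) * bytes_per_count


n = 100_000  # hypothetical number of distinct items
print(candidate_count(n, 1))           # 100000 single items
print(candidate_count(n, 2))           # 4999950000 pairs
print(count_memory_bytes(n, 2) / 1e9)  # roughly 20 GB just for pair counters
```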

A-priori Algorithm

Uses multiple passes through the data and counts only selected itemsets

Main idea
– If a set of items I appears at least s times, so does every subset J of I
– Contrapositive for pairs:
• If item i does not appear in s baskets, then no pair including i can appear in s baskets
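
The main idea can be sketched for pairs as a two-pass procedure: pass 1 counts single items, and pass 2 counts only pairs whose items are both frequent. This is a simplified in-memory illustration with a hypothetical function name and toy data; a real implementation would read the baskets from disk on each pass.

```python
from collections import Counter
from itertools import combinations


def apriori_pairs(baskets, s):
    """Return {pair: count} for every pair that appears in at least s baskets."""
    # Pass 1: count individual items and keep those meeting the support threshold.
    item_counts = Counter()
    for basket in baskets:
        item_counts.update(set(basket))
    frequent_items = {item for item, c in item_counts.items() if c >= s}

    # Pass 2: count only pairs made of two frequent items.
    # A pair containing an infrequent item cannot reach s baskets, so it is skipped.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        for pair in combinations(kept, 2):
            pair_counts[pair] += 1

    return {pair: c for pair, c in pair_counts.items() if c >= s}


# Hypothetical toy data
baskets = [
    {"milk", "bread", "beer"},
    {"milk", "bread"},
    {"bread", "beer"},
    {"milk", "juice"},
]

print(apriori_pairs(baskets, s=2))
# {('beer', 'bread'): 2, ('bread', 'milk'): 2}
```

With s=2, juice is dropped after pass 1, so no pair containing juice ever gets a counter in pass 2.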
