Apriori algorithm

  1. Jung Hoon Kim, N5, Room 2239, E-mail: junghoon.kim@kaist.ac.kr, 2014.01.07, KAIST Knowledge Service Engineering Data Mining Lab.
  2. Introduction
     - Frequent pattern and association rule mining is one of the few data mining techniques that did not originate in the machine learning community
     - Apriori algorithm
     - AprioriTid algorithm
     - AprioriAll algorithm
     - FP-Tree algorithm
  3. Notation
  4. Principle
     - Downward closure property
     - If an itemset is frequent, then all of its subsets must also be frequent
     - If an itemset is not frequent, none of its supersets can be frequent
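
The downward closure property is what lets Apriori discard a candidate without counting it. A minimal sketch, assuming itemsets are stored as `frozenset` objects; the function name `has_infrequent_subset` and the sample itemsets are illustrative and do not come from the slides:

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_smaller):
    """Return True if some (k-1)-subset of the k-item candidate is not frequent."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_smaller
        for subset in combinations(candidate, k - 1)
    )

# Hypothetical frequent 2-itemsets, for illustration only.
frequent_2 = {frozenset("AB"), frozenset("AC"), frozenset("BC"), frozenset("BD")}

# {A, B, C}: every 2-subset is frequent, so the candidate must be counted.
print(has_infrequent_subset(frozenset("ABC"), frequent_2))
# {A, B, D}: {A, D} is not frequent, so the candidate can be pruned unseen.
print(has_infrequent_subset(frozenset("ABD"), frequent_2))
```

A candidate that fails this check cannot be frequent, so no database scan is spent on it.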
  5. Apriori algorithm
     - Pseudo code
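
The pseudo code on this slide did not survive extraction. As a stand-in, here is a small Python sketch of the usual level-wise scheme (join, prune, count). It is an illustration under stated assumptions, not the slide's own listing: `min_support` is an absolute count, and the four transactions are made-up sample data.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: one full scan of the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical transaction database, for illustration only.
db = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
result = apriori(db, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Each pass of the `while` loop rescans every transaction, which is the cost the later slides set out to reduce.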
  6. Example
  7. Discussion
     - Repeated database scans make the computation expensive
     - minsup and minconf must be specified in advance
     - Candidate itemsets are stored in a hash tree; some implementations use a trie instead
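
To make the two thresholds concrete, the sketch below computes support and confidence using their standard definitions: support is the fraction of transactions containing an itemset, and confidence of X → Y is support(X ∪ Y) / support(X). The function names and the sample transactions are assumptions made for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= frozenset(t) for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(X u Y) / support(X); assumes the antecedent occurs at least once."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

# Hypothetical transaction database, for illustration only.
db = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

print(support({2, 5}, db))        # {2, 5} appears in 3 of 4 transactions
print(confidence({2}, {5}, db))   # every transaction with 2 also has 5
print(confidence({3}, {2}, db))   # 2 of the 3 transactions with 3 also have 2
```

A rule is reported only if its support is at least minsup and its confidence is at least minconf, so both values shape the output before any mining starts.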
  8. AprioriTid
  9. AprioriTid
  10. AprioriTid
  11. AprioriTid
  12. FP-Growth
      - To avoid multiple database scans
        - The cost of scanning the database is too high
      - To avoid generating large numbers of candidates
        - In the Apriori algorithm, the bottleneck is candidate generation
      - How can we solve these problems?
  13. FP-Growth
      - The algorithm is simple:
        1. Scan the database once and find the frequent 1-itemsets (single-item patterns)
        2. Sort the frequent items in descending order of frequency to get the f-list (f-list = f-c-a-b-m-p)
        3. Scan the database again and construct the FP-tree
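
The three steps above can be sketched as follows. The transaction table from the slides did not survive, so the five transactions below are an assumption: they are the widely used textbook example that yields the f-list f-c-a-b-m-p at a minimum support count of 3. The `Node` class and function names are illustrative. Because several items tie in frequency, the f-list order is taken from the slide rather than recomputed.

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def frequent_items(transactions, min_support):
    # Step 1: one scan to count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    return {i: c for i, c in counts.items() if c >= min_support}

def build_fp_tree(transactions, f_list):
    # Step 3: second scan; insert each transaction's frequent items in f-list order.
    rank = {item: r for r, item in enumerate(f_list)}
    root = Node(None, None)
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1   # shared prefixes accumulate counts
    return root

# Assumed sample database (TIDs 100 to 500), not recovered from the slides.
db = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]

freq = frequent_items(db, min_support=3)
f_list = list("fcabmp")             # Step 2: order as given on the slide
assert set(f_list) == set(freq)     # same items, tie order fixed by the slide

def show(node, depth=0):
    for child in node.children.values():
        print("  " * depth + f"{child.item}:{child.count}")
        show(child, depth + 1)

tree = build_fp_tree(db, f_list)
show(tree)
```

Transactions that share a prefix in f-list order share a path in the tree, which is why sorting by descending frequency tends to keep the tree compact.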
  14. FP-Growth Algorithm
  15. FP-Tree
      - Scanning the transaction with TID=100
  16. FP-Tree
      - Scanning the transaction with TID=200
  17. FP-Tree
      - Final FP-Tree
  18. Mining an FP-Tree: (I) forming conditional pattern bases, (II) constructing conditional FP-trees, (III) recursively mining the conditional FP-trees
  19. Conditional pattern base
      - The set of prefix paths in the FP-tree that co-occur with a frequent item, which is treated as the suffix pattern
      - For example, for the suffix m:
        - <f, c, a>, support 2
        - <f, c, a, b>, support 1
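
A short sketch of how the conditional pattern base for m can be obtained. To stay self-contained it works from the f-list-ordered transactions instead of walking an FP-tree; the prefix of each transaction before the suffix item is the same prefix path the tree would hold, with the same counts. The five transactions are assumed sample data (the textbook example consistent with the f-list f-c-a-b-m-p), and the function name is illustrative.

```python
from collections import Counter

def conditional_pattern_base(item, transactions, f_list):
    """Map each prefix path that precedes `item` to how often it occurs."""
    rank = {i: r for r, i in enumerate(f_list)}
    base = Counter()
    for t in transactions:
        # Keep only frequent items, ordered as in the f-list.
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        if item in ordered:
            prefix = tuple(ordered[:ordered.index(item)])
            if prefix:                 # an empty prefix contributes nothing
                base[prefix] += 1
    return dict(base)

# Assumed sample database, not recovered from the slides.
db = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
f_list = list("fcabmp")

m_base = conditional_pattern_base("m", db, f_list)
p_base = conditional_pattern_base("p", db, f_list)
print(m_base)
print(p_base)
```

The conditional FP-tree for m is then built from this base exactly as the full tree was built from the database, keeping only items that remain frequent within it.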
  20. Conditional pattern tree
      - {m}’s conditional pattern tree
  21. Pseudo Code
  22. Conclusion
      - In data mining, association rules are useful for analyzing and predicting customer behavior. They play an important part in shopping basket data analysis, product clustering, catalog design and store layout.
  23. Thank you
