Mining Association Rules in Large Databases, Apriori Algorithm, Association Rule Mining

- 1. Chapter 7: Mining Association Rules in Large Databases. Data Warehouse/Data Mining. Er. Nawaraj Bhandari
- 2. Introduction Association rule mining finds interesting association or correlation relationships among a large set of data items. With massive amounts of data continuously being collected and stored, many industries are becoming interested in mining association rules from their databases. Discovering interesting association relationships among huge amounts of business transaction records can help in many business decision-making processes, such as catalog design, cross-marketing, and loss-leader analysis. A typical example of association rule mining is market basket analysis.
- 3. Association Rules Analyze and predict customer behavior. Expressed as if/then statements. Examples: Bread => Butter: if someone purchases bread, he or she is likely to purchase butter. buys{onions, potatoes} => buys{tomatoes}.
- 4. Parts of Association Rules Bread => Butter [20%, 45%]. Bread: antecedent. Butter: consequent. 20% is the support and 45% is the confidence.
- 5. Support and Confidence For a rule A => B: support denotes the probability that a transaction contains both A and B. Confidence denotes the probability that a transaction containing A also contains B.
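- 6. Support and Confidence Consider a supermarket with 100 transactions in total. Bread appears in 20 of them, so 20/100 * 100 = 20%, which is the support of bread. Of those 20 bread transactions, 9 also contain butter, so 9/20 * 100 = 45%, which is the confidence of Bread => Butter. By the definition of support for a rule (transactions containing both items), the support of Bread => Butter itself is 9/100 * 100 = 9%.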
- 7. Types of Association Rules Single-dimensional association rule. Multidimensional association rule. Hybrid-dimensional association rule.
- 8. Single-dimensional association rule Bread => Butter. Dimension: buying. The only dimension here is buying.
- 9. Multidimensional association rule Involves 2 or more dimensions. Occupation(I.T), Age(>22) => buys(laptops). Here there are 3 dimensions: occupation, age, and buys. In a multidimensional rule, no dimension is repeated.
- 10. Hybrid-dimensional association rule Dimensions (predicates) can be repeated. Time(5 o'clock), Buy(tea) => Buy(biscuits): if a person gets tea at 5 o'clock, he or she is likely to get biscuits as well. Here the Buy dimension is repeated.
- 11. Fields of application of association rules Web usage mining, banking, bioinformatics, market basket analysis, credit/debit card analysis, product clustering, catalog design.
- 12. Algorithms for association rule mining Apriori algorithm, Eclat algorithm, FP-Growth algorithm.
- 13. Apriori Algorithm If you bought a toothbrush, you may be shown a suggestion for toothpaste; if you bought beer, you may be shown suggestions for chips, potato crackers, etc. Many e-commerce websites (such as Flipkart and Amazon) use suggestions of this kind. The Apriori algorithm is one way to mine the frequent itemsets behind such suggestions.
- 14. Apriori Algorithm
- 15. Apriori Algorithm Candidate set C1 (itemset: support count): M: 3, O: 4, N: 2, K: 5, E: 4, Y: 3, D: 1, A: 1, U: 1, C: 2
- 16. Apriori Algorithm L1, the frequent 1-itemsets that meet the minimum support (itemset: support count): M: 3, O: 4, K: 5, E: 4, Y: 3
- 17. Apriori Algorithm Candidate set C2 (itemset: support count): {M, O}: 1, {M, K}: 3, {M, E}: 2, {M, Y}: 2, {O, K}: 3, {O, E}: 3, {O, Y}: 2, {K, E}: 4, {K, Y}: 3, {E, Y}: 2
- 18. Apriori Algorithm L2, the frequent 2-itemsets that meet the minimum support (itemset: support count): {M, K}: 3, {O, K}: 3, {O, E}: 3, {K, E}: 4, {K, Y}: 3
- 19. Apriori Algorithm Candidate set C3 (itemset: support count): {M, K, O}: 1, {M, K, E}: 2, {M, K, Y}: 2, {O, K, E}: 3, {O, K, Y}: 2
- 20. Apriori Algorithm L3, the frequent 3-itemsets that meet the minimum support (itemset: support count): {O, K, E}: 3
- 21. Apriori Algorithm Now create association rules with support and confidence for {O, K, E}, for example "O AND K GIVES E" (O^K=>E). Confidence = support count of the whole itemset / support count of the antecedent. For O^K=>E, the confidence is 3/3 = 1. Rules (rule: support count, confidence, confidence %): O^K=>E: 3, 3/3 = 1, 100%; O^E=>K: 3, 3/3 = 1, 100%; K^E=>O: 3, 3/4 = 0.75, 75%; E=>O^K: 3, 3/4 = 0.75, 75%; K=>O^E: 3, 3/5 = 0.6, 60%; O=>K^E: 3, 3/4 = 0.75, 75%
- 22. Apriori Algorithm Compare these with the minimum confidence of 80%. Rules that pass (rule: support count, confidence, confidence %): O^K=>E: 3, 3/3 = 1, 100%; O^E=>K: 3, 3/3 = 1, 100%. Hence the final association rules are O^K=>E and O^E=>K. This process is called market basket analysis.
- 23. Pros and Cons of Association Rule Mining Pros It is an easy-to-implement and easy-to-understand algorithm. It can be used on large itemsets. Cons Sometimes, it may need to find a large number of candidate rules which can be computationally expensive. Calculating support is also expensive because it has to go through the entire database. June 8, 2019 Data Mining: Concepts and Techniques 23
- 24. Assignment Minimum support: 2. Minimum confidence: 70%. Use the Apriori algorithm to find the frequent itemsets and strong association rules. Transactions (TID: items): 1: I1, I3, I4; 2: I2, I3, I5; 3: I1, I2, I3, I5; 4: I2, I5
- 25. References 1. Anahory, S., Murray, D., “Data Warehousing in the Real World”, Pearson Education. 2. Kimball, R., “The Data Warehouse Toolkit”, Wiley, 1996. 3. Teorey, T. J., “Database Modeling and Design: The Entity-Relationship Approach”, Morgan Kaufmann Publishers, Inc., 1990. 4. Chaudhuri, S., “An Overview of Data Warehousing and OLAP Technology”, Microsoft Research. 5. Shahzad, M. A., “Data Warehousing with Oracle”. 6. Han, J., Kamber, M., “Data Mining: Concepts and Techniques”, Second Edition, Morgan Kaufmann, ISBN 978-1-55860-901-3.
- 26. ANY QUESTIONS?
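
The support and confidence arithmetic from slides 5 and 6 can be sketched in a few lines of Python. This is only an illustration: the function names `support` and `confidence` are made up for this sketch, and the counts (100 transactions, 20 with bread, 9 with bread and butter) are the ones quoted on slide 6. Applying the slide 5 definition of rule support (transactions containing both items) gives 9% for Bread => Butter, while 20% is the support of bread on its own.

```python
def support(count: int, total: int) -> float:
    """Fraction of all transactions that contain the itemset."""
    return count / total


def confidence(count_both: int, count_antecedent: int) -> float:
    """Fraction of antecedent transactions that also contain the consequent."""
    return count_both / count_antecedent


# Counts quoted on slide 6.
total_transactions = 100
bread = 20             # transactions containing bread
bread_and_butter = 9   # transactions containing bread and butter

bread_support = support(bread, total_transactions)            # 20%
rule_support = support(bread_and_butter, total_transactions)  # 9%
rule_confidence = confidence(bread_and_butter, bread)         # 45%

print(f"support(Bread) = {bread_support:.0%}")
print(f"support(Bread => Butter) = {rule_support:.0%}")
print(f"confidence(Bread => Butter) = {rule_confidence:.0%}")
```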
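
The step from C1 (slide 15) to L1 (slide 16) is a plain filter on support count. The sketch below assumes a minimum support count of 3; the slides do not state the threshold in text, so that value is inferred from which itemsets survive. The helper name `filter_frequent` is hypothetical.

```python
# Support counts of the candidate 1-itemsets, copied from slide 15.
c1 = {"M": 3, "O": 4, "N": 2, "K": 5, "E": 4, "Y": 3,
      "D": 1, "A": 1, "U": 1, "C": 2}

# Assumption: threshold inferred from slides 15-16, not stated in the text.
MIN_SUPPORT_COUNT = 3


def filter_frequent(candidates: dict, min_count: int) -> dict:
    """Keep only the itemsets whose support count reaches the threshold."""
    return {item: n for item, n in candidates.items() if n >= min_count}


l1 = filter_frequent(c1, MIN_SUPPORT_COUNT)
print(l1)
```

The same filter applies unchanged at every level (C2 to L2, C3 to L3); only the candidate counts differ.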
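
Candidate generation from L2 (slide 18) to C3 (slide 19) can be sketched as a join followed by a prune. The prune step, which drops any candidate that has an infrequent subset, is part of the standard Apriori algorithm but is not shown on the slides, so this sketch yields fewer candidates than the five listed on slide 19. The function name `generate_candidates` is an assumption made for illustration.

```python
from itertools import combinations


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune.

    A candidate survives only if every one of its (k-1)-item subsets
    is itself frequent (the Apriori property).
    """
    frequent = set(frequent)
    # Join: union any two frequent itemsets that together give k items.
    joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
    # Prune: discard candidates with an infrequent (k-1)-subset.
    return {
        c for c in joined
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }


# L2 copied from slide 18.
l2 = [frozenset(pair) for pair in
      (("M", "K"), ("O", "K"), ("O", "E"), ("K", "E"), ("K", "Y"))]

c3 = generate_candidates(l2, 3)
print(sorted(sorted(c) for c in c3))  # [['E', 'K', 'O']]
```

With pruning, {O, K, E} is the only candidate left, which matches L3 on slide 20; the other candidates on slide 19 each contain a pair that is missing from L2 (for example {M, O} in {M, K, O}), so their support never needs to be counted.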
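
Rule generation for {O, K, E} (slides 21 and 22) can be sketched as follows: every non-empty proper subset of the itemset is tried as an antecedent, and a rule is kept when its confidence reaches the minimum confidence of 80%. The support counts are copied from slides 16, 18 and 20; the function name `rules_from_itemset` and the tuple layout of the result are assumptions made for this sketch.

```python
from itertools import combinations

# Support counts copied from slides 16, 18 and 20.
support_count = {
    frozenset("O"): 4, frozenset("K"): 5, frozenset("E"): 4,
    frozenset("OK"): 3, frozenset("OE"): 3, frozenset("KE"): 4,
    frozenset("OKE"): 3,
}


def rules_from_itemset(itemset, counts, min_confidence):
    """Return (antecedent, consequent, confidence) for every strong rule."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            antecedent = frozenset(antecedent)
            # confidence = count(whole itemset) / count(antecedent)
            conf = counts[itemset] / counts[antecedent]
            if conf >= min_confidence:
                consequent = itemset - antecedent
                rules.append((sorted(antecedent), sorted(consequent), conf))
    return rules


strong = rules_from_itemset("OKE", support_count, 0.8)
for antecedent, consequent, conf in strong:
    print(f"{'^'.join(antecedent)} => {'^'.join(consequent)}: {conf:.0%}")
```

Only the rules with antecedents {E, O} and {K, O} pass the 80% threshold, in line with the two final rules on slide 22.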
