www.edureka.co/r-for-analytics
Know the Science Behind Product Recommendation
Slide 2
Objectives
At the end of this session, you will be able to explain:
 What data mining is
 What Business Analytics is
 The stages of analytics / data mining
 What R is
 An overview of Machine Learning
 What association rule mining is
 A use case
Slide 3
Business Analytics
Why is Business Analytics getting popular these days?
 Cost of storing data
 Cost of processing data
Slide 4
Stages of Analytics / Data Mining
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Slide 5
What is R
R is a programming language
R is an environment for statistical analysis
R is data analysis software
Slide 6
R : Characteristics
Effective and fast data handling and storage facilities
A suite of operators for calculations on arrays, lists, vectors, etc.
A large, integrated collection of tools for data analysis and visualization
Graphical facilities for data analysis and display, either on screen or on paper
A well-developed and effective programming language; R is an implementation of the ‘S’ language
A wide range of packages that extend and enrich the functionality of R
Slide 7
Who Uses R : Domains
 Telecom
 Pharmaceuticals
 Financial Services
 Life Sciences
 Education, etc
Slide 8
Common Machine Learning Algorithms
Types of Learning
Supervised Learning
Algorithms
 Naïve Bayes
 Support Vector Machines
 Random Forests
 Decision Trees
Unsupervised Learning
Algorithms
 K-means
 Fuzzy Clustering
 Hierarchical Clustering
 Gaussian mixture models
 Self-organizing maps
Slide 9
Association Rule Mining
 Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of
also purchasing one of three types of candy bars
 Customers who purchase maintenance agreements are very likely to
purchase large appliances
 When a new hardware store opens, one of the most commonly sold items is
toilet bowl cleaners
Slide 10
What is Association Rule Mining?
 In data mining, Association Rule Mining is a popular and well-researched method for discovering interesting relations between variables in large databases.
 It is intended to identify strong rules discovered in databases using different measures of interestingness.
 For example, a rule found in the sales data of a supermarket might indicate that if a customer buys onions and potatoes together, he or she is likely to also buy hamburger meat.
 Such information can be used as the basis for decisions about marketing activities such as promotional pricing or product placement.
Slide 11
How Good Is an Association Rule?
Here we have 5 customers. Each customer is given a basket, and their purchases are as follows:
Customer | Items purchased
1        | OJ, soda
2        | Milk, OJ, window cleaner
3        | OJ, detergent
4        | OJ, detergent, soda
5        | Window cleaner, soda
Here, customer 1 purchases OJ (orange juice) and soda; customer 2 purchases milk, OJ and window cleaner; customer 3 purchases OJ and detergent; customer 4 purchases OJ, detergent and soda; customer 5 purchases window cleaner and soda.
Now let's form a matrix to analyze the above data and draw inferences.
Slide 12
How Good Is an Association Rule?
Co-occurrence of Products
               | OJ | Window cleaner | Milk | Soda | Detergent
OJ             | 4  | 1              | 1    | 2    | 2
Window cleaner | 1  | 2              | 1    | 1    | 0
Milk           | 1  | 1              | 1    | 0    | 0
Soda           | 2  | 1              | 0    | 3    | 1
Detergent      | 2  | 0              | 0    | 1    | 2
Simple patterns derived from the above observation:
 OJ and soda are purchased together in 2 of the 5 baskets, as often as OJ and detergent and more often than any other pair
 Detergent is never purchased with milk or window cleaner
 Milk is never purchased with soda or detergent
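The co-occurrence counts above can be reproduced with a few lines of code. The following is a minimal sketch in plain Python (not R), using only the five baskets from the example; the function name `co_occurrence` and the lowercase item labels are illustrative choices, not part of the original slides.

```python
from collections import Counter
from itertools import combinations

# The five baskets from the example (one set of items per customer).
baskets = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def co_occurrence(baskets):
    """Count baskets per item (diagonal) and per unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for item in basket:
            counts[(item, item)] += 1
        for a, b in combinations(sorted(basket), 2):
            # Store both orders so the matrix is symmetric.
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


counts = co_occurrence(baskets)
items = ["OJ", "window cleaner", "milk", "soda", "detergent"]
for row in items:
    # A missing pair reads as 0 because Counter returns 0 for unseen keys.
    print(f"{row:<15}", *(counts[(row, col)] for col in items))
```

Each printed row corresponds to one row of the matrix; the diagonal holds the number of baskets containing that item.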
Slide 13
Association Rule Mining
The following three measures are the key constraints used to select association rules:
Support
The support Supp(X) = proportion of transactions in the data set which contain the itemset X.
Confidence
The confidence of a rule: Conf(X => Y) = Supp(X U Y) / Supp(X)
Lift
The lift of a rule: Lift(X => Y) = Supp(X U Y) / (Supp(X) × Supp(Y))
Now let's calculate the support and confidence for our five-customer example:
Rule            | Support | Confidence
{Soda} => {OJ}  | 0.4     | 0.6667
{OJ} => {Soda}  | 0.4     | 0.5
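The figures for the two rules above follow directly from the definitions of support, confidence and lift. Here is a minimal sketch in plain Python (not R) that applies those definitions to the five-customer baskets; the helper names `support`, `confidence` and `lift` are illustrative.

```python
# The five baskets from the example (one set of items per customer).
baskets = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support(itemset, baskets):
    """Proportion of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(x, y, baskets):
    """Conf(X => Y) = Supp(X U Y) / Supp(X)."""
    return support(set(x) | set(y), baskets) / support(x, baskets)


def lift(x, y, baskets):
    """Lift(X => Y) = Supp(X U Y) / (Supp(X) * Supp(Y))."""
    joint = support(set(x) | set(y), baskets)
    return joint / (support(x, baskets) * support(y, baskets))


for x, y in [({"soda"}, {"OJ"}), ({"OJ"}, {"soda"})]:
    print(
        sorted(x), "=>", sorted(y),
        "support", round(support(x | y, baskets), 4),
        "confidence", round(confidence(x, y, baskets), 4),
        "lift", round(lift(x, y, baskets), 4),
    )
```

Both rules share the same support and the same lift, because those measures are symmetric in X and Y; only confidence depends on the direction of the rule. The lift here is below 1, meaning soda and OJ appear together slightly less often than would be expected if they were bought independently.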
Slide 14
Association Rule Mining
The Groceries data set contains 1 month (30 days) of real-world point-of-sale transaction data from a typical local grocery outlet. The data set contains 9835 transactions, and the items are aggregated into 169 categories.
‘arules’ provides the infrastructure for representing, manipulating and analyzing transaction data and patterns.
‘arulesViz’ provides various visualization techniques for association rules and itemsets. This package extends package arules.
Slide 15
Association Rule Mining
Syntax: apriori(data, parameter = NULL, appearance = NULL, control = NULL)
apriori(): The apriori function is provided by the ‘arules’ package. It employs a level-wise search for frequent itemsets.
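To make "level-wise search" concrete, here is a minimal sketch of the idea in plain Python. It is not the implementation used by the arules package; it is an illustration, run on the five-customer baskets from the earlier example, of growing frequent itemsets one size at a time and pruning candidates that have an infrequent subset. The function name `frequent_itemsets` and the threshold 0.4 are assumptions made for this example.

```python
from itertools import combinations

# The five baskets from the earlier example.
baskets = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def frequent_itemsets(baskets, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(baskets)

    def supp(itemset):
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: frequent single items.
    items = sorted({item for basket in baskets for item in basket})
    level = {frozenset([i]) for i in items if supp(frozenset([i])) >= min_support}

    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = supp(itemset)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are
        # frequent, then confirm its own support against the data.
        level = {
            c
            for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
            and supp(c) >= min_support
        }
    return result


found = frequent_itemsets(baskets, min_support=0.4)
for itemset, s in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The pruning step relies on the fact that an itemset cannot be frequent unless every one of its subsets is frequent, which is what keeps the number of candidates at each level manageable.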
Slide 16
Association Rule Mining
Going through 1098 rules manually is not an efficient option.
Let us put the ‘Viz’ in arulesViz to use and visualize the rules.
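Alongside visualization, a large rule set can also be narrowed down by sorting it on a quality measure. The sketch below, in plain Python rather than arulesViz, enumerates the single-item rules for the five-customer example and ranks them by lift; the dictionary keys and the choice of showing the top three rules are assumptions made for illustration.

```python
from itertools import permutations

# The five baskets from the earlier example.
baskets = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def supp(itemset):
    """Proportion of baskets containing every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


items = sorted(set().union(*baskets))
rules = []
for x, y in permutations(items, 2):
    both = supp({x, y})
    if both == 0:
        continue  # the two items never co-occur, so there is no rule
    rules.append(
        {
            "rule": f"{{{x}}} => {{{y}}}",
            "support": both,
            "confidence": both / supp({x}),
            "lift": both / (supp({x}) * supp({y})),
        }
    )

# Rank rules by lift, highest first, and keep the top three.
top = sorted(rules, key=lambda r: r["lift"], reverse=True)[:3]
for r in top:
    print(r["rule"], round(r["support"], 2), round(r["confidence"], 2), round(r["lift"], 2))
```

On this toy data the highest-lift rules link milk and window cleaner, which co-occur in only one basket, so a high lift on its own says little without checking support as well.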
Slide 17
Association Rule Mining
Now let's plot the rules using a scatter plot.
 A scatter plot is a mathematical diagram that displays the values of two variables for a set of data.
 The data is displayed as a collection of points.
 A scatter plot is used when one variable is under the control of the experimenter.
Conclusion:
 Rules with high lift tend to have relatively low support.
 The most interesting rules lie on the support-confidence border.
Slide 18
Association Rule Mining
After mining the association rules, the support, confidence and lift values for the Groceries data are as shown below:
Slide 19
Association Rule Mining
Slide 20
Association Rule Mining
Let us zoom into the plot to observe the significant inferences:
Conclusion:
 The most interesting rules according to ‘lift’ can be seen at the top-center.
 There are 3 rules with “butter” and 1 other item in the antecedent and “whipped/sour cream” as the consequent.
Association Mining
