IBM SPSS Modeler 14.2
Association rules
Apriori Algorithm
Pincer Search Algorithm
FP Growth Algorithm (Frequent Pattern)
Association Analysis
Also referred to as
• Affinity Analysis
• Market Basket Analysis (MBA)
For MBA, this basically means finding what is
being purchased together
• Association rules represent
patterns without a specific target;
in that sense the analysis is undirected, or
unsupervised, data mining
• Fits in the exploratory category of
data mining
Association Rules
• Other potential uses
• Items purchased on a credit card give insight into the next product or
service purchased
• Help determine bundles for telecoms
• Help bankers identify customers for other services
• Unusual combinations of things like insurance claims may need
further investigation
• Medical histories may give indications of complications or helpful
combinations for patients
Defining MBA
• MBA data
• Customers
• Purchases (baskets or item sets)
• Items
• Set of tables
• Purchase (Order) is the fundamental data structure
• Individual items are line items
• Product: descriptive info
• Customer info can be helpful
Levels of Data
Adapted from Berry & Linoff
MBA
• The three levels of data are important for MBA. They can be used to
answer a number of questions
• Average number of baskets/customer/time unit
• Average unique items per customer
• Average number of items per basket
• For a given product, what is the proportion of customers who have ever
purchased the product?
• For a given product, what is the average number of baskets per customer
that include the item?
• For a given product, what is the average quantity purchased in an order
when the product is purchased?
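Several of these questions reduce to simple aggregations over the order line items. The following is a minimal sketch, not part of the original slides: the `line_items` records and the `product_stats` helper are hypothetical, and "items per basket" is taken here to mean distinct items.

```python
from collections import defaultdict

# Hypothetical line items: (customer_id, order_id, item, quantity).
line_items = [
    ("c1", "o1", "milk", 1),
    ("c1", "o1", "bread", 2),
    ("c1", "o2", "milk", 2),
    ("c2", "o3", "bread", 1),
    ("c2", "o3", "soda", 6),
    ("c3", "o4", "soda", 1),
]

baskets_by_customer = defaultdict(set)  # customer -> set of order ids
items_by_basket = defaultdict(set)      # order -> set of distinct items
for customer, order, item, quantity in line_items:
    baskets_by_customer[customer].add(order)
    items_by_basket[order].add(item)

# Average number of baskets per customer (no time window applied here).
avg_baskets_per_customer = (
    sum(len(orders) for orders in baskets_by_customer.values())
    / len(baskets_by_customer)
)

# Average number of distinct items per basket.
avg_items_per_basket = (
    sum(len(items) for items in items_by_basket.values()) / len(items_by_basket)
)


def product_stats(product):
    """Share of customers who ever bought the product, and the average
    quantity per order among orders that contain it."""
    rows = [row for row in line_items if row[2] == product]
    if not rows:
        return {"share_of_customers": 0.0, "avg_qty_per_order": 0.0}
    buyers = {row[0] for row in rows}
    orders = {row[1] for row in rows}
    return {
        "share_of_customers": len(buyers) / len(baskets_by_customer),
        "avg_qty_per_order": sum(row[3] for row in rows) / len(orders),
    }


print(avg_baskets_per_customer, avg_items_per_basket, product_stats("milk"))
```

Keeping customer, order, and item identifiers on every line item is what lets the same records answer questions at all three levels.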
Item Popularity
• Most common item in one-item baskets
• Most common item in multi-item baskets
• Most common items among repeat customers
• Change in buying patterns of item over time
• Buying pattern for an item by region
• Time and geography are two of the most important
attributes of MBA data
Association Rules
• Actionable Rules
• Wal-Mart customers who purchase Barbie dolls have a 60 percent
likelihood of also purchasing one of three types of candy bars
• Trivial Rules
• Customers who purchase maintenance agreements are very likely
to purchase a large appliance
• Inexplicable Rules
• When a new hardware store opens, one of the most commonly
sold items is toilet cleaners
Adapted from Berry & Linoff
What exactly is an Association Rule?
• Of the form:
IF antecedent THEN consequent
If (orange juice, milk) Then (bread, bacon)
• Rules include measures of support and confidence
How good is an Association Rule?
• Transactions can be converted to co-occurrence matrices
• Co-occurrence tables highlight simple patterns
• Confidence and support can be directly determined from a
co-occurrence table
Co-Occurrence Table
(OJ = orange juice, WC = window cleaner, Det = detergent)
      OJ  WC  Milk  Soda  Det
OJ
WC    -
Milk  -   -
Soda  -   -   -
Det   -   -   -     -

Customer  Items
1         Orange juice, soda
2         Milk, orange juice, window cleaner
3         Orange juice, detergent
4         Orange juice, detergent, soda
5         Window cleaner, milk
Co-Occurrence Table
      OJ  WC  Milk  Soda  Det
OJ    4   1   1     2     2
WC    -   2   2     0     0
Milk  -   -   2     0     0
Soda  -   -   -     2     1
Det   -   -   -     -     2
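The table can be rebuilt mechanically from the five transactions. The sketch below is one possible way to do it in Python and is not taken from the slides; the names `co_occurrence` and `pair_count` are illustrative. Diagonal entries count how often an item appears at all, and off-diagonal entries count how often two items appear in the same basket.

```python
from collections import Counter
from itertools import combinations

# The five customer baskets from the slide, using the table's abbreviations.
transactions = [
    {"OJ", "Soda"},
    {"Milk", "OJ", "WC"},
    {"OJ", "Det"},
    {"OJ", "Det", "Soda"},
    {"WC", "Milk"},
]


def co_occurrence(transactions):
    """Count single items (stored as (item, item)) and unordered item pairs."""
    counts = Counter()
    for basket in transactions:
        for item in basket:
            counts[(item, item)] += 1
        # Sorting makes each unordered pair map to a single key.
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts


def pair_count(counts, a, b):
    """Look up a table cell regardless of the order the items are given in."""
    return counts[tuple(sorted((a, b)))]


counts = co_occurrence(transactions)
for a, b in [("OJ", "OJ"), ("OJ", "Soda"), ("WC", "Milk"), ("Soda", "Det")]:
    print(a, b, pair_count(counts, a, b))
```

Only the upper triangle is stored because the table is symmetric, which is why the lower triangle in the slide is shown as dashes.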
Confidence, Support and Lift
• Support for the rule
(# records with both antecedent and consequent) / (Total # records)
• Confidence for the rule
(# records with both antecedent and consequent) / (# records of the antecedent)
• Expected Confidence
(# records of the consequent) / (Total # records)
• Lift
Confidence / Expected Confidence
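These four definitions translate directly into code. The sketch below assumes each transaction is a Python set of item names and that antecedent and consequent are given as collections of items; the function names are illustrative and do not correspond to any SPSS Modeler API. It is applied to the rule "If milk then window cleaner" on the five baskets from the co-occurrence slide.

```python
def count_containing(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions)


def support(transactions, antecedent, consequent):
    both = set(antecedent) | set(consequent)
    return count_containing(transactions, both) / len(transactions)


def confidence(transactions, antecedent, consequent):
    both = set(antecedent) | set(consequent)
    return count_containing(transactions, both) / count_containing(
        transactions, antecedent
    )


def expected_confidence(transactions, consequent):
    return count_containing(transactions, consequent) / len(transactions)


def lift(transactions, antecedent, consequent):
    return confidence(transactions, antecedent, consequent) / expected_confidence(
        transactions, consequent
    )


transactions = [
    {"orange juice", "soda"},
    {"milk", "orange juice", "window cleaner"},
    {"orange juice", "detergent"},
    {"orange juice", "detergent", "soda"},
    {"window cleaner", "milk"},
]

rule = (["milk"], ["window cleaner"])
print(support(transactions, *rule))     # 2 of 5 baskets contain both items
print(confidence(transactions, *rule))  # both baskets with milk contain window cleaner
print(lift(transactions, *rule))        # confidence relative to the 2/5 baseline
```

A lift above 1 means the consequent appears more often alongside the antecedent than it does overall. Note that `confidence` divides by zero if the antecedent never occurs; a production version would need to handle that case.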
Confidence and Support
• Rule: If soda then orange juice
From the co-occurrence table, soda and orange juice occur together 2 times (out of 5
total transactions)
Thus, support for the rule is 2/5 or 40%
• Confidence for the rule:
Soda occurs 2 times, so the confidence of orange juice given soda is 2/2 or 100%
• Lift for the rule: Confidence / Expected Confidence
confidence = 100%; expected confidence = 4/5 = 80% (orange juice occurs in 4 of 5 transactions)
lift = 1.0/0.8 = 1.25
• Rule: If orange juice then soda
support for the rule is the same: 40%
orange juice occurs 4 times, so the confidence of soda given orange juice is 2/4 or 50%
expected confidence = 2/5 = 40% (soda occurs in 2 of 5 transactions)
lift = 0.5/0.4 = 1.25
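As a check on the arithmetic, the counts read off the co-occurrence table can be plugged straight into the definitions. This is a small sketch with an illustrative helper name, `rule_metrics`. It shows that reversing a rule changes the confidence but not the support or the lift, because lift is symmetric in antecedent and consequent.

```python
# Counts taken from the co-occurrence table in the slides.
n_total = 5  # transactions
n_oj = 4     # transactions containing orange juice
n_soda = 2   # transactions containing soda
n_both = 2   # transactions containing both


def rule_metrics(n_antecedent, n_consequent, n_both, n_total):
    """Return (support, confidence, expected confidence, lift) for one rule."""
    support = n_both / n_total
    confidence = n_both / n_antecedent
    expected = n_consequent / n_total
    return support, confidence, expected, confidence / expected


soda_to_oj = rule_metrics(n_soda, n_oj, n_both, n_total)
oj_to_soda = rule_metrics(n_oj, n_soda, n_both, n_total)

for name, metrics in [("soda -> OJ", soda_to_oj), ("OJ -> soda", oj_to_soda)]:
    support, confidence, expected, lift = metrics
    print(f"{name}: support={support:.2f} confidence={confidence:.2f} "
          f"expected={expected:.2f} lift={lift:.2f}")
```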
Building Association Rules
Adapted from Berry & Linoff
Product Hierarchies
Prepared by David Douglas, University of Arkansas
Hosted by the University of Arkansas
Association rule learning is a popular and well-researched method for
discovering interesting relations between variables in large databases.
It is intended to identify strong rules discovered in databases using
different measures of interestingness.[1] Based on the concept of strong
rules, Rakesh Agrawal et al.[2] introduced association rules for
discovering regularities between products in large-scale transaction
data recorded by point-of-sale (POS) systems in supermarkets. For
example, a rule found in the sales data of a supermarket might
indicate that if a customer buys onions and potatoes together, he or she
is likely to also buy hamburger meat. Such information can be used as
the basis for decisions about marketing activities such as
promotional pricing or product placements. In addition to the above
example from market basket analysis, association rules are employed
today in many application areas including Web usage mining, intrusion
detection, continuous production, and bioinformatics. In contrast
with sequence mining, association rule learning typically does not
consider the order of items either within a transaction or across
transactions.
