SlideShare a Scribd company logo
1 of 3
Support is an indication of how frequently the items appear in the database. 
Confidence indicates the number of times the if/then statements have been found to be true. 
In data mining, association rules are useful for analyzing and predicting customer behavior. They 
play an important part in shopping basket data analysis, product clustering, catalog design and store 
layout. 
In data mining and association rule learning, lift is a measure of the performance of a targeting 
model (association rule) at predicting or classifying cases as having an enhanced response (with 
respect to the population as a whole), measured against a random choice targeting model. A 
targeting model is doing a good job if the response within the target is much better than the 
average for the population as a whole. Lift is simply the ratio of these values: target response 
divided by average response. 
For example, suppose a population has an average response rate of 5%, but a certain model (or 
rule) has identified a segment with a response rate of 20%. Then that segment would have a lift 
of 4.0 (20%/5%). 
Typically, the modeller seeks to divide the population into quantiles, and rank the quantiles by 
lift. Organizations can then consider each quantile, and by weighing the predicted response rate 
(and associated financial benefit) against the cost, they can decide whether to market to that 
quantile or not. 
Lift is analogous to information retrieval's average precision metric, if one treats the precision 
(fraction of the positives that are true positives) as the target response probability. 
The lift curve can also be considered a variation on the receiver operating characteristic (ROC) 
curve, and is also known in econometrics as the Lorenz or power curve.[1] 
The difference between the lifts observed on two different subgroups is called the uplift. The 
subtraction of two lift curves forms the uplift curve, which is a metric used in uplift modelling.[2] 
[3] 
From your book below definitions
# 3 Determining what consists of a frequent item set is related to the concept of support. The support 
of a rule is simply the number of transactions that include both the antecedent and consequent item 
sets. It is called a support because it measures the degree to which the data “ support” the validity of 
the rule. The support is sometimes expressed as a percentage of the total number of records in the 
database. For example, the support for the item set { red, white} in the phone faceplate example is 4 ( 
100 × 4 10 = 40%). What constitutes a frequent item set is therefore defined as an item set that has a 
support that exceeds a selected minimum support, determined by the user. 
It is easy to generate frequent one- item sets. All we need to do is to count, for each item, how many 
transactions in the database include the item. These transaction counts are the supports for the one - 
item sets. We drop one- item sets that have support below the desired minimum support to create a list 
of the frequent one- item sets. To generate frequent two- item sets, we use the frequent one- item sets. 
The reasoning is that if a certain one- item set did not exceed the minimum support, any larger size item 
set that includes it will not exceed the minimum support. 
Confidence = To measure the strength of association implied by a rule, we use the measures of 
confidence and lift ratio, as described below. Support and Confidence In addition to support, which we 
described earlier, there is another measure that expresses the degree of uncertainty about the if – then 
rule. This is known as the confidence1 of the rule. This measure compares the co- occurrence of the 
antecedent and consequent item sets in the database to the occurrence of the antecedent item sets. 
Confidence is defined as the ratio of the number of transactions that include all antecedent and 
consequent item sets ( namely, the support) to the number of transactions that include all the 
antecedent item sets: Confidence = no. transactions with both antecedent and consequent item sets no. 
transactions tFor example, suppose that a supermarket database has 100,000 point- of-sale 
transactions. Of these transactions, 2000 include both orange juice and ( over- the- counter) flu 
medication, and 800 of these include soup purchases. The association rule “ IF orange juice and flu 
medication are purchased THEN soup is purchased on the same trip” has a support of 800 transactions ( 
alternatively, 0.8% = 800/ 100,000) and a confidence of 40% (= 800/ 2000) . To see the relationship 
between support and confidence, let us think about what each is measuring ( estimating). One way to 
think of support is that it is the ( estimated) probability that a transaction selected randomly from the 
database will contain all items in the antecedent and the consequent: P( antecedent AND consequent). 
In comparison, the confidence is the ( estimated) conditional probability that a trans-action selected 
randomly will include all the items in the consequent given that the transaction includes all the items in 
the antecedent: 
P( antecedent AND consequent) P( antecedent) = P( consequent | antecedent). 
A high value of confidence suggests a strong association rule ( in which we are highly confident). 
However, this can be deceptive because if the antecedentecedent item set and/ or the consequent has a 
high level of support, we can have a high value for confidence even when the antecedent and 
consequent are independent! For example, if nearly all customers buy bananas and nearly all customers
buy ice cream, the confidence level will be high regardless of whether there is an association between 
the items. 
Lift Ratio A better way to judge the strength of an association rule is to compare the confi -dence of the 
rule with a benchmark value, where we assume that the occurrence of the consequent item set in a 
transaction is independent of the occurrence of the antecedent for each rule. In other words, if the 
antecedent and conse-quent item sets are independent, what confidence values would we expect to 
see? Under independence, the support would be 
P( antecedent AND consequent) = P( antecedent) × P( consequent), 
and the benchmark confidence would be 
P( antecedent) × P( consequent) P( antecedent) = P( consequent). 
The estimate of this benchmark from the data, called the benchmark confidence value for a rule, is 
computed by Benchmark confidence = no. transactions with consequent item set no. transactions in 
database . We compare the confidence to the benchmark confidence by looking at their ratio: This is 
called the lift ratio of a rule. The lift ratio is the confidence of the rule divided by the confidence, 
assuming independence of consequent from antecedent: Lift ratio = confidence benchmark confidence . 
and the benchmark confidence would be 
P( antecedent) × P( consequent) P( antecedent) = P( consequent). 
The estimate of this benchmark from the data, called the benchmark confidence value for a rule, is 
computed by Benchmark confidence = no. transactions with consequent item set no. transactions in 
database . We compare the confidence to the benchmark confidence by looking at their ratio: This is 
called the lift ratio of a rule. The lift ratio is the confidence of the rule divided by the confidence, 
assuming independence of consequent from antecedent: Lift ratio = confidence benchmark confidence .

More Related Content

What's hot

Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientistsAjay Ohri
 
Data Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionData Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionDerek Kane
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionRupak Roy
 
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning ApproachReducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning ApproachErik De Monte
 
Stock market prediction using Twitter sentiment analysis
Stock market prediction using Twitter sentiment analysisStock market prediction using Twitter sentiment analysis
Stock market prediction using Twitter sentiment analysisjournal ijrtem
 
Data Analysis and Statistics
Data Analysis and StatisticsData Analysis and Statistics
Data Analysis and StatisticsT.S. Lim
 
Types of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics IITypes of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics IIRupak Roy
 
The Use of Bitcoin for Portfolio Optimization
The Use of Bitcoin for Portfolio OptimizationThe Use of Bitcoin for Portfolio Optimization
The Use of Bitcoin for Portfolio OptimizationFederico Tenga
 
Scope and objective of the assignment
Scope and objective of the assignmentScope and objective of the assignment
Scope and objective of the assignmentGourab Chakraborty
 
An Enhanced Approach of Sensitive Information Hiding
An Enhanced Approach of Sensitive Information HidingAn Enhanced Approach of Sensitive Information Hiding
An Enhanced Approach of Sensitive Information Hidingijsrd.com
 
Google Stock Price Forecasting
Google Stock Price ForecastingGoogle Stock Price Forecasting
Google Stock Price ForecastingArkaprava Kundu
 
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROBOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROAnthony Kilili
 
Leveraging Technology and Analytics BSA Risk Assessment
Leveraging Technology and Analytics BSA Risk AssessmentLeveraging Technology and Analytics BSA Risk Assessment
Leveraging Technology and Analytics BSA Risk AssessmentErik De Monte
 
IRJET- Effecient Support Itemset Mining using Parallel Map Reducing
IRJET-  	  Effecient Support Itemset Mining using Parallel Map ReducingIRJET-  	  Effecient Support Itemset Mining using Parallel Map Reducing
IRJET- Effecient Support Itemset Mining using Parallel Map ReducingIRJET Journal
 
Presentation on the topic of association rule mining
Presentation on the topic of association rule miningPresentation on the topic of association rule mining
Presentation on the topic of association rule miningHamzaJaved64
 
Types of Statistics
Types of Statistics Types of Statistics
Types of Statistics Rupak Roy
 
Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit
Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit
Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit Rupak Roy
 
Statistics Assignments 090427
Statistics Assignments 090427Statistics Assignments 090427
Statistics Assignments 090427amykua
 

What's hot (20)

Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientists
 
Data Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionData Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model Selection
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Demand forecasting
Demand forecastingDemand forecasting
Demand forecasting
 
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning ApproachReducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
 
Stock market prediction using Twitter sentiment analysis
Stock market prediction using Twitter sentiment analysisStock market prediction using Twitter sentiment analysis
Stock market prediction using Twitter sentiment analysis
 
AI IoT data science and consumer behaviour with assortment planning , pricing...
AI IoT data science and consumer behaviour with assortment planning , pricing...AI IoT data science and consumer behaviour with assortment planning , pricing...
AI IoT data science and consumer behaviour with assortment planning , pricing...
 
Data Analysis and Statistics
Data Analysis and StatisticsData Analysis and Statistics
Data Analysis and Statistics
 
Types of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics IITypes of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics II
 
The Use of Bitcoin for Portfolio Optimization
The Use of Bitcoin for Portfolio OptimizationThe Use of Bitcoin for Portfolio Optimization
The Use of Bitcoin for Portfolio Optimization
 
Scope and objective of the assignment
Scope and objective of the assignmentScope and objective of the assignment
Scope and objective of the assignment
 
An Enhanced Approach of Sensitive Information Hiding
An Enhanced Approach of Sensitive Information HidingAn Enhanced Approach of Sensitive Information Hiding
An Enhanced Approach of Sensitive Information Hiding
 
Google Stock Price Forecasting
Google Stock Price ForecastingGoogle Stock Price Forecasting
Google Stock Price Forecasting
 
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROBOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
 
Leveraging Technology and Analytics BSA Risk Assessment
Leveraging Technology and Analytics BSA Risk AssessmentLeveraging Technology and Analytics BSA Risk Assessment
Leveraging Technology and Analytics BSA Risk Assessment
 
IRJET- Effecient Support Itemset Mining using Parallel Map Reducing
IRJET-  	  Effecient Support Itemset Mining using Parallel Map ReducingIRJET-  	  Effecient Support Itemset Mining using Parallel Map Reducing
IRJET- Effecient Support Itemset Mining using Parallel Map Reducing
 
Presentation on the topic of association rule mining
Presentation on the topic of association rule miningPresentation on the topic of association rule mining
Presentation on the topic of association rule mining
 
Types of Statistics
Types of Statistics Types of Statistics
Types of Statistics
 
Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit
Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit
Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit
 
Statistics Assignments 090427
Statistics Assignments 090427Statistics Assignments 090427
Statistics Assignments 090427
 

Viewers also liked

Trabajo de parto
Trabajo de partoTrabajo de parto
Trabajo de partomiguel pozo
 
Pola ayat dasar
Pola ayat dasarPola ayat dasar
Pola ayat dasaramranmj
 
Laranja baten historia - Historia de una naranja
Laranja baten historia - Historia de una naranjaLaranja baten historia - Historia de una naranja
Laranja baten historia - Historia de una naranjasanjoseweb
 
6LH Mugikortasuna Astea
6LH Mugikortasuna Astea6LH Mugikortasuna Astea
6LH Mugikortasuna Asteasanjoseweb
 
Arizonako olatua
Arizonako olatuaArizonako olatua
Arizonako olatuasanjoseweb
 
Argazki Lehiaketa Mugikortasuna - Concurso fotografías movilidad
Argazki Lehiaketa Mugikortasuna - Concurso fotografías movilidadArgazki Lehiaketa Mugikortasuna - Concurso fotografías movilidad
Argazki Lehiaketa Mugikortasuna - Concurso fotografías movilidadsanjoseweb
 
Time usa 14 april 2014
Time usa 14 april 2014Time usa 14 april 2014
Time usa 14 april 2014Maria Ortiz
 
Radiologia de torax
Radiologia de toraxRadiologia de torax
Radiologia de toraxDeisy Claros
 
Atmosferaren kutsadura kimikoa eta osasuna
Atmosferaren kutsadura kimikoa eta osasunaAtmosferaren kutsadura kimikoa eta osasuna
Atmosferaren kutsadura kimikoa eta osasunasanjoseweb
 
Trabajo de parto
Trabajo de partoTrabajo de parto
Trabajo de partomiguel pozo
 
Kutsadura akustikoa eta osasuna
Kutsadura akustikoa eta osasunaKutsadura akustikoa eta osasuna
Kutsadura akustikoa eta osasunasanjoseweb
 
Ipuina: zubi misteriotsuan
Ipuina: zubi misteriotsuanIpuina: zubi misteriotsuan
Ipuina: zubi misteriotsuansanjoseweb
 
vlsi design summer training ppt
vlsi design summer training pptvlsi design summer training ppt
vlsi design summer training pptBhagwan Lal Teli
 

Viewers also liked (14)

Trabajo de parto
Trabajo de partoTrabajo de parto
Trabajo de parto
 
Pola ayat dasar
Pola ayat dasarPola ayat dasar
Pola ayat dasar
 
Laranja baten historia - Historia de una naranja
Laranja baten historia - Historia de una naranjaLaranja baten historia - Historia de una naranja
Laranja baten historia - Historia de una naranja
 
6LH Mugikortasuna Astea
6LH Mugikortasuna Astea6LH Mugikortasuna Astea
6LH Mugikortasuna Astea
 
Arizonako olatua
Arizonako olatuaArizonako olatua
Arizonako olatua
 
Argazki Lehiaketa Mugikortasuna - Concurso fotografías movilidad
Argazki Lehiaketa Mugikortasuna - Concurso fotografías movilidadArgazki Lehiaketa Mugikortasuna - Concurso fotografías movilidad
Argazki Lehiaketa Mugikortasuna - Concurso fotografías movilidad
 
Time usa 14 april 2014
Time usa 14 april 2014Time usa 14 april 2014
Time usa 14 april 2014
 
Radiologia de torax
Radiologia de toraxRadiologia de torax
Radiologia de torax
 
Scientific Method Notes
Scientific Method NotesScientific Method Notes
Scientific Method Notes
 
Atmosferaren kutsadura kimikoa eta osasuna
Atmosferaren kutsadura kimikoa eta osasunaAtmosferaren kutsadura kimikoa eta osasuna
Atmosferaren kutsadura kimikoa eta osasuna
 
Trabajo de parto
Trabajo de partoTrabajo de parto
Trabajo de parto
 
Kutsadura akustikoa eta osasuna
Kutsadura akustikoa eta osasunaKutsadura akustikoa eta osasuna
Kutsadura akustikoa eta osasuna
 
Ipuina: zubi misteriotsuan
Ipuina: zubi misteriotsuanIpuina: zubi misteriotsuan
Ipuina: zubi misteriotsuan
 
vlsi design summer training ppt
vlsi design summer training pptvlsi design summer training ppt
vlsi design summer training ppt
 

Similar to Assignment #3 10.19.14

Understanding Association Rule Mining
Understanding Association Rule MiningUnderstanding Association Rule Mining
Understanding Association Rule MiningMohit Rajput
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligenceGeo Marian
 
Market Basket Analysis of bakery Shop
Market Basket Analysis of bakery ShopMarket Basket Analysis of bakery Shop
Market Basket Analysis of bakery ShopVarunSahdev2
 
Association rule mining and Apriori algorithm
Association rule mining and Apriori algorithmAssociation rule mining and Apriori algorithm
Association rule mining and Apriori algorithmhina firdaus
 
IRJET-Comparative Analysis of Apriori and Apriori with Hashing Algorithm
IRJET-Comparative Analysis of  Apriori and Apriori with Hashing AlgorithmIRJET-Comparative Analysis of  Apriori and Apriori with Hashing Algorithm
IRJET-Comparative Analysis of Apriori and Apriori with Hashing AlgorithmIRJET Journal
 
Mining Negative Association Rules
Mining Negative Association RulesMining Negative Association Rules
Mining Negative Association RulesIOSR Journals
 
IRJET- Minning Frequent Patterns,Associations and Correlations
IRJET-  	  Minning Frequent Patterns,Associations and CorrelationsIRJET-  	  Minning Frequent Patterns,Associations and Correlations
IRJET- Minning Frequent Patterns,Associations and CorrelationsIRJET Journal
 
5Association AnalysisBasic Concepts an.docx
 5Association AnalysisBasic Concepts an.docx 5Association AnalysisBasic Concepts an.docx
5Association AnalysisBasic Concepts an.docxShiraPrater50
 
Analyzing Adverse Drug Events Using Data Mining Approach
Analyzing Adverse Drug Events Using Data Mining ApproachAnalyzing Adverse Drug Events Using Data Mining Approach
Analyzing Adverse Drug Events Using Data Mining ApproachRupal7
 
Buy iso 9001
Buy iso 9001Buy iso 9001
Buy iso 9001daritajon
 
An iso 9001 certified company
An iso 9001 certified companyAn iso 9001 certified company
An iso 9001 certified companydenritafu
 
MBKM_Minggu 9_Association Rule with R Studio.pptx
MBKM_Minggu 9_Association Rule with R Studio.pptxMBKM_Minggu 9_Association Rule with R Studio.pptx
MBKM_Minggu 9_Association Rule with R Studio.pptxgina458018
 
How to get iso 9001 certification
How to get iso 9001 certificationHow to get iso 9001 certification
How to get iso 9001 certificationarikajom
 
Sabs iso 9001
Sabs iso 9001Sabs iso 9001
Sabs iso 9001jomsatgec
 
Cost of iso 9001 certification
Cost of iso 9001 certificationCost of iso 9001 certification
Cost of iso 9001 certificationjomjintra
 
Iso certification 9001
Iso certification 9001Iso certification 9001
Iso certification 9001pogerita
 

Similar to Assignment #3 10.19.14 (20)

BAS 250 Lecture 4
BAS 250 Lecture 4BAS 250 Lecture 4
BAS 250 Lecture 4
 
Understanding Association Rule Mining
Understanding Association Rule MiningUnderstanding Association Rule Mining
Understanding Association Rule Mining
 
Association rules
Association rulesAssociation rules
Association rules
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
 
Market Basket Analysis of bakery Shop
Market Basket Analysis of bakery ShopMarket Basket Analysis of bakery Shop
Market Basket Analysis of bakery Shop
 
Association rules and frequent pattern growth algorithms
Association rules and frequent pattern growth algorithmsAssociation rules and frequent pattern growth algorithms
Association rules and frequent pattern growth algorithms
 
Association rule mining and Apriori algorithm
Association rule mining and Apriori algorithmAssociation rule mining and Apriori algorithm
Association rule mining and Apriori algorithm
 
IRJET-Comparative Analysis of Apriori and Apriori with Hashing Algorithm
IRJET-Comparative Analysis of  Apriori and Apriori with Hashing AlgorithmIRJET-Comparative Analysis of  Apriori and Apriori with Hashing Algorithm
IRJET-Comparative Analysis of Apriori and Apriori with Hashing Algorithm
 
Mining Negative Association Rules
Mining Negative Association RulesMining Negative Association Rules
Mining Negative Association Rules
 
IRJET- Minning Frequent Patterns,Associations and Correlations
IRJET-  	  Minning Frequent Patterns,Associations and CorrelationsIRJET-  	  Minning Frequent Patterns,Associations and Correlations
IRJET- Minning Frequent Patterns,Associations and Correlations
 
5Association AnalysisBasic Concepts an.docx
 5Association AnalysisBasic Concepts an.docx 5Association AnalysisBasic Concepts an.docx
5Association AnalysisBasic Concepts an.docx
 
Analyzing Adverse Drug Events Using Data Mining Approach
Analyzing Adverse Drug Events Using Data Mining ApproachAnalyzing Adverse Drug Events Using Data Mining Approach
Analyzing Adverse Drug Events Using Data Mining Approach
 
Buy iso 9001
Buy iso 9001Buy iso 9001
Buy iso 9001
 
Unit 4_ML.pptx
Unit 4_ML.pptxUnit 4_ML.pptx
Unit 4_ML.pptx
 
An iso 9001 certified company
An iso 9001 certified companyAn iso 9001 certified company
An iso 9001 certified company
 
MBKM_Minggu 9_Association Rule with R Studio.pptx
MBKM_Minggu 9_Association Rule with R Studio.pptxMBKM_Minggu 9_Association Rule with R Studio.pptx
MBKM_Minggu 9_Association Rule with R Studio.pptx
 
How to get iso 9001 certification
How to get iso 9001 certificationHow to get iso 9001 certification
How to get iso 9001 certification
 
Sabs iso 9001
Sabs iso 9001Sabs iso 9001
Sabs iso 9001
 
Cost of iso 9001 certification
Cost of iso 9001 certificationCost of iso 9001 certification
Cost of iso 9001 certification
 
Iso certification 9001
Iso certification 9001Iso certification 9001
Iso certification 9001
 

Assignment #3 10.19.14

  • 1. Support is an indication of how frequently the items appear in the database. Confidence indicates the number of times the if/then statements have been found to be true. In data mining, association rules are useful for analyzing and predicting customer behavior. They play an important part in shopping basket data analysis, product clustering, catalog design and store layout. In data mining and association rule learning, lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model. A targeting model is doing a good job if the response within the target is much better than the average for the population as a whole. Lift is simply the ratio of these values: target response divided by average response. For example, suppose a population has an average response rate of 5%, but a certain model (or rule) has identified a segment with a response rate of 20%. Then that segment would have a lift of 4.0 (20%/5%). Typically, the modeller seeks to divide the population into quantiles, and rank the quantiles by lift. Organizations can then consider each quantile, and by weighing the predicted response rate (and associated financial benefit) against the cost, they can decide whether to market to that quantile or not. Lift is analogous to information retrieval's average precision metric, if one treats the precision (fraction of the positives that are true positives) as the target response probability. The lift curve can also be considered a variation on the receiver operating characteristic (ROC) curve, and is also known in econometrics as the Lorenz or power curve.[1] The difference between the lifts observed on two different subgroups is called the uplift. The subtraction of two lift curves forms the uplift curve, which is a metric used in uplift modelling.[2] [3] From your book below definitions
  • 2. # 3 Determining what consists of a frequent item set is related to the concept of support. The support of a rule is simply the number of transactions that include both the antecedent and consequent item sets. It is called a support because it measures the degree to which the data “ support” the validity of the rule. The support is sometimes expressed as a percentage of the total number of records in the database. For example, the support for the item set { red, white} in the phone faceplate example is 4 ( 100 × 4 10 = 40%). What constitutes a frequent item set is therefore defined as an item set that has a support that exceeds a selected minimum support, determined by the user. It is easy to generate frequent one- item sets. All we need to do is to count, for each item, how many transactions in the database include the item. These transaction counts are the supports for the one - item sets. We drop one- item sets that have support below the desired minimum support to create a list of the frequent one- item sets. To generate frequent two- item sets, we use the frequent one- item sets. The reasoning is that if a certain one- item set did not exceed the minimum support, any larger size item set that includes it will not exceed the minimum support. Confidence = To measure the strength of association implied by a rule, we use the measures of confidence and lift ratio, as described below. Support and Confidence In addition to support, which we described earlier, there is another measure that expresses the degree of uncertainty about the if – then rule. This is known as the confidence1 of the rule. This measure compares the co- occurrence of the antecedent and consequent item sets in the database to the occurrence of the antecedent item sets. Confidence is defined as the ratio of the number of transactions that include all antecedent and consequent item sets ( namely, the support) to the number of transactions that include all the antecedent item sets: Confidence = no. transactions with both antecedent and consequent item sets no. transactions tFor example, suppose that a supermarket database has 100,000 point- of-sale transactions. Of these transactions, 2000 include both orange juice and ( over- the- counter) flu medication, and 800 of these include soup purchases. The association rule “ IF orange juice and flu medication are purchased THEN soup is purchased on the same trip” has a support of 800 transactions ( alternatively, 0.8% = 800/ 100,000) and a confidence of 40% (= 800/ 2000) . To see the relationship between support and confidence, let us think about what each is measuring ( estimating). One way to think of support is that it is the ( estimated) probability that a transaction selected randomly from the database will contain all items in the antecedent and the consequent: P( antecedent AND consequent). In comparison, the confidence is the ( estimated) conditional probability that a trans-action selected randomly will include all the items in the consequent given that the transaction includes all the items in the antecedent: P( antecedent AND consequent) P( antecedent) = P( consequent | antecedent). A high value of confidence suggests a strong association rule ( in which we are highly confident). However, this can be deceptive because if the antecedentecedent item set and/ or the consequent has a high level of support, we can have a high value for confidence even when the antecedent and consequent are independent! For example, if nearly all customers buy bananas and nearly all customers
  • 3. buy ice cream, the confidence level will be high regardless of whether there is an association between the items. Lift Ratio A better way to judge the strength of an association rule is to compare the confi -dence of the rule with a benchmark value, where we assume that the occurrence of the consequent item set in a transaction is independent of the occurrence of the antecedent for each rule. In other words, if the antecedent and conse-quent item sets are independent, what confidence values would we expect to see? Under independence, the support would be P( antecedent AND consequent) = P( antecedent) × P( consequent), and the benchmark confidence would be P( antecedent) × P( consequent) P( antecedent) = P( consequent). The estimate of this benchmark from the data, called the benchmark confidence value for a rule, is computed by Benchmark confidence = no. transactions with consequent item set no. transactions in database . We compare the confidence to the benchmark confidence by looking at their ratio: This is called the lift ratio of a rule. The lift ratio is the confidence of the rule divided by the confidence, assuming independence of consequent from antecedent: Lift ratio = confidence benchmark confidence . and the benchmark confidence would be P( antecedent) × P( consequent) P( antecedent) = P( consequent). The estimate of this benchmark from the data, called the benchmark confidence value for a rule, is computed by Benchmark confidence = no. transactions with consequent item set no. transactions in database . We compare the confidence to the benchmark confidence by looking at their ratio: This is called the lift ratio of a rule. The lift ratio is the confidence of the rule divided by the confidence, assuming independence of consequent from antecedent: Lift ratio = confidence benchmark confidence .