Predictive Analysis Major Assignment # 3
MBA 590 – Supermarket Organic Product Analysis
Final Report
Prepared by:
ARTHUR DOUCETTE
RYAN SULIER
SUJIT SRIVASTAVA
Organic Product Consumption Analysis
TABLE OF CONTENTS
EXECUTIVE SUMMARY
DATA IMPUTATION
DATA MODELING PROCESS
PREDICTIVE ANALYSIS MODELS
JMP GRAPHS FOR ANALYZING PROFITABILITY OF THE CUSTOMERS:
BASIC TREE
LEAFSIZE-50
INTERACTIVE TREE
MODEL COMPARISON
CONCLUSION
APPENDIX
APPENDIX A
APPENDIX B
APPENDIX C (I)
APPENDIX C(II)
APPENDIX C(III)
Executive Summary
The purpose of this report is to explore the given business scenario and determine how likely customers are
to purchase organic products at a supermarket, as well as to build a predictive model for
classifying customers according to their likelihood of purchasing these products.
In our analysis, we also compare the profitability of customers who purchase organic products from
the supermarket with that of customers who do not purchase these products.
Data Imputation
To make our predictions more reliable, we used the Data Partition node in SAS Miner to
divide the data into 30% test data and 70% validation data. We used the validation data as the
benchmark for evaluating the results. The supermarket data set for organic products had over 22,000
observations, with many missing values across 13 variables. We imputed the missing values
using common imputation techniques such as Tree and Mean. The summary of missing values
and how the data was imputed is given in Appendix A.
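The partition and imputation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the SAS Miner implementation: the function names, the column names, and the sample records are all hypothetical, and simple mean and most-common-value imputation stand in for the Tree and Mean methods used in the report.

```python
import random
import statistics

def partition(rows, first_share=0.7, seed=0):
    """Shuffle the rows and split them into two parts (70% / 30% by default)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * first_share)
    return shuffled[:cut], shuffled[cut:]

def impute(rows, column):
    """Fill None values: mean for numeric columns, most common value otherwise."""
    present = [row[column] for row in rows if row[column] is not None]
    if all(isinstance(v, (int, float)) for v in present):
        fill = statistics.mean(present)
    else:
        fill = statistics.mode(present)
    return [{**row, column: fill if row[column] is None else row[column]}
            for row in rows]

# Hypothetical records; the column names are illustrative, not the real data set's.
customers = [
    {"age": 34, "gender": "F"},
    {"age": None, "gender": "M"},
    {"age": 52, "gender": None},
    {"age": 46, "gender": "F"},
]

# Partition first, then impute within each part, mirroring the order of the
# Data Partition and Impute nodes described in the report.
part_70, part_30 = partition(customers * 5)
part_70 = impute(impute(part_70, "age"), "gender")
part_30 = impute(impute(part_30, "age"), "gender")
```

With 20 input rows, the two parts hold 14 and 6 rows, and neither contains a missing value after imputation.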
Data Modeling Process
This report analyzes the behavior of customers regarding the organic products of the
supermarket. It helps answer the following questions:
- How can we characterize the “profitability” of the customers who purchased organic products
vs those who didn’t purchase organic products? Do they spend similar amounts, or does
there appear to be a significant difference? Do customers who purchase organic products
spend more at your store in general than customers who don’t purchase organic products (or
vice versa)?
- Continuing along a similar path, are there any noticeable differences in the percentage of
customers who purchase organic products across the different loyalty status groups (for
example, is the percentage of platinum customers who purchase organic products higher than
the percentage of tin customers who purchase organic products)? What about the
profitability of the customers in the different loyalty groups?
- What factors seem to have the most impact on a customer’s likelihood to purchase organic
products? (include any relevant statistical output to support your answer) Based on your
model, how would you describe the “typical” organic products customer?
To better analyze the observations, we used a combination of logistic regression and
decision trees.
The analysis methods we performed are listed below:
- JMP graphs for analyzing profitability of the customers
- SAS Miner for generating the following prediction models
o Forward
o Backward
o Stepwise
o Basic Tree
o Tree with 50 leaves
o Interactive Tree
Predictive Analysis Models
We used both JMP and SAS Miner in our analysis to get the most out of the data. These
analysis methods are explained in detail below:
JMP GRAPHS FOR ANALYZING PROFITABILITY OF THE CUSTOMERS:
We used JMP Graph Builder to analyze the profitability of the customers. This analysis answers
the following question:
How can we characterize the “profitability” of the customers who purchased organic products vs those
who didn’t purchase organic products? Do they spend similar amounts, or does there appear to be a
significant difference? Do customers who purchase organic products spend more at your store in
general than customers who don’t purchase organic products (or vice versa)?
After loading the data set into JMP, we produced a box plot with Graph Builder. The result is shown
below:
The box plot above shows that the total amount customers spend does not differ a great deal whether
or not they buy organic products. This means there is very little difference in their
profitability.
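The comparison a box plot makes can also be expressed numerically. The sketch below computes the five values a box plot draws for each group. The spend figures are invented for illustration and are not taken from the supermarket data set, and `five_number_summary` is a hypothetical helper name.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical total-spend figures, invented for illustration only.
spend_organic = [1200, 3500, 4800, 6000, 9100]
spend_non_organic = [1000, 3400, 5000, 6200, 8800]

summary_organic = five_number_summary(spend_organic)
summary_non_organic = five_number_summary(spend_non_organic)
```

If the medians and quartiles of the two groups sit close together, as the report observes for the real data, the boxes overlap heavily and spending differs little between the groups.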
Using JMP Graph Builder, we can also answer the following question:
Continuing along a similar path, are there any noticeable differences in the percentage of
customers who purchase organic products across the different loyalty status groups (for
example, is the percentage of platinum customers who purchase organic products higher than
the percentage of tin customers who purchase organic products)? What about the profitability
of the customers in the different loyalty groups?
To find the profitability of the customers in the different loyalty groups, we used the Fit Y by X tool
of JMP.
Using this tool, we got the following result, which shows the categorical data broken down by
customer loyalty class:
The graph above shows that customers in the Tin class buy slightly more organic
products than those in the Silver, Platinum, or Gold classes, but the difference is not large. In effect,
there is not much difference between the classes in how many organic products they buy,
with the Tin class only a little ahead of the others. These customers may have been
taking advantage of coupons that were offered to them.
Other output from this tool is given in Appendix B.
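The percentage of organic buyers in each loyalty class, which the graph above displays, amounts to a grouped proportion. A minimal sketch follows, assuming each customer is reduced to a (loyalty class, bought organic) pair. The records are invented for illustration and do not reflect the actual data, and `purchase_rate_by_group` is a hypothetical helper name.

```python
from collections import Counter

def purchase_rate_by_group(records):
    """Map each group to the fraction of its members who bought organic products."""
    totals, buyers = Counter(), Counter()
    for group, bought in records:
        totals[group] += 1
        buyers[group] += int(bought)
    return {group: buyers[group] / totals[group] for group in totals}

# Hypothetical (loyalty class, bought organic?) pairs, invented for illustration.
records = [
    ("Tin", True), ("Tin", False), ("Tin", True), ("Tin", False),
    ("Silver", True), ("Silver", False), ("Silver", False), ("Silver", False),
    ("Gold", False), ("Gold", True),
    ("Platinum", False), ("Platinum", False),
]
rates = purchase_rate_by_group(records)
```

Comparing the resulting fractions across classes answers the same question as the categorical plot: whether one loyalty class buys organic products noticeably more often than the others.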
We used several models to analyze the following question:
What factors seem to have the most impact on a customer’s likelihood to purchase organic products?
(include any relevant statistical output to support your answer) Based on your model, how would you
describe the “typical” organic products customer?
These models are explained below:
BASIC TREE
First, we analyzed the data using a simple Basic Tree predictive model. The results
are shown below:
Other details of this basic tree are given in Appendix C (i).
LEAFSIZE-50
After trying the Basic Tree, we increased the leaf size of the tree to 50 and analyzed the data again.
The purpose of increasing the leaf size is to have a reasonable number of observations
in each terminal node, which helps the tree predict more reliably. The result is shown below:
For reference, we have included other important details such as Fit Stats, Treemap, Leaf Stats, and
Score Overlay in Appendix C (ii).
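To illustrate how a leaf-size setting shapes a tree, the sketch below searches for the best split on a single numeric input while rejecting any split that would leave too few observations in a terminal node. It assumes that leaf size means the minimum number of observations per leaf, and it uses Gini impurity as the split criterion. Both are assumptions for illustration, since the report does not state which criterion SAS Miner applied. The data and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels, min_leaf):
    """Find the threshold on one numeric input that minimizes weighted Gini,
    skipping any split that leaves fewer than min_leaf rows on either side."""
    pairs = sorted(zip(values, labels))
    start = max(1, min_leaf)
    best = None  # (weighted impurity, threshold)
    for i in range(start, len(pairs) - start + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[0]:
            best = (score, threshold)
    return best

# Hypothetical ages and purchase flags (1 = bought organic), for illustration only.
ages = [22, 25, 31, 38, 45, 52, 60, 67]
bought = [1, 1, 0, 1, 0, 0, 1, 0]

small_leaf = best_split(ages, bought, min_leaf=1)  # free to carve off a 2-row leaf
large_leaf = best_split(ages, bought, min_leaf=3)  # must keep 3+ rows per side
```

With `min_leaf=1` the search isolates the two youngest customers in their own leaf. With `min_leaf=3` that split is no longer allowed and the threshold moves to the middle of the age range, which is the kind of trade-off a larger leaf size imposes.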
INTERACTIVE TREE
Finally, we also tried an Interactive Tree, the result of which is shown below:
Other details of this interactive tree are listed in Appendix C (iii).
Model Comparison
After analyzing the results from all the predictive models, we compared them using the SAS
Model Comparison node. The resulting diagram, including the comparison, in SAS Miner is shown below:
In the figure above, the Data Partition node sets the validation data to
70% and the test data to 30%. The Impute node sits just before all the predictive model
nodes because missing values must be imputed before the modeling process starts. The StatExplore
node provides summary statistics and graphs for the data, which can be used for analyses not
involving regression. Finally, the Model Comparison node compares the models
based on misclassification rate.
After reviewing the results of the Model Comparison node in SAS, we found that the best model
in terms of fit statistics was the LeafSize-50 tree. Details are as follows:
The fit statistics from the Model Comparison node are shown below:
Looking at the graph above, it is evident that the selected model, LeafSize-50, performs better than
the other two. LeafSize-50 has the lowest misclassification rate, 0.193 (about 19%).
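The misclassification rate used for this comparison is the share of cases a model classifies incorrectly. A minimal sketch of the calculation and of selecting the model with the lowest rate follows. The labels, predictions, and model names (`tree_a`, `tree_b`, `tree_c`) are invented for illustration and are not the report's results.

```python
def misclassification_rate(actual, predicted):
    """Fraction of cases where the predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

# Hypothetical labels and predictions, invented for illustration only.
actual = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
predictions = {
    "tree_a": [1, 0, 1, 0, 0, 0, 1, 0, 1, 0],
    "tree_b": [1, 0, 0, 0, 0, 0, 1, 0, 1, 0],
    "tree_c": [0, 0, 1, 0, 0, 0, 1, 1, 0, 0],
}

rates = {name: misclassification_rate(actual, pred)
         for name, pred in predictions.items()}
best_model = min(rates, key=rates.get)  # model with the lowest rate
```

On the same scale, the report's figure of 0.193 means that roughly 19 of every 100 customers were assigned to the wrong class.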
Conclusion
Based on the LeafSize-50 model, we concluded that Age, Affluence Grade, and
Gender are the factors with the most impact on a customer’s likelihood to purchase
organic products. Looking at the tree, we see that female customers are more likely to purchase these
products.
Our recommendation to the supermarket is to focus on younger customers:
educate them about organic products and the benefits of using them, and apply
digital marketing to reach a young audience and offer incentives for buying organic
products. The supermarket should also develop a strategy for male customers of all ages.
Appendix
Appendix A
Appendix B
Appendix C (i)
Appendix C (ii)
Appendix C (iii)