Hackathon
Machine Learning
Submitted by Pro Squad
Apoorva, Deepak, Kunal & Yogesh
INDEX
1. Problem Statement
2. Challenges
3. Missing Value Treatment
4. Binning
5. Data Analysis
6. ML & Business Insights
PROBLEM STATEMENT
Problem: A mall is running a coupon campaign and wants to ensure its success using a robust prediction model built with machine learning techniques.
Context: The mall has provided historical data comprising the recommended coupons, customer details and coupon consumption details from previous years.
Relevance: The mall is going to run the campaign again and, based on the historical data on coupon effectiveness, wants to increase footfall, which will in turn increase business for the shops in the mall.
Aims & Objectives: The aim of the project is to derive business insights from the data provided and to train a machine learning model that can predict the success of the campaign with the highest possible accuracy.
CHALLENGES IN HISTORICAL DATA
• 26 features: 9 numerical and 17 categorical
• Missing values in 5 columns
• Categorical columns have multiple labels, up to a maximum of 25 labels in one column
• Categorical data has outliers and skewness
• Most of the features are correlated
MISSING VALUE TREATMENT
• Car – Only 84 of the 10147 rows have a value in this column, which is less than 1%, so we removed the column as it has no impact.
• Bar, CoffeeHouse, CarryAway, RestaurantLessThan20, Restaurant20To50 – Around 2% of the values in these columns are missing, so we filled the gaps with the most commonly occurring value (the mode) of each column.
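The fill step described above can be sketched in a few lines. This is a minimal illustration, not the team's actual code: the function names and the sample values are assumptions made up for the example, and missing entries are assumed to be represented as None.

```python
from collections import Counter

def missing_share(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def fill_with_mode(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to learn the mode from
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

# Invented sample for a visit-frequency column such as Bar.
bar_visits = ["never", "less1", None, "never", "1~3", "never", None, "less1"]
filled = fill_with_mode(bar_visits)
print(missing_share(bar_visits), filled)
```

A column such as Car, where almost every entry is missing, would be dropped rather than filled, since the mode of so few observed values says little about the rest.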
BINNING
The Occupation column has 25 labels whose frequencies vary widely, creating outliers and skewness, so we used binning to reduce the number of labels and thereby remove the outliers and skewness.
BINNING CONTD.
[Fig. 1 to Fig. 4: charts of the categorical column before and after binning]
Outliers: In Figure 1 we can see two dots; these are outliers, which we tackled with binning. Figure 2 shows the result of binning on the categorical column.
Skewness: In Figure 3 we can see that the curve is skewed to the right, which we tackled with binning and post-processing. Figure 4 shows the result of binning on the categorical column.
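The slides do not say how the Occupation labels were grouped. One common approach, shown here purely as an assumed sketch, is to merge every label below a frequency threshold into a single catch-all bin. The threshold, the "Other" label, the function name and the sample counts are all hypothetical.

```python
from collections import Counter

def bin_rare_labels(values, min_share=0.05, other_label="Other"):
    """Merge labels whose share of rows is below min_share into other_label."""
    counts = Counter(values)
    total = len(values)
    keep = {label for label, n in counts.items() if n / total >= min_share}
    return [v if v in keep else other_label for v in values]

# Invented sample: three frequent occupations and two rare ones.
occupations = (["Student"] * 40 + ["Unemployed"] * 30 + ["Retired"] * 25
               + ["Architect"] * 3 + ["Farmer"] * 2)
binned = bin_rare_labels(occupations)
print(Counter(binned))
```

Grouping by frequency keeps the well-populated labels intact while shortening the long tail of rare ones, which is the effect on outliers and skewness that the figures describe.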
DATA ANALYSIS
Success of Coupons (Historical Data)
Coupon type              Share
Coffee House             28%
Restaurant(<20)          27%
Carry out & Take away    25%
Bar                      11%
Restaurant(20-50)        9%
Coffee House, Restaurant(<20) and Carry out & Take away were the most successful coupons.
Age Vs Coupons (Historical Data)
Age     N       Y
<21     164     268
21      862     1271
26      817     1216
31      751     885
36      495     570
41      363     516
46      235     303
50+     692     739
Coupon usage is very high in the 21 to 31 and 50+ age groups. Below 21 years, fewer coupons were distributed, and hence usage is low.
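The counts in the Age Vs Coupons chart can also be read as a usage rate per age group, Y / (N + Y). The short sketch below does that arithmetic. It assumes that the first series of eight numbers is N and the second is Y, in the order of the age labels. The variable names are illustrative.

```python
age_groups = ["<21", "21", "26", "31", "36", "41", "46", "50+"]
not_used = [164, 862, 817, 751, 495, 363, 235, 692]   # N series
used = [268, 1271, 1216, 885, 570, 516, 303, 739]     # Y series

# Share of offered coupons that were used, per age group.
rates = {age: y / (n + y) for age, n, y in zip(age_groups, not_used, used)}

# Sanity check: the two series together should cover every row of the data.
total_rows = sum(not_used) + sum(used)

for age, rate in rates.items():
    print(f"{age:>4}: {rate:.1%}")
print("rows:", total_rows)
```

Looking at rates alongside raw counts separates how many coupons a group received from how often the group used them.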
DATA ANALYSIS CONTD.
Occupation Vs Coupon Success (Historical Data)
[Chart: N and Y counts by occupation, axis 0 to 1400; recoverable data labels: N 860, Y 1262]
The success rate is high in the Student, Unemployed, Computer professional and Retired categories.
Marital Status (Historical Data)
Status               Share
Single               40%
Married partner      38%
Unmarried partner    17%
Divorced             4%
Widowed              1%
DATA ANALYSIS CONTD.
Multicollinearity Chart
[Correlation heatmap of the features]
Colour Legend
• Yellow shade – correlation is 0
• Red and dark green – correlation is -1 and +1 respectively
Business Understanding
• Customer ID, Temperature, Time, Weather, Direction, Passenger and Driving Distance have very low impact.
• Age, Has Children, Marital Status, Gender and Occupation have intermediate impact.
• The restaurant-type visit rating has the highest impact.
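For readers unfamiliar with the numbers behind such a chart, the sketch below computes pairwise Pearson correlations, which range from -1 to +1 as in the colour legend. It is only an illustration: the slides do not state which correlation measure or encoding was used, and the column names and values here are invented.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Nested dict of correlations between every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Invented, already numerically encoded sample columns.
columns = {
    "age": [21, 26, 31, 36, 41, 46],
    "has_children": [0, 0, 1, 1, 1, 1],
    "temperature": [80, 55, 30, 80, 55, 30],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Categorical columns would need a numeric encoding before a measure like this applies, and the choice of encoding affects the values shown in the chart.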
MACHINE LEARNING MODEL
ML Models with their accuracy scores
ML Model 1: Logistic Regression -> Cross Validation -> Accuracy 68.97%
ML Model 2: Decision Tree -> Hyper Tuning -> Cross Validation -> Accuracy 70.95%
ML Model 3: Random Forest -> Hyper Tuning -> Cross Validation -> Accuracy 76.63%
MACHINE LEARNING
(HYPERTUNING)
Random Forest – Hyper tuning to improve accuracy
No. of Estimators: We used Randomized Search and Grid Search to find the number of estimators (trees) that gives the highest accuracy score, and then used that value in our machine learning model.
No. of Folds: We used 5 folds to create random train and test splits within the model, generating 5 accuracy scores; the average of these was taken as the model's score.
Random State: We set the random state to 80, which gave the maximum accuracy score in our model.
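The combination of a grid search with 5-fold cross-validation can be sketched without any library, as below. This is not the team's code and does not use a random forest: a tiny nearest-neighbour classifier stands in as a hypothetical model so that the example stays self-contained. The parameter `k` plays the role that the number of estimators plays in the slides. All function names and the toy data are assumptions.

```python
import random
from statistics import mean

def k_fold_indices(n_rows, n_folds=5, seed=80):
    """Shuffle row indices once and deal them into n_folds test folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train, x, k):
    """Stand-in model: majority label among the k nearest 1-D training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0

def cross_val_accuracy(rows, k, n_folds=5, seed=80):
    """Mean accuracy over n_folds train/test splits for one setting of k."""
    scores = []
    for test_idx in k_fold_indices(len(rows), n_folds, seed):
        held_out = set(test_idx)
        train = [r for i, r in enumerate(rows) if i not in held_out]
        hits = sum(knn_predict(train, rows[i][0], k) == rows[i][1] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)

def grid_search(rows, candidates, n_folds=5, seed=80):
    """Score every candidate setting; return the best one and all mean scores."""
    scores = {k: cross_val_accuracy(rows, k, n_folds, seed) for k in candidates}
    return max(scores, key=scores.get), scores

# Toy data (invented): one feature, label 1 when the feature exceeds 0.5.
rng = random.Random(0)
rows = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(100))]
best_k, scores = grid_search(rows, candidates=[1, 3, 5, 7])
print(best_k, scores)
```

Each candidate is judged by the average of its five fold scores, so a single lucky split cannot decide the winner. A randomized search follows the same pattern but samples candidates instead of trying every one.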
Business Insights
Advantages to Business
1. Coffee House, Restaurant (<20) and Take away coupons are the most successful.
2. Coupons are mostly used by the 21 to 31 and 50+ age groups.
3. Computer workers, retired people, students and the unemployed use the coupons the most.
4. Customers tend to use the coupons if the driving distance is between 5 and 15 minutes.
5. Customers tend to use the coupons mostly when the weather is sunny.
6. Carry away coupon utilization is highest among customers who use carry away 1~3 times a month.
7. Most footfall is at 7:00 AM and 6:00 PM, probably to pick up a snack.
A Mall Case Study Machine Learning
