MEVSYS DATA MINING
ONE PRODUCT PER CUSTOMER
Brief outline of a project in
deployment since 2008.
2013
TO PROTECT CUSTOMER CONFIDENTIALITY SOME REFERENCES HAVE BEEN OMITTED AND/OR GENERALIZED
PERCENTAGES AND RESULTS ARE KEPT UNTOUCHED
BUSINESS UNDERSTANDING
The company established that when a client contacts customer service,
there is sometimes an opportunity to offer them a product.
With seven different product types, the question was which
product was the ideal one for each client.
CRM expressed interest in offering the product closest to each client’s
characteristics, avoiding proposals of little interest to them and building
a business relationship based on their needs.
The ideal product was therefore defined as the one the client was closest
to buying on their own, without any direct marketing stimulus.
DATA UNDERSTANDING
From the client database, the history of general, demographic, product and
transactional data was extracted. Predictive and outcome
variables were obtained from this source.
From the campaigns database, the history of direct marketing sales actions
was extracted. In line with the objective, this information
was used to exclude clients who received some buying stimulus from the
company during the analyzed period.
Data is summarized by month and becomes available around the
15th of the following month; predictive models are therefore deployed
with a one-month window.
Example: with January data, obtained in mid-February, predictions of
what will happen during March are deployed.
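The timing described above can be sketched in a few lines of Python. This is only an illustration of the one-month window: the function names are hypothetical, the year 2013 is an arbitrary choice for the example, and the 15th is used as the approximate availability date mentioned in the slide.

```python
from datetime import date


def add_months(first_of_month: date, months: int) -> date:
    """Shift a first-of-month date forward by a number of months."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def deployment_schedule(data_month: date) -> dict:
    """Map a data month to its availability date and the month being predicted.

    Assumption: data for month M arrives around the 15th of M+1,
    so the model scores behaviour in M+2 (one unused window month).
    """
    window_month = add_months(data_month, 1)
    return {
        "data_month": data_month,
        "data_available": window_month.replace(day=15),
        "predicted_month": add_months(data_month, 2),
    }


# January data, available mid-February, used to predict March.
schedule = deployment_schedule(date(2013, 1, 1))
print(schedule)
```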
2013
TO PROTECT CUSTOMER CONFIDENTIALITY SOME REFERENCES HAVE BEEN OMMITED AND/OR GENERALIZED
PERCENTAGES AND RESULTS ARE KEPT UNTOUCHED
3
DATA PREPARATION
Client tables were joined over a 5-month span: months 1, 2 and 3 for the
history of predictive variables, month 4 as the window (not used) and
month 5 to obtain the variables to predict.
Campaign tables were joined to exclude clients who received direct buying
stimuli during that period.
Years' worth of such periods of history were obtained.
The outcome variable was generated according to which product
variable registered an increase between the 3rd and 5th months.
New predictive variables were generated by summarizing existing data and
calculating new information. From an initial 180 variables, 350 were
obtained; the final model uses about 30.
The standard transformations were applied as required by
each algorithm to be tested.
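As a rough illustration of how the outcome variable could be derived, the sketch below labels one client by comparing product holdings in month 3 and month 5. The data layout, the function name and the handling of clients with zero or several increases are assumptions made for the example; the slides do not describe these details.

```python
from typing import Dict, Optional

PRODUCTS = ["PROD1", "PROD2", "PROD3", "PROD4", "PROD5", "PROD6", "PROD7"]


def outcome_for_client(
    holdings_by_month: Dict[int, Dict[str, int]],
    stimulated: bool,
) -> Optional[str]:
    """Return the product whose holding increased between months 3 and 5.

    Clients who received a direct marketing stimulus are excluded (None).
    Assumption: clients with no increase, or with more than one,
    are also left out of the training set.
    """
    if stimulated:
        return None
    before = holdings_by_month[3]
    after = holdings_by_month[5]
    increased = [p for p in PRODUCTS if after.get(p, 0) > before.get(p, 0)]
    return increased[0] if len(increased) == 1 else None


# Hypothetical client: holds PROD1 throughout, acquires PROD4 by month 5.
history = {
    3: {"PROD1": 1, "PROD4": 0},
    5: {"PROD1": 1, "PROD4": 1},
}
label = outcome_for_client(history, stimulated=False)
print(label)
```

Months 1 to 3 would feed the predictive variables and month 4 stays unused, mirroring the deployment delay.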
2013
TO PROTECT CUSTOMER CONFIDENTIALITY SOME REFERENCES HAVE BEEN OMMITED AND/OR GENERALIZED
PERCENTAGES AND RESULTS ARE KEPT UNTOUCHED
4
MODELLING
Using genetic evolution and brute-force search algorithms, thousands of
candidate models were tested, based on six basic types: support vector
machines, neural networks, decision trees, logistic regression,
discriminant analysis and naive Bayes.
# Generated by Log[com.rapidminer.datatable.SimpleDataTable]
# Leaf size   Gain     Depth   Confidence   Number of Prepuning   Performance
  765         0,2307   40      0,2728       25                    29%
  30          0,0841   14      0,0387       13                    36%
  485         0,2810   18      0,4842       28                    29%
  63          0,0566   69      0,1203       20                    41%
  624         0,0662   86      0,2823       3                     38%
  765         0,2368   69      0,2611       25                    29%
  30          0,0877   14      0,0140       13                    37%
  63          0,0704   40      0,1142       20                    39%
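A log like the one above can be ranked programmatically to pick the most promising parameter combination. The sketch below parses the eight rows shown, converting decimal commas to points. The field names are assumptions based on the column headers, which look like decision tree parameters; this is not the RapidMiner process actually used.

```python
LOG_ROWS = """\
765 0,2307 40 0,2728 25 29%
30 0,0841 14 0,0387 13 36%
485 0,2810 18 0,4842 28 29%
63 0,0566 69 0,1203 20 41%
624 0,0662 86 0,2823 3 38%
765 0,2368 69 0,2611 25 29%
30 0,0877 14 0,0140 13 37%
63 0,0704 40 0,1142 20 39%
"""


def parse_row(line: str) -> dict:
    """Parse one whitespace-separated log row with decimal commas."""
    leaf, gain, depth, confidence, prepruning, performance = line.split()
    return {
        "leaf_size": int(leaf),
        "gain": float(gain.replace(",", ".")),
        "depth": int(depth),
        "confidence": float(confidence.replace(",", ".")),
        "prepruning": int(prepruning),
        "performance": int(performance.rstrip("%")) / 100,
    }


runs = [parse_row(line) for line in LOG_ROWS.splitlines() if line.strip()]

# Pick the parameter combination with the highest logged performance.
best = max(runs, key=lambda run: run["performance"])
print(best)
```

Among these eight rows the best run is the one logged at 41%, with leaf size 63 and depth 69.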
EVALUATION
The following is a simplified confusion matrix from the final evaluation, which was
carried out in addition to the regular statistical validations performed during modelling.
It describes the model’s performance by comparing the product predicted with
the product actually bought.
With 51% performance, the exact product is predicted for about half of the clients.
                        P R E D I C T E D
BOUGHT    PROD1   PROD2   PROD3   PROD4   PROD5   PROD6   PROD7
PROD1     6.964     838     846     797     300   1.963     582
PROD2       583   3.217     298     161      56     177     127
PROD3     1.111     245   1.142     182      22     605     176
PROD4       707     214     165   2.169      74     220      59
PROD5        84     131      26      70     297      83      33
PROD6     2.491     346     585     459     304   2.469     388
PROD7       185      83     159      26      56     222     820
Buyers: 33.317 / Correct Predictions: 17.078 / Performance: 51%
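The summary line can be checked directly from the matrix: correct predictions are the diagonal cells, and buyers are the sum of all cells. The following Python sketch transcribes the matrix (rows are the product bought, columns the product predicted) and recomputes the three figures; the variable names are illustrative.

```python
PRODUCTS = ["PROD1", "PROD2", "PROD3", "PROD4", "PROD5", "PROD6", "PROD7"]

# Rows: product actually bought. Columns: product predicted.
CONFUSION = [
    [6964, 838, 846, 797, 300, 1963, 582],
    [583, 3217, 298, 161, 56, 177, 127],
    [1111, 245, 1142, 182, 22, 605, 176],
    [707, 214, 165, 2169, 74, 220, 59],
    [84, 131, 26, 70, 297, 83, 33],
    [2491, 346, 585, 459, 304, 2469, 388],
    [185, 83, 159, 26, 56, 222, 820],
]

buyers = sum(sum(row) for row in CONFUSION)
correct = sum(CONFUSION[i][i] for i in range(len(PRODUCTS)))
accuracy = correct / buyers

print(f"Buyers: {buyers} / Correct: {correct} / Performance: {accuracy:.0%}")
```

The result matches the slide: 33,317 buyers, 17,078 correct predictions and 51% performance.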
EVALUATION II: VS NO DATA MINING
The customer service operator’s system indicates the ideal product
for the client currently being served.
If the operator were to offer every client the most frequently bought
product, precision would be 37%. Against this scheme, the model’s
51% is a relative improvement of 38%, and it also allows a varied
offer tailored to each individual client.
If the operator were to use a varied offer proportional to the usual
sales distribution, precision would be just 23%. Against this scheme, the
model’s relative improvement is 122%.
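Both baselines can be reproduced from the row totals of the confusion matrix, assuming those totals represent the usual sales distribution (the slides do not state how the baselines were computed, but the numbers agree). Offering the most frequent product always hits with probability equal to its share; offering product i with probability equal to its share p_i hits with probability equal to the sum of p_i squared. The sketch below uses the rounded percentages from the slide for the relative improvements.

```python
# Rows: product actually bought. Columns: product predicted.
CONFUSION = [
    [6964, 838, 846, 797, 300, 1963, 582],
    [583, 3217, 298, 161, 56, 177, 127],
    [1111, 245, 1142, 182, 22, 605, 176],
    [707, 214, 165, 2169, 74, 220, 59],
    [84, 131, 26, 70, 297, 83, 33],
    [2491, 346, 585, 459, 304, 2469, 388],
    [185, 83, 159, 26, 56, 222, 820],
]

bought_totals = [sum(row) for row in CONFUSION]
buyers = sum(bought_totals)
shares = [total / buyers for total in bought_totals]

# Baseline 1: always offer the most frequently bought product.
majority_baseline = max(shares)

# Baseline 2: offer product i with probability p_i (its sales share);
# the expected hit rate is the sum of p_i squared.
proportional_baseline = sum(p * p for p in shares)


def relative_improvement(model_pct: float, baseline_pct: float) -> float:
    """Relative gain of the model over a baseline, as a fraction."""
    return model_pct / baseline_pct - 1


# Rounded percentages as reported in the slide.
gain_vs_majority = relative_improvement(51, 37)
gain_vs_proportional = relative_improvement(51, 23)

print(f"{majority_baseline:.0%}, {proportional_baseline:.0%}")
print(f"{gain_vs_majority:.0%}, {gain_vs_proportional:.0%}")
```

Working from unrounded rates would give slightly higher relative improvements than the reported 38% and 122%, since those figures are derived from the rounded 51%, 37% and 23%.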
MEVSYS.COM
INFO@MEVSYS.COM