Alessandro Leona – http://www.linkedin.com/in/alessandroleona
IMPROVING CUSTOMER INSIGHT
THROUGH PREDICTION MODELS
EXECUTIVE SUMMARY
How can you tell whether one of your clients is about to leave you in the near future or, on the contrary, is about to upgrade their offer? Which client segments are more prone to churning, and which are the best targets for outbound calls promoting a new product or service?
Companies strive to get insights from market research, focus groups and the like, forgetting that most of the answers already reside in their own hands, in the tons of information contained in their databases, which it is now fashionable to call “Big Data”.
This presentation shows how a prediction model can be used to:
- identify patterns within your customer databases
- express these patterns in the form of an equation to be applied across the whole database
- sort your database in order to group similar clients into clusters
- take actions targeted at relevant segments
…without being a statistics guru or an IT expert
…without investing millions in expensive software
…in a very short time
The objective of prediction models is to spot similar behaviours within your customer base
Client problem:
“I have a customer base of a million clients and I lost 2% of them in the last month. How can I spot who could be the next churner and take preventive action?”
“I just launched a new product trial in a region and it was a success. How can I select the roll-out strategy?”
“I need to revamp sales of a product I sold last year. Who should the new campaign target?”
ID    Age  Gender  Tenure  Usage (min)  CC calls  Churner
1     26   M       5,6     811          0
2     36   M       9,6     124          0
3     41   F       0,4     137          0
4     48   F       1,1     635          1
5     55   M       5,0     655          0
6     34   F       4,9     500          0
7     22   M       9,4     63           0         1
8     28   M       5,2     849          0
9     54   M       7,7     577          0
10    23   F       3,8     13           0
11    28   M       8,6     286          0
12    33   F       1,7     407          0
13    52   M       6,6     353          0
14    30   F       2,7     859          0
15    33   M       5,5     544          0
16    36   F       8,9     211          1
17    20   M       8,7     243          1
18    39   M       6,2     520          0
19    27   F       0,8     663          1
20    35   F       0,5     937          0
145   25   F       1,6     679          0
146   48   F       1,4     329          0         1
147   29   F       9,3     918          0
148   50   M       9,2     270          1
149   52   M       7,3     741          0
150   23   M       3,8     442          0
151   26   F       6,4     263          0
152   60   F       7,2     14           0
153   23   M       0,7     20           0
566   65   F       8,8     797          2
567   33   M       8,0     798          0         1
568   65   F       9,8     412          0
569   67   F       5,7     561          0
1343  48   M       8,2     52           0
1344  26   M       6,0     834          1         1
1345  49   F       9,2     664          2
1346  63   F       1,7     197          2
1347  35   M       3,3     100          0
A predictive model helps to understand whether there is any correlation between a certain behaviour (flagged in yellow) and a set of variables related to the client:
- Demographics: age, gender, occupation, address, family composition, …
- History as a client: past purchases, revenue, product portfolio, …
- Channel interactions: visits per store, complaints, time spent on the website, …
- …
A prediction model helps identify clusters in order to take targeted actions
ID Age Gender Tenure Usage (min) CC calls Churner Prospect churner
1 39 M 0,1 222 0 1
2 35 F 2,4 581 0 1
3 30 F 1,8 399 0 1
4 33 M 8,1 536 1 1
5 21 F 5,4 423 0
6 47 M 1,1 187 0
7 29 F 4,8 172 0 1
8 33 F 3,7 946 0
9 55 M 7,9 692 0 1
10 49 M 2,7 309 0
11 44 M 9,2 931 0 1
12 28 F 5,4 334 0
13 43 M 7,4 838 0
14 44 M 4,4 485 0 1
15 29 F 5,0 850 0 1
16 32 F 2,4 640 1
17 21 F 9,3 285 1 1
18 26 F 6,3 336 0 1
19 46 M 6,0 415 1
20 57 F 1,4 890 0 1
21 38 M 6,6 61 0
22 55 F 9,6 806 0 1
23 25 M 7,5 792 0
24 24 M 9,6 763 1 1
25 23 M 10,0 738 0
26 28 F 3,1 455 0
27 21 M 4,2 151 0 1
28 38 M 5,1 275 0
29 55 M 3,2 494 0
30 20 M 3,5 577 2
31 44 F 3,6 343 0 1
32 49 M 6,2 808 0
33 50 F 7,4 532 0
34 24 F 0,5 922 0
35 44 M 9,5 8 1 1
36 56 M 9,1 478 2
37 56 F 4,3 998 2
38 38 F 3,1 840 0 1
39 41 F 7,4 936 0
40 51 M 3,5 440 0 1
41 55 F 7,6 640 0
42 31 M 1,2 702 0 1
43 23 F 9,6 341 0
44 49 M 8,0 719 0
45 44 F 9,4 707 0
46 34 F 8,4 243 0 1
47 26 F 4,9 718 0
48 40 M 8,1 423 0
49 53 F 5,4 664 0
50 41 F 1,5 63 0 1
51 39 F 9,9 787 0
52 49 F 8,4 316 0
53 29 F 9,2 190 0
54 55 F 2,5 113 0
55 43 F 1,5 690 0 1
56 34 M 4,7 643 0
57 56 F 2,7 946 0
58 33 F 7,1 628 0
59 46 M 0,6 551 0
60 59 M 1,8 775 0
61 35 M 7,6 990 0
62 58 F 6,7 115 0
63 21 F 5,0 192 0
64 43 F 5,4 791 0
65 41 F 4,0 158 0
66 57 M 4,6 2 0
67 34 F 2,3 691 0
68 49 F 8,3 642 0
69 32 F 9,2 920 0 1
70 24 F 9,0 475 0
71 50 M 2,7 164 0
72 24 F 7,2 6 0
73 58 F 6,8 114 0
74 24 F 6,8 69 0
75 55 M 7,1 909 0
76 29 M 4,5 306 0
77 35 M 7,3 368 0
78 26 M 5,4 224 0
79 46 M 9,0 547 0
80 22 M 6,7 163 0
81 51 F 2,9 432 0
82 53 M 7,1 464 0
83 36 F 1,8 375 0
84 58 M 8,4 289 0
85 24 F 4,5 796 0
86 34 M 4,4 130 0
STEPS
1) Build the DB: gather relevant information from your sources (accounting, marketing, sales channels, loyalty, CRM, business intelligence, …).
2) Select significant variables: through univariate analysis, the relevant variables are selected and the others are discarded (based on confidence, R², etc.).
3) Group variables: bivariate analysis finds cross-correlations (e.g. age and occupation, gender and tenure) and groups variables by significance.
4) Build the predictive model: log-linear regression defines a predictive function F(client)=a1*var1*a2*var2*… to be applied to the entire DB.
5) Sort your DB according to the prediction: based on the predictive function, the database is sorted in order to spot the flagged clients (churners, previous buyers of a product, etc.) and their lookalikes.
Table legend: real churners, potential churners
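One common way to read the predictive function, offered here as an assumption and not as the author's exact specification, is that the logarithm of the score is linear in the variables, so the score itself is a product of factors. The symbols a_0 (base constant), a_i (coefficient of class i) and x_i (1 if the client falls in class i, 0 otherwise) are notation introduced for this sketch:

```latex
\log F(\text{client}) = \log a_0 + \sum_{i} x_i \log a_i
\quad\Longleftrightarrow\quad
F(\text{client}) = a_0 \prod_{i} a_i^{\,x_i},
\qquad x_i \in \{0, 1\}
```

Under this reading a coefficient equal to 1 leaves the score unchanged, a coefficient above 1 raises it and a coefficient below 1 lowers it.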
The first step is to collect information from different sources and create an offline database
Sources feeding the work DB: channels, surveys, campaigns, past churners, external market info, CRM (complaints?), usage, …
1) Collect information from your departments
2) Define the target of your research, for example:
- Who will leave in the near future?
- Who will buy more of this product?
- Who will be a good target for the new campaign?
- …
3) Don’t be afraid to add variables: the prediction model doesn’t know the meaning of a field. For the same reason, do not exclude variables “a priori” based on your feelings
4) Clearly flag the clients whose behaviour is well known, e.g. previous churners, previous buyers of a certain product
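The collection step can be sketched in a few lines of Python. This is only an illustration: the source names, field names, client IDs and the `build_work_db` helper are all hypothetical, and a real work DB would normally come from database exports, not literals.

```python
# Hypothetical extracts from two departments, keyed by client ID.
crm = {101: {"age": 26, "gender": "M"}, 102: {"age": 36, "gender": "M"}, 103: {"age": 48, "gender": "F"}}
usage = {101: {"usage_min": 811}, 102: {"usage_min": 124}, 103: {"usage_min": 635}}
past_churners = {103}  # clients whose behaviour is already known


def build_work_db(sources, flagged_ids, flag_name):
    """Merge per-department records by client ID and add a 0/1 behaviour flag."""
    work_db = {}
    for source in sources:
        for client_id, fields in source.items():
            # Every field is kept: no variable is excluded "a priori".
            work_db.setdefault(client_id, {}).update(fields)
    for client_id, record in work_db.items():
        record[flag_name] = 1 if client_id in flagged_ids else 0
    return work_db


work_db = build_work_db([crm, usage], past_churners, "churner")
```

Each client ends up with one flat record that carries every available field plus the flag the later steps try to explain.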
The second step is to select relevant variables
EXAMPLES
[Chart: number of customers and average voluntary deactivation rate by number of calls per year to the call center. Categories: no calls; other calls (no complaint calls); 1, 2 and >2 complaint calls. Plotted rates: 17,6%, 20,4%, 22,0%, 25,5%, 15,0%; average 17%.]
Customers with a high number of calls to the call center show high churn
[Chart: average voluntary deactivation rate by percentage of bundle usage. Categories: 0%, 0%-20%, 21%-50%, 51%-80%, 81%-120%, >120%. Plotted rates: 28,0%, 20,2%, 16,4%, 16,0%, 14,5%, 56,6%; average 17%. Annotations: very low consumption of the bundle; high/full exploitation of the bundle.]
Customers using bundle offers have lower churn
Univariate analysis reveals some trends
Variables with low correlation are discarded from the model
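A minimal sketch of a univariate screen, under stated assumptions: it compares the flagged rate of each class with the overall average and keeps a variable only if some class deviates enough. The slides mention confidence and R2 as criteria; the simpler deviation measure, the threshold, the field names and the sample data here are all illustrative.

```python
from collections import defaultdict


def rate_by_class(clients, variable, flag):
    """Return {class value: share of flagged clients} for one variable."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for client in clients:
        totals[client[variable]] += 1
        flagged[client[variable]] += client[flag]
    return {value: flagged[value] / totals[value] for value in totals}


def max_deviation(clients, variable, flag):
    """Largest absolute gap between a class rate and the overall average rate."""
    average = sum(c[flag] for c in clients) / len(clients)
    rates = rate_by_class(clients, variable, flag)
    return max(abs(rate - average) for rate in rates.values())


# Hypothetical sample: complaint_calls separates churners, gender does not.
clients = [
    {"complaint_calls": 0, "gender": "M", "churner": 0},
    {"complaint_calls": 0, "gender": "F", "churner": 0},
    {"complaint_calls": 0, "gender": "M", "churner": 0},
    {"complaint_calls": 0, "gender": "F", "churner": 0},
    {"complaint_calls": 2, "gender": "M", "churner": 1},
    {"complaint_calls": 2, "gender": "F", "churner": 1},
    {"complaint_calls": 2, "gender": "M", "churner": 0},
    {"complaint_calls": 2, "gender": "F", "churner": 0},
]
THRESHOLD = 0.10  # illustrative cut-off, not taken from the slides
kept = [v for v in ("complaint_calls", "gender")
        if max_deviation(clients, v, "churner") >= THRESHOLD]
```

With this sample only `complaint_calls` survives the screen, because both gender classes churn at exactly the average rate.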
The next step is to group variables using bivariate analysis
EXAMPLES
[Chart: buyers of the new product vs. age class and tenure of contract. Age classes: <21, 21-30, 30-50, 50-60, 60-80, >80. Tenure classes: <1 year, 1 year, 2-4 years, 5-10 years, >10 years. Vertical axis: 0,00% to 1,60%. Average share of buyers within the customer base: 0,550%.]
Higher frequency of prospects among middle-aged customers acquired 2-4 years ago
Every pair of relevant variables is tested and sorted according to significance
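The pairwise test can be illustrated with a small cross-tabulation. This is a sketch only: the helper name, field names and records are hypothetical, and a real analysis would also check that each cell holds enough clients to be reliable.

```python
from collections import defaultdict


def buyer_rate_by_pair(clients, var_a, var_b, flag):
    """Return {(class of var_a, class of var_b): share of flagged clients}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for client in clients:
        key = (client[var_a], client[var_b])
        totals[key] += 1
        flagged[key] += client[flag]
    return {key: flagged[key] / totals[key] for key in totals}


# Hypothetical records, already bucketed into classes.
clients = [
    {"age_class": "30-50", "tenure_class": "2-4 years", "buyer": 1},
    {"age_class": "30-50", "tenure_class": "2-4 years", "buyer": 1},
    {"age_class": "30-50", "tenure_class": "2-4 years", "buyer": 0},
    {"age_class": "30-50", "tenure_class": "< 1 year", "buyer": 0},
    {"age_class": "21-30", "tenure_class": "2-4 years", "buyer": 0},
    {"age_class": "21-30", "tenure_class": "< 1 year", "buyer": 0},
]
rates = buyer_rate_by_pair(clients, "age_class", "tenure_class", "buyer")
best_cell = max(rates, key=rates.get)  # combination with the highest buyer rate
```

Cells whose rate sits well above the base average are candidates for a grouped variable in the model.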
A real case example in cross-selling
Automatic toll collection player
• +40% insurance sales after the campaign launch
• +30% email opens in the profiled cluster vs. the random cluster
• 65% of the people in the profiled cluster called by outbound were interested in the product (vs. less than 30% in the random cluster)
• Competence, skills and instruments were transferred to the client through the client team member who worked with us full time
• ... with an IT investment of 2.000 EUR for 2 SAS licenses (now, with software as a service, it is no longer required)
Context
• Telepass: automatic toll collection client (5 million clients)
• Premium program launched together with insurers, travel agencies and fuel retailers (cross-loyalty program)
• Willingness to relaunch the insurance product
• Generic DB with info on age, address, km of highway per year, open rate of email campaigns, ...
Activities performed
• DB preparation correlating internal and external sources (demographics)
• Scoring model using clients who had already purchased an insurance policy
• Launch of a commercial multi-channel campaign on two client clusters, one random and one profiled (to compare results)
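The profiled-versus-random comparison can be set up as follows. This is a sketch under assumptions: the function name, the scores and the cluster size are hypothetical, and drawing the random cluster from the clients left outside the profiled one is a design choice made here so that nobody is contacted twice, not something the slides specify.

```python
import random


def split_campaign_clusters(scores, cluster_size, seed=0):
    """Return (profiled, random_control): top-scored IDs and a random sample of the rest."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    profiled = ranked[:cluster_size]
    remaining = ranked[cluster_size:]
    rng = random.Random(seed)  # fixed seed so the control sample is reproducible
    random_control = rng.sample(remaining, cluster_size)
    return profiled, random_control


# Hypothetical model scores per client ID (0 to 7).
scores = dict(enumerate([0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.05, 0.4]))
profiled, random_control = split_campaign_clusters(scores, cluster_size=2)
```

Running the same campaign on both lists and comparing response rates gives the kind of profiled-versus-random figures quoted above.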
Log-linear regression determines the predictive function
EXAMPLES
Insurance buying interest probability = 0.00013413 x Km x Twin x Tenure x Premium x #cars x Age x #apparatus x Urban x Camp1 x Camp2
Each variable contributes one coefficient, taken from a vector according to the class the client falls in: 1 = indifferent, <1 = inversely correlated, >1 = proportionally correlated.
[Figure: coefficient vectors for each variable. Most entries equal 1; the other values range from 0.58 to 14.9.]
After grouping the variables and transforming cluster variables into vectors, log-linear regression helps express the correlations found.
The prospect buyer is (probably):
- Somebody who drives very few km on the motorway
- Somebody acquired 3 years ago
- Somebody who has a premium account
- Somebody aged 50-70
- Somebody not living in an urban area
- …
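A multiplicative scoring function of this kind might look like the sketch below. Only the base constant 0.00013413 comes from the slide; the variable names, class labels and coefficient values are hypothetical placeholders chosen to mirror the profile described above.

```python
import math

BASE = 0.00013413  # constant taken from the slide

# Hypothetical coefficient vectors: one coefficient per class of each variable.
COEFFICIENTS = {
    "km_class": {"low": 2.0, "medium": 1.0, "high": 0.5},
    "premium": {"yes": 1.5, "no": 1.0},
    "urban": {"yes": 0.8, "no": 1.0},
}


def score(client):
    """Multiply the base constant by the coefficient of each class the client falls in."""
    result = BASE
    for variable, vector in COEFFICIENTS.items():
        result *= vector[client[variable]]
    return result


def log_score(client):
    """Same model in log space, where the product becomes a sum."""
    return math.log(BASE) + sum(
        math.log(vector[client[variable]]) for variable, vector in COEFFICIENTS.items()
    )


prospect = {"km_class": "low", "premium": "yes", "urban": "no"}
neutral = {"km_class": "medium", "premium": "no", "urban": "no"}
```

A client whose classes all carry coefficient 1 scores exactly the base constant, while the prospect profile scores three times higher with these placeholder values.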
LIFT CURVE: PROSPECTS VS. RANDOM FUNCTION
Share of DB, from top to bottom:   0%  10%    20%    30%    40%    50%    60%    70%    80%    90%    100%
Prospects found (predicted sort):  0%  66,4%  82,5%  89,3%  92,5%  94,3%  95,0%  96,5%  97,6%  98,9%  100,0%
To verify the model's accuracy, a lift curve shows how many buyers and prospect buyers are progressively found in the database once it is sorted according to the prediction function.
A parallel database is sorted according to a uniform random function. The same database is used to extract control samples and verify the effectiveness of the campaign.
Legend: Predicted, Random
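A curve like this one can be computed from nothing more than scores and flags. This is a minimal sketch with hypothetical data: the function name is illustrative, and the random baseline is taken as the diagonal, which is what a uniform random sort yields on average.

```python
def cumulative_gains(scores, flags, steps=10):
    """Share of flagged clients found in the top k/steps of the score-sorted DB."""
    ranked = sorted(zip(scores, flags), key=lambda pair: pair[0], reverse=True)
    total_flagged = sum(flag for _, flag in ranked)  # must be > 0
    curve = []
    for k in range(steps + 1):
        cutoff = round(len(ranked) * k / steps)
        found = sum(flag for _, flag in ranked[:cutoff])
        curve.append(found / total_flagged)
    return curve


# Hypothetical scores and 0/1 buyer flags for ten clients.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
flags = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
curve = cumulative_gains(scores, flags, steps=5)
random_baseline = [k / 5 for k in range(6)]  # expected curve of a random sort
```

The further the predicted curve sits above the diagonal in the first few steps, the more prospects a campaign reaches by contacting only the top of the DB.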
There are many tools available to build a prediction model
The choice very much depends on:
- how familiar you are with programming
- whether you want to work in the cloud or on local databases, with all the related privacy/security issues
- how frequently you want to update your model
- how much you are willing to spend and whether you want to integrate the prediction suite into your existing IT systems
- how fast vs. how accurate the model needs to be
The prediction factor can be combined with a proxy of client value to derive priorities
“Client value @ risk” valuation helps prevent value loss
[Matrix: churn probability (low to high) vs. ARPU / LTV (low to high). The high-probability, high-value quadrant holds the high client value @ risk.]
[Chart: cumulative value of client loss in case of churn (MEUR/month) vs. cumulative number of clients, both from 0% to 100%. Labelled points: 38%, 56%, 68%, 76%, 83%, 88%, 91%, 94%, 96%.]
10% of clients represent more than 1/3 of client value at risk: the first segment to address
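One simple way to combine the two factors, shown here as an assumption and not as the author's formula, is to multiply churn probability by a monthly value proxy and rank clients on the product. The field names and figures are hypothetical; `monthly_value` stands in for ARPU.

```python
def value_at_risk_ranking(clients):
    """Sort clients by churn probability x monthly value, highest first."""
    return sorted(
        clients,
        key=lambda c: c["churn_probability"] * c["monthly_value"],
        reverse=True,
    )


def cumulative_share(ranked):
    """Cumulative share of total value at risk covered by the top-k clients."""
    values = [c["churn_probability"] * c["monthly_value"] for c in ranked]
    total = sum(values)
    running, shares = 0.0, []
    for value in values:
        running += value
        shares.append(running / total)
    return shares


# Hypothetical clients.
clients = [
    {"id": "C", "churn_probability": 0.5, "monthly_value": 20.0},
    {"id": "A", "churn_probability": 0.5, "monthly_value": 100.0},
    {"id": "D", "churn_probability": 0.125, "monthly_value": 64.0},
    {"id": "B", "churn_probability": 0.25, "monthly_value": 48.0},
]
ranked = value_at_risk_ranking(clients)
shares = cumulative_share(ranked)
```

Reading `shares` from the top shows how few clients account for most of the value at risk, which is the reasoning behind addressing the top segment first.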
Lessons learnt
Statistical validity of prediction model
• Consider all possible variables, do not exclude any “a priori”
• Group variables into bigger categories to obtain more reliable samples
• Measure the validity of your model by comparing actions against a random list
Other elements
• Keep the number of segments low (keep it simple)
• Always verify through external interviews (outbound) the level of interest in retention actions or new proposals
• Build the statistical analysis within the company; do not rely fully on the external consultant or the IT integrator: when they leave, you have to be able to replicate the prediction
• Update the prediction model periodically
Database preparation
• It is wise to launch a survey on a sample of churning clients to identify causes (some keywords can be isolated and later used in the call center as warning signals of possible churn)
• Churners must be separated from deliberate churners (“grasshoppers” switching at the end of the contract period) and from internally driven churners (bad debt)
• The analysis must be done in a “quiet” period, far from campaigns / offer changes
