Data Insight Leaders Summit Barcelona 2017

Case Study Interactive: How To Work With Structured And Unstructured
Data To Increase Customer Acquisition And Reduce Churn With Relevant
Communication

Harvinder Atwal
{“about” : “me”}
MoneySuperMarket.com
• Head of Data Strategy and Advanced Analytics
dunnhumby
• previous : Insight Director, Tesco Clubcard
Lloyds Banking Group
• previous : Senior Manager, Customer Strategy and Insight
British Airways
• previous : Senior Operational Research Analyst
Web
@harvindersatwal
@gmail.com

£1.8B savings: 2016 estimate of total UK savings
1993: we started life as Mortgages 2000
22M: adults choose to share their data with us
6M (MSM) and 14M (MSE): average monthly users, 2016
£316M: revenue, 2016
980: providers

What you wanted to know
• How can analytics improve your attribution model accuracy to highlight and transform your most successful marketing channels?
• How can you introduce predictive analytics to increase your customer segmentation competency?
• How can insights from consumer data help you to predict customer lifetime value and focus on your top customers?
• How can split testing consumer data help to improve your customer offering and boost retention rates?

Warning: A data-driven customer
focussed strategy will not paper
over cracks in operational
performance or product deficiency

Unstructured
data can give you
important insight
to prioritise

Modelling
https://notebooks.azure.com/latitude51north/libraries/data-insight-leader-summit

Get
Profitably acquire customers (and acquire profitable customers)

[Figure: a customer journey over time across marketing channels: Display Ad, Video Ad, Social, Email, Search, Website, Affiliates, Physical Store, TV/Radio/Press and Outdoor. Values shown: $88, $132.]

[Figure: a customer journey over time across Display Ad, Video Ad, Physical Store, TV/Radio/Press and Outdoor. Values shown: $88, $132.]

“Last click” is still the most
common approach to attribution

• Last View: assumes only the “last viewed” advert, email or click counts; no earlier activities are given a share of the credit.
• Linear or Fair Share: not all interactions are equally valuable. Not all activity can easily be counted, e.g. offline.
• First Click: assumes only the first activity counts; no later activities are given any credit.
• Linear or Weighted Share: weightings can be arbitrary and need to be constantly updated.
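As a rough illustration of how the simplest of these rules split credit, here is a minimal sketch. The journey, the sale value of 100 and the function name `attribute` are all made up for illustration; they are not from the talk.

```python
from collections import defaultdict


def attribute(path, value, model):
    """Split `value` across the channels in `path` under a heuristic rule."""
    credit = defaultdict(float)
    if model == "last":
        # Last touch takes all the credit.
        credit[path[-1]] += value
    elif model == "first":
        # First touch takes all the credit.
        credit[path[0]] += value
    elif model == "linear":
        # Every touch gets an equal (fair) share.
        share = value / len(path)
        for channel in path:
            credit[channel] += share
    else:
        raise ValueError(f"unknown model: {model}")
    return dict(credit)


# Hypothetical journey that ends in a sale worth 100 (illustrative only).
path = ["Display Ad", "Social", "Email", "Search"]
for model in ("last", "first", "linear"):
    print(model, attribute(path, 100.0, model))
```

The same journey gives three very different pictures of channel performance, which is why the choice of rule matters.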

• Time decay: assumes the most recently viewed advert, email or click counts more; earlier activities are given less share of the credit.
• Frequency and Recency: ignores the customer path.
• Markov Chain Model or Bayesian Networks: requires comprehensive tracking, is fooled by correlations and doesn’t take into account brand equity.
• Positional or U model: weightings can be arbitrary and need to be constantly updated.
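The time decay and positional (U) rules can be sketched in a few lines. This is an illustration under assumed parameters: the half-life of 2 touches and the 40% share for each end of the journey are arbitrary choices, which is exactly the "arbitrary weightings" problem noted above.

```python
def time_decay(path, value, half_life=2.0):
    """Weight each touch by 0.5 ** (touches before conversion / half_life)."""
    n = len(path)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    total = sum(weights)
    return [(channel, value * w / total) for channel, w in zip(path, weights)]


def u_shaped(path, value, end_share=0.4):
    """Give `end_share` to the first and last touch, split the rest evenly."""
    n = len(path)
    if n == 1:
        return [(path[0], value)]
    if n == 2:
        return [(path[0], value / 2), (path[1], value / 2)]
    middle = value * (1 - 2 * end_share) / (n - 2)
    shares = [value * end_share] + [middle] * (n - 2) + [value * end_share]
    return list(zip(path, shares))


# Hypothetical journey that ends in a sale worth 100 (illustrative only).
path = ["Display Ad", "Social", "Email", "Search"]
decayed = time_decay(path, 100.0)
positional = u_shaped(path, 100.0)
print(decayed)
print(positional)
```

Both functions return (channel, credit) pairs rather than a dictionary so that a channel appearing twice in a journey keeps separate entries.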

Determine possible influences
sequentially and gather data

[Figure: a customer journey over time across Display Ad, Video Ad, Physical Store, TV/Radio/Press and Outdoor. Values shown: $88, $132.]
Measure the direct effect
What is the impact of Outdoor advertising on sales?

[Figure: a customer journey over time across Display Ad, Video Ad, Physical Store, TV/Radio/Press and Outdoor. Values shown: $88, $132.]
Measure the indirect effects too!
Use nested models

There are many econometric techniques to measure outcomes
• Regression Discontinuity Design
• Controlled Regression
• Fixed Effects Regression
• Difference-in-Differences
• Instrumental Variables

Google’s CausalImpact package is great for difference-in-differences style analysis
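To make the difference-in-differences idea concrete, here is a minimal sketch of the basic two-group, two-period estimate. It is not the CausalImpact package, which fits a Bayesian structural time-series model; the weekly sales figures are invented, and the estimate is only meaningful if both regions would have followed parallel trends without the campaign.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group's mean minus change in the control's."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly sales (illustrative only).
# Treated region: saw the Outdoor campaign. Control region: did not.
treated_pre = [100, 104, 98, 102]
treated_post = [120, 118, 124, 122]
control_pre = [90, 92, 88, 90]
control_post = [96, 94, 98, 96]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

Subtracting the control region's change removes whatever moved sales in both regions at once, such as seasonality.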

[Figure: a customer journey over time across Display Ad, Video Ad, Physical Store, TV/Radio/Press and Outdoor. Values shown: $88, $132.]
Repeatedly iterate and model. You can then apply weightings
What is the impact of TV/Display, Video, Social… spend on sales?

What are all the ways you
could communicate to a
stadium full of customers?

What if you could walk
up to ANYONE in the
stadium and have a
conversation knowing
their individual needs
and preferences?

It’s 2017, nobody
should be asked how
they want to be treated

Predictive modelling can help you
treat different customers differently

Think beyond
products, demographics
and loyalty

Think actionable needs, preferences
and states
Borrowing
Saving
Risk averse
Price-sensitive
Brand conscious
Financially Cautious
Financially confident
Time poor

Exercise:
What are some of the
actionable Customer
needs, preferences and
states for your
organisations?

Time of day responsiveness
Day of week responsiveness
Device preference
Marketing channel preference
Offer responsiveness
Help preferences
Social proof/review responsiveness

Traditional propensity modelling and
recommendation engine techniques can help
you if you have past outcome data
[Diagram: Customer Data + Model, giving Highest probability]
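A propensity model can be as simple as a logistic regression on past outcomes. The sketch below fits one by plain gradient descent using only the standard library. The two features (visits last month, opened last email), the eight customers and their outcomes are all hypothetical; in practice you would more likely use a library implementation.

```python
import math


def predict(w, x):
    """Probability of purchase for feature vector x (w[0] is the bias)."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the log-loss."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = predict(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(rows) for wj, g in zip(w, grad)]
    return w


# Hypothetical past outcomes: [visits last month, opened last email] -> bought?
rows = [[1, 0], [2, 0], [3, 1], [5, 1], [6, 1], [1, 1], [4, 0], [7, 1]]
labels = [0, 0, 1, 1, 1, 0, 0, 1]
weights = fit_logistic(rows, labels)

# Score new customers and contact the highest-probability ones first.
new_customers = {"anna": [7, 1], "ben": [1, 0], "cara": [4, 1]}
ranked = sorted(new_customers, key=lambda c: predict(weights, new_customers[c]), reverse=True)
print(ranked)
```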

You can also think of customer history as a
sequence and predict using Deep Learning
[Diagram: customer history over time (Email open, Page view, Product click, Product click) is fed event by event into a chain of RNN cells, which outputs a prediction of whether a sale occurs in the future period]
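The sketch below shows only the shape of the computation: each event updates a hidden state, and the final state is turned into a sale probability. The weights are random and untrained, so the output is meaningless as a forecast; the event names, hidden size and all function names are assumptions for illustration. A real model would be trained with a deep learning library.

```python
import math
import random

EVENTS = ["email_open", "page_view", "product_click"]


def one_hot(event):
    """Encode an event name as a vector with a single 1."""
    return [1.0 if event == e else 0.0 for e in EVENTS]


def rnn_step(h, x, w_xh, w_hh):
    """One simple RNN cell: new_h = tanh(W_xh x + W_hh h)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_xh[k], x))
            + sum(w * hi for w, hi in zip(w_hh[k], h))
        )
        for k in range(len(h))
    ]


def predict_sale(history, w_xh, w_hh, w_out):
    """Fold the event sequence into a hidden state, then apply a sigmoid."""
    h = [0.0] * len(w_hh)
    for event in history:
        h = rnn_step(h, one_hot(event), w_xh, w_hh)
    logit = sum(w * hi for w, hi in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-logit))


# Untrained, randomly initialised weights (placeholders for learned ones).
rng = random.Random(0)
hidden = 4
w_xh = [[rng.uniform(-0.5, 0.5) for _ in EVENTS] for _ in range(hidden)]
w_hh = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(hidden)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

history = ["email_open", "page_view", "product_click", "product_click"]
probability = predict_sale(history, w_xh, w_hh, w_out)
print(probability)
```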

Test treatments at random
• Show pictures of cats: conversion = 5%
• Show pictures of dogs: conversion = 3%
• Show pictures of people (control): conversion = 3%
But we’re not interested in which treatment works best on average

Find the best treatment for each customer
Total Customers (100% of customers) (3% conversion)
• Live alone (30% of customers) (4% conversion)
  • Live in apartment (9% of customers) (4% conversion)
  • Live in house (21% of customers) (4% conversion)
• Don’t live alone (70% of customers) (2.6% conversion)
  • Urban (56% of customers) (2.7% conversion)
  • Rural (14% of customers) (2.1% conversion)
Conversion by treatment across the four leaf segments:
Cat conversion = 18%, 1%, 5.4%, 1%
Dog conversion = 2%, 2%, 2.5%, 7%
Segments: Cat segment, People segment, Cat segment, Dog segment
Total segmented conversion = 6.5% vs 4% for best treatment on average (Cat pictures for all)
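The segment arithmetic can be checked with a short sketch. Two things here are assumptions rather than facts from the slide: that the four cat/dog figures map left to right onto apartment, house, urban and rural, and that the people treatment converts at 4% in the house segment (the slide gives no people rate per segment, so the segment's stated 4% conversion is used). Under those assumptions the weighted total comes out at the slide's 6.5%.

```python
# Share of customers and conversion rate per treatment for each leaf segment.
# The mapping of rates to segments and the "people" rate are assumptions.
segments = {
    "live alone, apartment": {"share": 0.09, "rates": {"cat": 0.18, "dog": 0.02}},
    "live alone, house": {"share": 0.21, "rates": {"cat": 0.01, "dog": 0.02, "people": 0.04}},
    "not alone, urban": {"share": 0.56, "rates": {"cat": 0.054, "dog": 0.025}},
    "not alone, rural": {"share": 0.14, "rates": {"cat": 0.01, "dog": 0.07}},
}


def best_treatment(rates):
    """Return the treatment with the highest conversion rate."""
    return max(rates, key=rates.get)


# Pick the best treatment per segment, then weight by segment size.
plan = {name: best_treatment(s["rates"]) for name, s in segments.items()}
segmented = sum(s["share"] * max(s["rates"].values()) for s in segments.values())

print(plan)
print(f"segmented conversion: {segmented:.1%}")
```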

A finite number of
predictive micro-
segmentations can
be combined to
create highly
personalised
individual
experiences

Stages: Test & Collect, Model, Embed, Roll Out (with Feedback)
Test & Collect:
• Plan
• Pilot test
• Collect data
Model:
• Build model
• Identify segments
Embed:
• Adjust model to fit organisation
• Re-engineer business processes to support segmented execution
• Train organisation
Roll Out:
• Incorporate segments into daily execution
• Provide differentiated services, products and content

Keep
Retain Profitable customers longer
Win Back profitable customers
Eliminate unprofitable customers

Traditional techniques like RFM and Pareto-NBD omit
many factors influencing Customer Lifetime Value
[Chart: customer contribution over time, annotated with Loss Leader, Buys second product, Subscription revenues, High Servicing costs, Complaint and Complaint resolution]

Random Forest Regression can create more accurate CLV predictions
[Diagram: timeline with a Training period feeding a Model evaluated on a Test period (historic period), and a Training period followed by a Prediction period (future period)]
Training features:
• Product purchases
• Visits
• Spend
• Demographics
• Acquisition channel
• Complaints
• Location
• User behaviour
• Shipping preferences
• Payment preferences
• Costs
Segmenting models may improve accuracy further
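The training set for such a model comes from splitting each customer's history at a cutoff date: features from before it, the value to predict from after it. The sketch below covers only that preparation step, with a made-up transaction log and only two of the features listed above; the function name `build_clv_rows` is hypothetical. The resulting rows could then be passed to a regressor such as a random forest.

```python
from collections import defaultdict
from datetime import date


def build_clv_rows(transactions, cutoff):
    """Features from before `cutoff`; target is spend on or after it."""
    features = defaultdict(lambda: {"purchases": 0, "spend": 0.0})
    target = defaultdict(float)
    for customer, day, amount in transactions:
        if day < cutoff:
            features[customer]["purchases"] += 1
            features[customer]["spend"] += amount
        else:
            target[customer] += amount
    # Only customers seen in the training period get a row. Those with no
    # later spend are kept with a target of 0.0 rather than dropped.
    return {c: (dict(f), target.get(c, 0.0)) for c, f in features.items()}


# Hypothetical transaction log: (customer, date, amount). Illustrative only.
transactions = [
    ("a", date(2016, 3, 1), 20.0),
    ("a", date(2016, 9, 1), 30.0),
    ("a", date(2017, 2, 1), 30.0),
    ("b", date(2016, 5, 1), 15.0),
    ("c", date(2017, 1, 15), 40.0),
]
rows = build_clv_rows(transactions, cutoff=date(2017, 1, 1))
print(rows)
```

Keeping the zero-value customers matters: a model trained only on customers who kept buying will overstate lifetime value.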

Beware of survivorship bias when calculating lifetime
value!

Don’t forget
potential
customer
value

CLV is more powerful when combined with potential value
[Chart: Potential Value (low to high) plotted against Actual Value (CLV) (low to high), with customer groups labelled Most Growable Customers, Super Growth Customers, Low Maintenance Customers, Most Valuable Customers and Below Zero Customers]

A-B (Split) testing is an effective way to boost
revenue and retention when you don’t have
existing data to model

Do not spend time AB testing small cosmetic details
Simple UI changes are ineffective:
• Colour (changing the colour of elements on a website): +0.0% uplift
• Buttons (modifying website buttons): -0.2% uplift
• Calls to action (changing the wording on a website to be more suggestive): -0.3% uplift
Best test categories are:
• Scarcity (stock pointers): +2.9% uplift
• Social proof (informing users of others’ behaviour): +2.3% uplift
• Urgency (countdown timers): +1.5% uplift
• Abandonment recovery (messaging to keep users on-site): +1.1% uplift
• Product recommendations (suggesting other products to purchase): +0.4% uplift
Source: Qubit meta-analysis of 6,700 experiments (2017)

1. SELECT PERFORMANCE METRIC
2. SELECT TREATMENT AND CONTROL UNITS
3. SELECT EXPERIMENTAL AND CONTROL VARIABLES
4. DETERMINE DURATION AND SAMPLE SIZE
5. RUN TEST
6. ANALYZE RESULTS
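For the duration and sample size step, a common approximation for comparing two conversion rates is sketched below. The 5% significance level, 80% power, the 3% and 4% conversion rates and the 1,000 visitors a day are illustrative assumptions, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate units needed per arm for a two-sided test of two rates."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2
    return math.ceil(n)


def days_needed(n_per_arm, daily_units):
    """Days to fill both arms if all daily traffic enters the test."""
    return math.ceil(2 * n_per_arm / daily_units)


# Illustrative: detect a lift from 3% to 4% conversion.
n = sample_size_per_arm(0.03, 0.04)
print(n, "per arm;", days_needed(n, daily_units=1000), "days at 1,000 a day")
```

Smaller expected uplifts need much larger samples, because the required size grows with the inverse square of the difference.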

You’re testing promotion of a new product in an email campaign
What is the target variable?
A) Click-through on the email
B) Sales of the product
C) Revenue per customer

You’re testing an outbound telesales campaign
What is the unit of measurement for the target variable (sales)?
A) A call
B) A customer
C) A telesales agent

A null hypothesis H0 ('no effect') is tested against an alternative hypothesis H1 ('some effect'). The results pass a test of statistical significance (p-value < 0.05) in favour of H1.
What’s been shown?
1. H0 is false.
2. H1 is true.
3. H0 is probably false.
4. H1 is probably true.
5. Both (1) and (2).
6. Both (3) and (4).
7. None of the above.
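When thinking about the question above, it helps to see how a p-value is produced. The sketch below runs a two-sided two-proportion z-test on made-up split-test counts. The calculation assumes H0 holds and asks how surprising the observed difference would be; it does not give the probability that either hypothesis is true.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rate, under H0."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under H0 both arms share one rate, so pool them to estimate it.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Illustrative counts: control converts 300 of 10,000, treatment 360 of 10,000.
p_value = two_proportion_p_value(300, 10_000, 360, 10_000)
print(p_value)
```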

https://notebooks.azure.com/latitude51north/libraries/data-insight-leader-summit

Before you go anywhere near data
you need to do Situational Analysis

Eat the elephant one bite at a
time

It’s still possible to measure
even if you can’t employ the
gold standard of randomised
control trials

Be clear about your objectives and metrics
Avoid Vanity Metrics!
