Annalect presentation at Superweek 2017: Predictive Conversion Modeling - Lifting Web Analytics to the next level. Presented by Petri Mertanen, Director of Digital Analytics and Ron Luhtanen, Data Science Analyst. #SPWK
Petri Mertanen, Director, Digital Analytics at Omnicom Media Group Finland
9. What are the problems with traditional Web Analytics?
• Time consuming
• Not cost effective
• Human brains cannot work with large amounts of complex data
• Outputs depend too much on the analyst
• Insights are too simple
• Predictions are possible only at a very rough level
12. Setup for the Modeling
• Tag website features and elements like never before; more is more in this case!
• Collect session ID
• Save browser ID
• Think about User ID
• Adform cookie ID or similar
13. This means we see every interaction that each user has during each visit. The granularity of the data greatly widens the selection of possible models and improves their accuracy.
15. About the modeling
• The bulk of the modeling is done by Extreme Gradient Boosting
• The method is a decision tree based algorithm
• Gradient Boosting can handle regression as well as multiclass classification
• We have great flexibility in selecting the KPIs that we want to model and predict, without having to change the core modeling algorithm
[Figure: two example boosted decision trees. Split nodes (odor=none, spore-print-color=green, stalk-root=club, stalk-root=rooted) and leaves are annotated with Cover and Gain values.]
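As a rough illustration of the boosting idea described on the slide above (not the XGBoost implementation itself), the sketch below repeatedly fits a one-split regression tree to the residuals of the current prediction and adds it to the model with a learning rate. The function names, the toy data and the parameter values are assumptions made for this example.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Find the single threshold split on x that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    if best is None:  # only one distinct x value: fall back to a constant
        constant = mean(residuals)
        return (xs[0], constant, constant)
    return best[1:]  # (threshold, left_value, right_value)

def predict(model, x):
    """Sum the base value and the scaled output of every stump."""
    base, learning_rate, stumps = model
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total

def fit_boosting(xs, ys, rounds=50, learning_rate=0.3):
    """Gradient boosting for squared error: each round fits the current residuals."""
    model = (mean(ys), learning_rate, [])
    for _ in range(rounds):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        model[2].append(fit_stump(xs, residuals))
    return model

# Hypothetical toy data: x = pages viewed in a session, y = some KPI to predict.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 1, 3, 3, 3]
model = fit_boosting(xs, ys)
print([round(predict(model, x), 2) for x in xs])
```

For squared error the negative gradient is simply the residual, which is why fitting residuals counts as gradient boosting here. Swapping the target list `ys` for another KPI leaves the algorithm untouched, which is the flexibility the slide refers to.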
16. About the modeling
• Highly accurate, hard to overfit and very fast
• Able to extract complicated non-linear relationships from very varied data
• The algorithm uses only the relevant data from all the data that is available to it
• A huge improvement over some other regression models that break if they are fed irrelevant data
https://github.com/dmlc/xgboost
18. Outputs from Predictive Conversion Modeling
• Generally, the output of the analysis is a predictive model that gives predictions for the measurement we are modeling against.
• The predictions can be used by themselves, or further analysis can be done on the model to explain the dependencies in the user interactions.
• The model will be available for digital marketers and analysts.
• The following are four example uses for the modeling.
19. Data-to-output in Predictive Conversion Model application
Input: Enhanced Web Analytics data
Model: Machine learning based predictive modelling
Output:
• Profiling by clustering customers based on on-site behavior
• Retargeting based on predicted responses
• Twinning to expand reach to the most prospective customer profiles
• Conversion optimization
20. Output Application 1: Enhanced Retargeting
The predictions can be used for more effective retargeting. Instead of bombarding all past site visitors with advertisements, we can target the advertisements based on the specific interactions as well as the likelihood of converting. For instance, we can create a rule that targets people who have over 20% probability of purchase and have visited the promotion page of a specific product.
Recipe
IF (Trigger): Probability of purchase > 20% and visited product page
THEN (Action): Target advertisement to specific people
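The retargeting recipe above can be sketched as a small filter over scored visitors. The `Visitor` fields, the cookie IDs and the probabilities are hypothetical; only the 20% threshold and the product-page condition come from the slide.

```python
from dataclasses import dataclass

@dataclass
class Visitor:
    cookie_id: str
    purchase_probability: float  # model output between 0 and 1
    visited_product_page: bool

PROBABILITY_THRESHOLD = 0.20  # the 20% rule from the slide

def should_retarget(visitor, threshold=PROBABILITY_THRESHOLD):
    """Trigger: probability strictly above the threshold AND a product page visit."""
    return visitor.purchase_probability > threshold and visitor.visited_product_page

def build_audience(visitors):
    """Action: collect the cookie IDs that the advertisement should target."""
    return [v.cookie_id for v in visitors if should_retarget(v)]

# Hypothetical scored visitors.
visitors = [
    Visitor("a1", 0.35, True),
    Visitor("b2", 0.35, False),
    Visitor("c3", 0.10, True),
    Visitor("d4", 0.20, True),
]
print(build_audience(visitors))
```

Only visitor `a1` satisfies both conditions; `d4` sits exactly at 20% and is excluded because the rule asks for more than 20%.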
21. Output Application 2: Clustering and Profiling
The modeling process can also be used to acquire valuable information on the behavioral differences of the users. Uncovering certain dependencies in their interactions allows marketers to design (and later automate) their marketing messages differently and more effectively for each of their visitor groups (segments).
[Figure: web behavior of Person A and Person B, split into on-site and off-site behavior]
On-site behavior:
• Has visited booking page twice
• Has visited promotion page three times
• Visits homepage regularly
• Has read product description page for three minutes
Off-site behavior:
• Likes gambling sites
• Buys clothes online
• Reads gardening blogs
• Regularly watches movie trailers online
22. Output Application 3: Conversion Optimization
The machine learning models can help in conversion optimization. We are not restricted to just A/B testing; instead we can create rules that change the site in order to maximize the likelihood of purchase or conversion for each and every user. By leveraging the trained model we can direct the user towards the interactions that are most effective in increasing the likelihood of conversion.
[Figure: rules applied to website content, showing activated and not activated rules]
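One way to picture the per-user optimization described above is to score every candidate site variant with the model and activate the one with the highest predicted conversion probability. The scoring function below is a hand-written stand-in for a trained model, and the feature names, weights and variants are all hypothetical.

```python
import math

def predict_conversion(user, site):
    """Hypothetical stand-in for a trained model: returns a probability in (0, 1)."""
    returning = user["returning_visitor"]
    score = (-2.0
             + 0.8 * returning
             + 0.6 * site["discount_banner"] * (1 - returning)  # banner helps new visitors
             + 0.4 * site["chat_widget"] * returning)           # chat helps returning visitors
    return 1.0 / (1.0 + math.exp(-score))

# Candidate site rules; each one switches certain on-site elements on or off.
VARIANTS = {
    "default": {"discount_banner": 0, "chat_widget": 0},
    "banner": {"discount_banner": 1, "chat_widget": 0},
    "chat": {"discount_banner": 0, "chat_widget": 1},
}

def best_site_variant(user, variants=VARIANTS):
    """Return (variant name, predicted probability) with the highest prediction."""
    scored = [(name, predict_conversion(user, site)) for name, site in variants.items()]
    return max(scored, key=lambda item: item[1])

print(best_site_variant({"returning_visitor": 0}))
print(best_site_variant({"returning_visitor": 1}))
```

Because the stand-in model contains interactions between user and site features, different users get different activated rules, which is the difference from a single site-wide A/B winner.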
23. Output Application 4: Twinning
Once we have identified the most beneficial behavioral patterns, we can use the cookie data of the most prospective visitors to build larger target groups out of similar web users. The groups can then be used in programmatic buying of advertisements.
[Figure: buying rules for different target groups]
26. Case Tallink Silja
• Finland’s largest shipyard – builds and operates cruise ships
• Operates in a very competitive online environment
• High maturity with online optimization and data-driven marketing
• Large portion of sales through online channels
9 mil. passengers annually *
945 mil. turnover *
27. The model
• Very accurate predictions for non-converting visitors
• Possibility to adjust prediction threshold for different actions
[Figure: ROC Curve]
Accuracy 98%
• Sensitivity 99%
• Specificity 75%
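To make the threshold adjustment mentioned above concrete, the sketch below computes accuracy, sensitivity and specificity at several thresholds. The labels and scores are invented toy values, not the case data, and the slide does not state which class was treated as positive; here label 1 is simply the positive class, and both classes are assumed to be present.

```python
def classification_metrics(labels, scores, threshold):
    """Accuracy, sensitivity and specificity when scores >= threshold count as positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),  # share of actual positives that were found
        "specificity": tn / (tn + fp),  # share of actual negatives that were found
    }

# Hypothetical labels (1 = positive class) and model scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
for threshold in (0.35, 0.5, 0.75):
    print(threshold, classification_metrics(labels, scores, threshold))
```

Lowering the threshold raises sensitivity at the cost of specificity, and raising it does the opposite, so a cheap action can use a lenient threshold while an expensive one uses a strict threshold. Each point on a ROC curve corresponds to one such threshold.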
28. Clustering using on-site behavioral data
• Previously it was possible only to create custom segments
• Now clustering using unsupervised machine learning over 240 dimensions
• Four distinct behavioral groups
  • Heavy users
  • Intermediate users
  • Reactivated
  • Just visiting
[Figure: Mean Conversion % - Indexed; bar values shown: 1, 9.5, 8.4, 8.8]
29. Exploring differences in time spent on site
[Figure: two bar charts, "Mean Duration from past Session* - Indexed" and "Mean Session Duration - Indexed"]
*Calculated as a cumulative sum with 50% daily decay
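The footnote's "cumulative sum with 50% daily decay" can be read as a running total that is halved each day before the new day's value is added. The sketch below follows that reading; whether the current day is included and how days without sessions are handled are assumptions, as are the example values.

```python
def decayed_cumulative_sum(daily_values, decay=0.5):
    """Running total where yesterday's total is multiplied by `decay` each day."""
    totals = []
    running = 0.0
    for value in daily_values:
        running = running * decay + value
        totals.append(running)
    return totals

# Hypothetical session durations per day for one visitor (0 = no visit that day).
print(decayed_cumulative_sum([10, 0, 0, 4]))
```

With these values the total goes 10, 5, 2.5 and then 5.25, so an old session still contributes but counts for half as much with every day that passes.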
30. Not limited to averages
[Figure: two panels, "Session Duration – Just Visiting" and "Session Duration – Heavy User", each comparing "Converted" and "No conversion"]
31. Proportional differences depending on source
[Figure: two panels, "Proportion of visitors from Display" and "Proportion of visitors from Direct"]
*Calculated as a cumulative sum with 50% daily decay
33. Gaining insight from a complex model
• Partial dependencies
• Change inputs
• Observe outputs
• Automate
• Can be applied to advertisement messages, channels, or on-site elements
• Possible to use smart optimization algorithms to identify actions that maximize conversion probability
https://github.com/fmfn/BayesianOptimization
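The "change inputs, observe outputs" step above corresponds to a partial dependence calculation: one feature is forced to a fixed value for every row, the model is re-scored, and the predictions are averaged. In this sketch the model is a hand-written stand-in and the feature names and rows are hypothetical. The optimization step from the last bullet is not shown.

```python
import math

def predict_conversion(row):
    """Hypothetical stand-in for a trained model: returns a probability in (0, 1)."""
    score = -2.0 + 0.5 * row["pages_viewed"] + 1.0 * row["saw_discount"]
    return 1.0 / (1.0 + math.exp(-score))

def partial_dependence(model, rows, feature, grid):
    """Average prediction over all rows with `feature` overridden by each grid value."""
    curve = []
    for value in grid:
        predictions = [model({**row, feature: value}) for row in rows]  # copies, rows untouched
        curve.append((value, sum(predictions) / len(predictions)))
    return curve

# Hypothetical sessions.
rows = [
    {"pages_viewed": 1, "saw_discount": 0},
    {"pages_viewed": 3, "saw_discount": 0},
    {"pages_viewed": 5, "saw_discount": 1},
]
for value, mean_probability in partial_dependence(predict_conversion, rows, "saw_discount", [0, 1]):
    print(value, round(mean_probability, 3))
```

The same loop works for any input the model accepts, such as an advertisement message, a channel or an on-site element, which is what makes it straightforward to automate.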
34. Discount campaign’s effect on mean probability of conversion
[Figure: chart by group: Heavy Users, Intermediate Users, Reactivated, Just visiting]
36. Future Development Streams
Input:
• Enhanced Web Analytics data
• Basic Web Analytics data
• Client’s Customer Data
• Semantic data
• IDs from ad serving platforms
Model: Machine learning leveraged analytics and real time predictive modelling
Output:
• Organic clustering based on off- and on-site data
• Immediate onsite adaptations based on off-site data
• AI driven marketing: test and modify content based on predicted behaviours
• Retarget to increase conversion percentage
• Automated optimization of online advertising spending
37. How is this changing our work?
• Spend less time on manual analysis
• No more headaches with complex data and pressure for outputs
• Think more about the business questions
• The model will do the counting and give answers with a high confidence level
• You will interpret results for the business and edit the model for more in-depth analysis
• You are able to equip analysts with tools previously available only to data scientists
• Shift the focus from simple metrics to the actual business objectives
• Set up automatically optimizing feedback loops in order to continuously increase conversion rates
39. Executive Summary
• No more time-consuming, labor-heavy and expensive manual analysis
• Enable analysts with machine learning
• Fast to implement and quick to show results
• Ask another question
• Continuously improve marketing efficiency and ROI
• Get a real competitive edge with analytics
Petri Mertanen
Director, Digital Analytics
petri.mertanen@annalect.com
+358 400 792 616
Ron Luhtanen
Analyst, Data Science
ron.luhtanen@annalect.com
+358 50 431 8166
40. Q&A
Annalect Finland is a part of Omnicom Media Group.
Annalect Finland
www.annalect.fi
info.finland@annalect.fi
@annalect_fi