4. Presentation Snowplow
Meetup
19-05-2016 Page 4
Who are we?
Sdu is a publisher that supplies current information on laws and regulations to
lawyers, tax experts, policy makers and other legal professionals
Traditional company in transition
300+ employees
We believe in tailoring content and products to the wishes of our customers, because
progress is different for everybody
Both off- and online content/products
Why did we want this?
IT
• Ownership of data
• Open, generic tools (no vendor lock-in)
• Ability to give support internally and not be dependent on external suppliers
Marketing
• Improving the customer journey
• Insights into product use
• Future wish: reacting in real time to triggers in the market
Marketing intelligence
• Insights into acquisition, development, retention and win-back
• Ask and answer business questions
• Integration of customer behavior into the marketing database
E-commerce
• Integration of offline and online
• In-depth analytical possibilities on top of Google Analytics
• Optimal mix of advertising budget
What steps did we take?
• Develop Powerpitch
• Longlist
• Shortlist
• Choice
• Management decision based on PAP
• Implementation in POC
• Transfer to organisation
• Proof in use cases
• Learning
Delivering the Intelligence Platform: Snowplow + Spark
• Implementing Snowplow in the cloud
• Implementing Apache Spark in the cloud
• In-cloud database with all the captured data
• Alignment with Google Universal Analytics
The Delivered Intelligence Platform Using Snowplow and Spark
[Diagram: behavioral data and click data pass through two stages, "Capture and store data" and "Analyse the data"]
The Delivered Intelligence Platform – Alignment with Google Universal Analytics
Intelligence platform - Snowplow / Spark
• Unlimited external data
• Advanced reporting through tools
• Advanced Machine Learning options
• Customer id + fingerprint + IP
• Full export options
Universal Analytics
• Limited external data
• Slice and dice in frontend user system
• No machine learning options
• Upload a customer id in a dimension
• Limited export options
Planning: 6-week Proof of Concept (POC)
Week 1
• Security certificates
• First (generic) tags and triggers in GTM
Week 2
• Second batch of tags and triggers in GTM
• Test of the Snowplow data and first exploratory data analysis (EDA)
Week 3
• Implementation of Databricks / Spark
• Setting up the connection to Snowplow S3 and Redshift
Week 4
• Start of use cases
Week 5
• Finalization of use cases
• Budget calculations for future tools (not so straightforward with cloud computing)
Week 6
• Wrap up project
• End presentation
What were our technical learnings / findings?
• Security certifications in AWS: IT expertise with experience in networking and AWS
• Account structure in AWS: using multiple accounts is good for governance, but more complex in use (whitelisting IP)
• Completeness of the tracking: data collection through GTM (= browser side) is not 100% complete. Neither is GA.
• Combining off- and online data: implement a key in the data layer. You need web developers.
• Complex Google Analytics implementation: either start with a clean implementation, or plan accordingly
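To illustrate the learning about combining off- and online data, here is a minimal sketch of what a shared key makes possible. It assumes the data layer exposes a customer key that also exists in the offline customer database. The field names (`customer_key`, `segment`) and all sample records are hypothetical and not taken from the Sdu implementation.

```python
# Hypothetical offline (CRM) records, indexed by the shared customer key.
crm = {
    "C001": {"segment": "law firm"},
    "C002": {"segment": "tax advisor"},
}

# Hypothetical online events; customer_key comes from the data layer
# and is None for visitors who are not logged in.
web_events = [
    {"customer_key": "C001", "page": "/product/a"},
    {"customer_key": None, "page": "/product/b"},
    {"customer_key": "C002", "page": "/service"},
]

def enrich(events, crm_records):
    """Attach offline attributes to each online event via the shared key.

    Events without a matching key are kept, with segment set to None,
    so that incomplete tracking stays visible instead of being dropped.
    """
    enriched = []
    for event in events:
        offline = crm_records.get(event["customer_key"], {})
        enriched.append({**event, "segment": offline.get("segment")})
    return enriched

combined = enrich(web_events, crm)
for row in combined:
    print(row)
```

Keeping the unmatched events makes it possible to measure how much traffic cannot yet be linked to a known customer.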
Use Case 1: The Correlation Between Site Visits and Products Put in the Basket
• Products (lower right) are visited frequently, but are not often added to the basket.
• Products (upper left) are not frequently visited, but are often added to the basket.
• Is the price of some products too high or too low?
• Are pages difficult to find?
• Is there a difference between our high-value customers and our low-value customers?
Insights
Implications
Information
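The comparison behind this use case can be sketched as a count of page views and basket additions per product. This is an illustration only: the event names (`page_view`, `add_to_basket`), the product ids and the sample events are assumptions, not the actual Sdu tracking setup.

```python
from collections import Counter

# Hypothetical tracked events: (product_id, event_type).
events = [
    ("law-handbook", "page_view"),
    ("law-handbook", "page_view"),
    ("law-handbook", "page_view"),
    ("law-handbook", "page_view"),
    ("law-handbook", "add_to_basket"),
    ("tax-almanac", "page_view"),
    ("tax-almanac", "add_to_basket"),
]

def basket_rates(events):
    """Return {product: (views, adds, adds / views)} for every viewed product."""
    views = Counter(p for p, e in events if e == "page_view")
    adds = Counter(p for p, e in events if e == "add_to_basket")
    return {p: (v, adds[p], adds[p] / v) for p, v in views.items()}

rates = basket_rates(events)
for product, (views, adds, rate) in sorted(rates.items()):
    print(f"{product}: {views} views, {adds} basket adds, rate {rate:.2f}")
```

Plotting views against basket additions per product gives the kind of scatter described on the slide: frequently viewed but rarely added products end up in one corner, rarely viewed but often added products in the opposite one.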
Use Case 2: Most Frequently Visited Service Pages
• Top 10 of webpages related to service
• The top (detailed) service webpage is ‘abonnement-opzeggen’ (cancel subscription)
• About 75% (57% + 19%) of the sessions that visit this page continue to the cancellation form.
• In about 25% of the sessions the customer uses another form, i.e. the general contact form (instead of or on top of the cancellation form)
• Cancellations reach Sdu in different ways. Are the forms processed similarly?
Insights
Implications
Information
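Share of sessions by form used (rows: contact form, columns: cancellation form)
                    Cancellation form: No    Cancellation form: Yes
Contact form: No    19%                      57%
Contact form: Yes   5%                       19%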
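A cross table like the one above can be derived from session-level flags. The sketch below uses toy counts chosen to mirror the percentages on the slide (100 sessions in total). The tuple layout `(used_contact_form, used_cancellation_form)` is an assumption made for illustration.

```python
from collections import Counter

# Toy sessions that visited the cancellation page, as
# (used_contact_form, used_cancellation_form) flags.
sessions = (
    [(False, False)] * 19
    + [(False, True)] * 57
    + [(True, False)] * 5
    + [(True, True)] * 19
)

def crosstab_shares(sessions):
    """Return the share of sessions for each combination of the two flags."""
    counts = Counter(sessions)
    total = len(sessions)
    return {combo: n / total for combo, n in counts.items()}

shares = crosstab_shares(sessions)

# Sessions that reached the cancellation form, with or without the contact form.
cancellation_share = shares[(False, True)] + shares[(True, True)]
# Sessions that used the general contact form, with or without the cancellation form.
contact_share = shares[(True, False)] + shares[(True, True)]

print(f"cancellation form: {cancellation_share:.0%}")
print(f"contact form: {contact_share:.0%}")
```

With these toy counts the two totals come to 76% and 24%, close to the rounded figures quoted in the bullets.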
Use Case 3: Search Pages
• 6 distinct clusters, of which ‘zoekers’ (searchers) is a small group with relatively high revenue
• What can we do to leverage the relatively large group of visits with no revenue that occur predominantly in the evening? Are these private people visiting our site?
• Hypothesis: the searchers have a need for a specific product. Further research and A/B testing are advised, specifically on search.
Insights
Implications
Information
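The slides do not say which clustering method produced the six clusters. As an illustration only, the sketch below assumes plain k-means on two hypothetical features per visit, hour of day and revenue, with two clusters and toy data. The features, values and starting centroids are all assumptions.

```python
import math

def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Toy visits as (hour_of_day, revenue_in_euros).
visits = [(20, 0), (21, 0), (22, 0), (10, 200), (11, 220)]

labels, centroids = kmeans(visits, [(20, 0), (10, 200)])
print(labels)
print(centroids)
```

In practice the features would need scaling before clustering, because revenue in euros would otherwise dominate the distance measure.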
How are we organized for Snowplow?
Sdu
Marketing & Sales
Marketing Intelligence
• Analyses using SQL (Redshift) and R and Python (Databricks)
E-commerce
• Google Tag Manager implementation
IT Architecture and infrastructure
• Alignment with current and future business architecture
• Technical support
Business Analyst
• Translating business needs into technical design
Which are the next steps for Sdu?
Technical next steps
• Duplicates: create a script to deduplicate current and future records.
• Implement a server-side tracker as a solution to prevent missing web shop transactions.
• Assess a low-cost alternative to the Redshift database (AWS) for the long term.
• Structural solution for securing the Redshift database (whitelisting the IP address of the Databricks cluster)
Supporting lean startup
• Determining KPIs
• Measuring product use
• Analysing data and determining the next action
Learning
• Answer business questions on customer behaviour
• Answer questions not asked
• Tracking product use
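The deduplication script named under the technical next steps could look roughly like the sketch below. It keeps the first record seen for each event id. Using `event_id` as the deduplication key is an assumption, and the sample records are invented for illustration.

```python
def deduplicate(records, key="event_id"):
    """Return the records with duplicates removed, keeping the first
    occurrence of each key value and preserving the original order."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique

# Hypothetical records; the second "e1" is an exact duplicate.
records = [
    {"event_id": "e1", "page": "/home"},
    {"event_id": "e2", "page": "/search"},
    {"event_id": "e1", "page": "/home"},
]

clean = deduplicate(records)
print(len(records), "->", len(clean))
```

The same rule can be applied once to the stored records and then to each new batch, which covers both current and future records.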
Key take-aways and recommendations
• Involve senior management from the start
• A POC of 6 weeks is realistic
• Share quick wins / successes to gain acceptance of the project