Engaging with Caserta to
ADVANCE YOUR BUSINESS
September 26th, 2017
November 15th, 2017
Maxwell Goldbas, Director of Caserta Innovation Labs
Multi-Touch
Attribution Modeling
with Spark
• Who am I?
• Raised on the Upper West Side
• Data Engineer
• Director, Caserta Innovation Labs
• Topics today
• Multi-touch attribution
• Data science with Spark
Introduction
2
• Caserta recently did a cloud migration
• Large media client
• Client could not join us today
• Client was not familiar with Spark
• Hesitant to change to open source code
• We want to demonstrate its power
Background
3
• Client: Which consumer touch points
drive engagement in rewards program?
• Snail Mail
• Texts
• Member Events
• Email
• Site Activity
• Caserta: Get client excited about our
Infrastructure
• Identity Resolution
• Unified Data Source
Objectives
4
• Databricks
• User Access to Data Lake
• Several Spark Clusters
• Graph Dataframes
• AWS
• Data Lake in S3
• Redshift
• EC2 for Clusters
• Caserta
• Airflow
• Docker
• RabbitMQ
Infrastructure
5
• Get data in a usable format
• Required knowledge:
• Number of touch points that happened between each conversion
• Impact each touch point had on the final conversion
• Pull all engagements
• Pull distinct conversions by individual key, event type, and date
• A conversion is an engagement with the rewards program
• Multiple conversions by the same person on the same day should not create noise
• 15 billion rows of event data
Preparation
6
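The "distinct conversions by individual key, event type, and date" rule can be sketched in plain Python. This is only an illustration of the rule, not the Spark job used on the 15 billion rows; the field names (individual_key, event_type, timestamp) and the sample rows are hypothetical.

```python
from datetime import datetime


def distinct_conversions(events):
    """Keep the earliest conversion per (individual, event type, calendar date)."""
    seen = set()
    kept = []
    # Sort by time so the first conversion of each day is the one retained.
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        key = (ev["individual_key"], ev["event_type"], ev["timestamp"].date())
        if key not in seen:
            seen.add(key)
            kept.append(ev)
    return kept


# Hypothetical sample: individual 1 converts twice on the same day.
raw = [
    {"individual_key": 1, "event_type": "Conversion", "timestamp": datetime(2017, 11, 1, 9, 0)},
    {"individual_key": 1, "event_type": "Conversion", "timestamp": datetime(2017, 11, 1, 17, 30)},
    {"individual_key": 1, "event_type": "Conversion", "timestamp": datetime(2017, 11, 2, 8, 15)},
    {"individual_key": 2, "event_type": "Conversion", "timestamp": datetime(2017, 11, 1, 12, 0)},
]
deduped = distinct_conversions(raw)
```

The second same-day conversion for individual 1 is dropped; the next-day conversion and the other individual's conversion are kept.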
Process
7
Events
Paths
Models
• Order events by individual
• Flag each conversion event
• Flag each new individual
• Assign a path ID for each conversion flag and each new-individual flag
• Group touch points into paths
• Build Models from Paths
Conversion Paths – Event Data
8
Individual Key | Activity Type Key | Conversion | New User | Conversion Path
1              | Email             | 0          | 0        | 0
1              | Text              | 0          | 0        | 0
1              | Conversion        | 1          | 0        | 0
1              | Email             | 0          | 0        | 1
2              | Text              | 0          | 1        | 2
2              | Conversion        | 1          | 0        | 2
2              | Text              | 0          | 0        | 3
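A minimal sketch of how the Conversion, New User, and Conversion Path columns in the event data above could be derived, assuming the events are already ordered by individual and then by time. The function name assign_paths and the tuple layout are hypothetical; the rule used (start a new path on the row after a conversion, or when the individual changes) is inferred from the slide's rows rather than stated in the deck.

```python
def assign_paths(events):
    """events: list of (individual_key, activity), ordered by individual, then time.

    Returns rows of (individual_key, activity, conversion, new_user, path).
    """
    rows = []
    path = 0
    prev_key = None
    prev_converted = False
    for key, activity in events:
        conversion = 1 if activity == "Conversion" else 0
        # The very first individual is not flagged, matching the slide's table.
        new_user = 1 if prev_key is not None and key != prev_key else 0
        # A new path begins after a conversion or at a change of individual.
        if prev_converted or new_user:
            path += 1
        rows.append((key, activity, conversion, new_user, path))
        prev_key = key
        prev_converted = bool(conversion)
    return rows


# The seven event rows from the slide.
events = [
    (1, "Email"), (1, "Text"), (1, "Conversion"), (1, "Email"),
    (2, "Text"), (2, "Conversion"), (2, "Text"),
]
rows = assign_paths(events)
```

On distributed data the same idea is usually expressed as a running sum of the two flags over an ordered window; the loop here only makes the rule explicit.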
Conversion Paths – Path Data
9
Individual Key | Conversion Path | Total Emails | Total Texts
1              | 0               | 2            | 1
1              | 1               | 1            | 0
2              | 2               | 0            | 1
2              | 3               | 0            | 1
Conversion Paths – Conversion Flag
10
Individual Key | Conversion Path | Total Emails | Total Texts | Converted?
1              | 0               | 2            | 1           | 1
1              | 1               | 1            | 0           | 0
2              | 2               | 0            | 1           | 1
2              | 3               | 0            | 1           | 0
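Grouping touch points into paths and attaching the Converted? flag can be sketched as a single pass over the path-labelled events. The names summarize_paths, emails, texts, and converted are hypothetical. The input is the seven event rows from the event-data slide; on those rows path 0 contains one email, whereas the path-data slide lists two, so the slide tables appear to be illustrative rather than derived from one another.

```python
def summarize_paths(rows):
    """rows: list of (individual_key, activity, path). Returns one dict per path."""
    paths = {}
    for key, activity, path in rows:
        summary = paths.setdefault(
            path,
            {"individual_key": key, "path": path, "emails": 0, "texts": 0, "converted": 0},
        )
        if activity == "Email":
            summary["emails"] += 1
        elif activity == "Text":
            summary["texts"] += 1
        elif activity == "Conversion":
            # A path counts as converted if it contains a conversion event.
            summary["converted"] = 1
    # Dicts preserve insertion order, so paths come back in order of first appearance.
    return list(paths.values())


labelled_events = [
    (1, "Email", 0), (1, "Text", 0), (1, "Conversion", 0), (1, "Email", 1),
    (2, "Text", 2), (2, "Conversion", 2), (2, "Text", 3),
]
summary = summarize_paths(labelled_events)
```

Dropping the individual key and path ID from each summary leaves the feature columns (emails, texts) and the label (converted) used for modeling.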
Conversion Paths – Conversion Data
11
Total Emails | Total Texts | Converted?
2            | 1           | 1
1            | 0           | 0
0            | 1           | 1
0            | 1           | 0
Features: Total Emails, Total Texts. Label: Converted?
• Darling child of data science
• Flexible, easy to use, accurate
• Predicts whether a given number of events will lead to a conversion
• Each conversion should carry the number of touch points that led to it
• Results:
• Email and Web Traffic are king
First Model: Logistic Regression
12
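As a rough illustration of the first model, the sketch below fits a logistic regression by plain gradient descent on the four feature/label rows from the conversion-data slide. It is a toy, pure-Python stand-in: the slides do not say which library or settings were used, and the learning rate and epoch count are arbitrary assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Predicted probability of conversion for one path."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def log_loss(weights, bias, X, y):
    """Mean negative log-likelihood over all paths."""
    total = 0.0
    for features, label in zip(X, y):
        p = predict(weights, bias, features)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(X)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the log loss (hypothetical hyperparameters)."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            err = predict(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Features: (total emails, total texts); label: converted?
X = [[2, 1], [1, 0], [0, 1], [0, 1]]
y = [1, 0, 1, 0]

initial_loss = log_loss([0.0, 0.0], 0.0, X, y)
weights, bias = fit_logistic(X, y)
final_loss = log_loss(weights, bias, X, y)
```

The fitted weights give one coefficient per channel, which is the kind of per-channel reading behind a conclusion such as "Email and Web Traffic are king"; with only four rows the numbers themselves carry no meaning.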
• Does not take time between engagements
and conversions into account
• 1000 ads over a year is not 10 times greater
than 100 ads in a week
• Survival analysis to the rescue
• Offset the total number of ads by the duration over which they were seen
• Highest Survival Rate – Web Traffic
• The steeper the curve, the more powerful
the ad
First Model is Wrong: Survival Analysis
13
Survival Analysis
14
Emails | Duration (days) | Survival Probability | Emails (adj.)
13     | 6               | .94                  | 12.2
13     | 30              | .91                  | 11.8
21     | 53              | .82                  | 17.2
40     | 214             | .61                  | 24.4
52     | 345             | .31                  | 16.1
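The Emails (adj.) column in the table above is consistent with multiplying the raw email count by the survival probability and rounding to one decimal place. The short check below reproduces the column from the other three; the function name adjusted_count is hypothetical.

```python
# (emails, duration in days, survival probability) as listed on the slide.
rows = [
    (13, 6, 0.94),
    (13, 30, 0.91),
    (21, 53, 0.82),
    (40, 214, 0.61),
    (52, 345, 0.31),
]


def adjusted_count(count, survival_probability):
    """Down-weight a touch-point count by the survival probability of its path."""
    return round(count * survival_probability, 1)


adjusted = [adjusted_count(emails, p) for emails, _days, p in rows]
```

Note how 52 emails spread over 345 days end up counting for less than 21 emails over 53 days once the adjustment is applied.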
• Reduce touch points in a long conversion path
• Web traffic activity was affected the most
• More messages means each one is easier to forget
• Less impact
• Multiply the number of events by the probability that they will convert after that number of events in their duration
• Results:
• Email and Events are king
Second Model: Discrete-Time Survival Model Based on Conversion Paths
15
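For readers new to survival curves, here is a minimal product-limit (Kaplan-Meier style) estimate over discrete days. The slides do not state which estimator was used, so this is an assumed stand-in, and the durations and conversion flags are made-up sample values. "Survival" here means a path has not yet converted by a given day.

```python
def survival_curve(durations, converted):
    """Product-limit estimate of the probability of not yet having converted.

    durations[i]: days the path was observed.
    converted[i]: True if the path converted on that day, False if it was
    still unconverted when observation ended (censored).
    """
    event_days = sorted({d for d, c in zip(durations, converted) if c})
    survival = 1.0
    curve = []
    for day in event_days:
        at_risk = sum(1 for d in durations if d >= day)
        events = sum(1 for d, c in zip(durations, converted) if c and d == day)
        # Discrete-time hazard: share of paths still at risk that convert today.
        survival *= 1.0 - events / at_risk
        curve.append((day, survival))
    return curve


# Hypothetical paths: six observations, two of them censored.
durations = [6, 30, 30, 53, 214, 345]
converted = [True, True, False, True, True, False]
curve = survival_curve(durations, converted)
```

A curve that drops quickly means paths convert soon after exposure, which matches the deck's reading that a steeper curve indicates a more powerful channel.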
• Survival Analysis is currently univariate
• A multivariate model could demonstrate covariance
• Did not have social media data
• Use deep learning
• Account for correlation across channels
• Add parameter for heavy web users,
balance between offline and online focus
Further Analysis
16
• Parallelism is good
• Use Redshift and Spark
• Watch your bottlenecks
• Actions like show and count can cost precious
time
• Bottlenecks can be mitigated by using fewer, bigger instances
• Survival Analysis gave us a good amount of
data
• Duration of time before someone would
convert based on a channel
• Caching helped for frequently accessed data
Notes
17
Thank You
• Maxwell Goldbas, Director, Caserta Innovation Labs:
max@caserta.com
References
• http://gseacademic.harvard.edu/alda/Handouts/ALDA%20Chapter%2011.pdf
• https://www.jmp.com/support/help/13-2/Survival_Analysis.shtml
