Time-to-Event Models, presented by DataSong and Revolution Analytics

Companies are doing a better and better job of collecting data that explains why consumers behave the way they do. These diverse data sets cause us to rethink some of the workhorse algorithms for data analysis. Specifically, the traditional binary response model leaves much room for improvement in how it embraces time. Cross-sectional models allow much rich data to fall through the cracks. We’ll discuss real-world scenarios and how to make better use of data with time-to-event modeling.

Published in: Business, Technology

Transcript

  • 1. Time-to-Event Models: Consumer Interaction Insight. October 2013. © 2013 DataSong, all rights reserved / 234 Front Street, 3rd Floor, San Francisco CA 94111
  • 2. Today’s Presenters
      - Tess A Nesbitt, PhD, Senior Data Scientist
      - John Wallace, Founder & CEO
  • 3. Agenda
      - About us
      - Problem statement
      - High-level modeling approach
      - Use cases
      - Scoring systems
      - Q&A
  • 4. DataSong at a Glance
      - Approaching $1 trillion in revenue analyzed.
      - $2 billion in marketing spend under our lens.
      - Experienced 60-person team based out of San Francisco, with offices in Seattle, LA and India.
      - Founded in 2003 with a proven history of solving difficult analytics problems.
      - Evolved from consulting through close partnerships with clients.
      - Customer interaction insight that powers applications for customer-level revenue attribution, targeting, and media optimization.
      - Actionable and accurate information that drives customer acquisition and revenue growth for modern direct marketers.
      - Patented big data approach models behavior at the individual consumer level.
  • 5. DataSong Offerings
      1. A regression modeling framework for prediction and inference
      2. Automation of modelsets in Hadoop
      3. Enterprise grade scoring in Hadoop
  • 6. Modelset Creation: Current State
      Flatten out the data:
      1. Aggregate a fact table (sum, count)
      2. Join a dimension to a fact table and aggregate it (sum, count)
      3. Superpose time
      - If we have a dimension with a cardinality of 25 and 6 time periods of interest, that’s 150 variables for 1 dimension

      AccountNo  #SiteVisits
      123456     5

      AccountNo  #Visits_SEO  #Visits_EmailClick  #Visits_SEM  #Visits_...
      123456     3            1                   1            …

      AccountNo  #Visits_SEO_1Mo  #Visits_SEO_3Mo  #Visits_SEO_6Mo  #Visits_SEO_...
      123456     1                2                3                …
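
The three flattening steps on slide 6 can be sketched in a few lines. This is only an illustration: the visit log, the `flatten` helper, and the 30/90/180-day windows standing in for 1, 3 and 6 months are all assumptions, chosen so the output reproduces the example counts shown on the slide.

```python
from collections import Counter
from datetime import date

# Hypothetical visit log: (account_no, channel, visit_date). Not DataSong data.
VISITS = [
    (123456, "SEO", date(2013, 9, 20)),
    (123456, "SEO", date(2013, 7, 15)),
    (123456, "SEO", date(2013, 5, 2)),
    (123456, "EmailClick", date(2013, 9, 1)),
    (123456, "SEM", date(2013, 6, 10)),
]

# Assumed look-back windows in days, standing in for 1, 3 and 6 months.
WINDOWS = {"1Mo": 30, "3Mo": 90, "6Mo": 180}


def flatten(visits, as_of):
    """Return {account_no: Counter of flattened count variables}."""
    rows = {}
    for account, channel, day in visits:
        age = (as_of - day).days
        if age < 0:
            continue  # ignore visits after the as-of date
        row = rows.setdefault(account, Counter())
        row["#SiteVisits"] += 1                 # step 1: aggregate the fact table
        row[f"#Visits_{channel}"] += 1          # step 2: aggregate by dimension
        for label, days in WINDOWS.items():     # step 3: superpose time
            if age <= days:
                row[f"#Visits_{channel}_{label}"] += 1
    return rows


flat = flatten(VISITS, as_of=date(2013, 10, 1))

# Variable count from the slide: cardinality 25 times 6 time periods.
n_variables = 25 * 6
```

Every event in the log collapses into a handful of counts, and the order and spacing of the visits are lost, which is the limitation the next slide argues against.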
  • 7. In Our Opinion
      “Feature Engineering”
      - Creating good variables is many times more important than choice of algorithm
      Don’t lose track of time
      - Age-old practice of flattening data into 1 row per customer with 1000s of variables is limiting
      Aggregations can obfuscate
      Time series without customer-level data overlook important causal relationships
  • 8. New Challenges for Predictive Modeling
      More and more of our input data is generated from log files
      - Large observational data (or if you want to call it Big Data, you can)
      - We are approaching an infinite number of variables to test
      Increasing # of use cases for real-time scoring
      Increasing # of opportunities to use models for inference
  • 9. Understanding the Baseline Hazard
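
Slide 9 survives only as a title. As a reminder of the standard definition, the hazard at time t is the event rate among the customers still at risk at t. A minimal sketch of a discrete-time (life-table) estimate follows; the `discrete_hazard` helper and the durations are hypothetical and are not taken from the presentation.

```python
def discrete_hazard(durations, observed):
    """Life-table hazard: events in period t divided by customers at risk in t.

    durations[i] is the period in which customer i purchased or was censored;
    observed[i] is True for a purchase and False for censoring.
    """
    hazard = {}
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        hazard[t] = events / at_risk
    return hazard


# Hypothetical follow-up data for eight customers.
durations = [1, 2, 2, 3, 3, 3, 4, 4]
observed = [True, True, False, True, True, False, True, False]

hazard = discrete_hazard(durations, observed)
for period, rate in hazard.items():
    print(f"period {period}: hazard {rate:.3f}")
```

Censored customers still count in the at-risk denominator up to the period in which they drop out, which is what separates this estimate from a plain response rate.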
  • 10. What Are We Doing About It?
      Survival Response Model
      - Explains differences in response rate as we change exposure to marketing
      - Know what was significant and what wasn’t
      Account ID-level analysis follows customers and cookies over time
      - Time-dependent Outcome: had an event or was censored
      - Time-dependent Covariates: the effect of an event is not constant
      - Time-varying Covariates: time may modify an event effect
      Controls for non-marketing effects: Baseline Hazard Rate
      - Customer-driven activity: many customers are driven by loyalty vs. marketing
      - Anniversary Effects: many sales driven by season demand vs. marketing
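
One way to picture the ID-level set-up on slide 10 is a person-period table: one row per customer per period, with a covariate that changes over time and an event flag that stays 0 for censored customers. The sketch below is an assumption about the data layout, not DataSong's format; `person_periods` and its arguments are hypothetical names.

```python
def person_periods(last_period, purchase_period, email_periods):
    """Expand one customer into one row per period at risk.

    purchase_period is None when the customer is censored, i.e. still has
    not purchased when observation stops at last_period.
    """
    end = purchase_period if purchase_period is not None else last_period
    rows = []
    for t in range(1, end + 1):
        prior = [e for e in email_periods if e <= t]
        # Time-dependent covariate: periods since the most recent email.
        since_email = t - max(prior) if prior else None
        rows.append({
            "period": t,
            "since_email": since_email,
            "event": int(t == purchase_period),
        })
    return rows


# A buyer in period 4 who was emailed in period 2.
buyer = person_periods(last_period=6, purchase_period=4, email_periods=[2])

# A censored customer emailed in periods 1 and 5, with no purchase observed.
censored = person_periods(last_period=6, purchase_period=None, email_periods=[1, 5])
```

A discrete-time survival model can then be fit on such rows, for example with a logistic regression of `event` on `period` and `since_email`, so that the `period` terms play the role of the baseline hazard.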
  • 11. [Diagram: customer interaction, ID-level timestamp data, with examples] Labels: Behavioral; Loyalty; Marketing; Service; Loyalty; Telematic. Example data: Prior transactions; Email; Impressions; DM; Referring clicks; In-store service; Call center; Inbound email/forms; Redemptions; Point balances; GPS data; Smart devices.
  • 12. [Diagram: modeling framework, highlighted model: Response Model] Labels: Customer interaction; Objective; Time approach; Outcome; Behavioral; Loyalty; Subscription-centric; ID-level timestamp data; Marketing; Service; Loyalty; Telematic; Macro data; Price/promotion; Competition; Seasonality; Inference; Prediction; Time-to-event; Point-in-time; Site visit; Purchase; Customer service; Upgrade; Leave; Default.
  • 13. [Same diagram, highlighted model: Voluntary Churn Model]
  • 14. [Same diagram, highlighted model: Involuntary Churn Model]
  • 15. [Same diagram, highlighted model: Simple Attribution Model]
  • 16. [Same diagram, highlighted model: Incremental Attribution Model]
  • 17. What Would the Model Say?
      [Timeline figure: Customers 1, 2 and 3, January through June, showing catalog and email treatments and $100 purchases]

      customer  sales | days since treatment     | cumulative | sales allocation
                      | Catalog  Email  Retarget | orders     | Catalog  Email   Retarget  Brand Loyalty
      #1        $100  | 20       40     0        | 1          | $95.66   $0.02   $-        $4.32
      #2        $100  | 20       10     0        | 1          | $77.52   $18.16  $-        $4.32
      #3        $100  | 20       10     0        | 2          | $69.94   $17.74  $-        $12.32
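
The slide does not say how the sales allocation is computed. One plausible reading is that each $100 sale is split across the marketing channels and a baseline ("Brand Loyalty") component in proportion to what each contributes at the time of purchase. The sketch below illustrates that idea with made-up contributions; `allocate` and the `contributions` values are hypothetical. The `SLIDE_ROWS` figures are copied from the slide's table and are used only to check that every row adds up to the sale.

```python
def allocate(sale, contributions):
    """Split `sale` across components in proportion to their contributions."""
    total = sum(contributions.values())
    return {name: sale * value / total for name, value in contributions.items()}


# Hypothetical contributions at the time of purchase (not from the slide).
contributions = {"Catalog": 0.030, "Email": 0.015, "Brand Loyalty": 0.005}
shares = allocate(100.0, contributions)

# Allocations as printed on the slide: Catalog, Email, Retarget, Brand Loyalty.
SLIDE_ROWS = {
    "#1": [95.66, 0.02, 0.00, 4.32],
    "#2": [77.52, 18.16, 0.00, 4.32],
    "#3": [69.94, 17.74, 0.00, 12.32],
}
row_totals = {customer: round(sum(row), 2) for customer, row in SLIDE_ROWS.items()}
```

In the slide's table, the email share is larger for the customers with 10 days since the email than for the one with 40, and the Brand Loyalty share is larger for the customer with 2 cumulative orders. Both patterns are consistent with treatment effects that decay over time and a baseline that rises with purchase history.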
  • 18. Revolution R Enterprise ScaleR Functions Used

      Category    Function     Purpose
      READ/WRITE  rxImport     read in data from flat files
                  rxDataStep   read from XDF file, output to XDF file
                  rxReadXdf    read from XDF file, can output to dataframe
      EDA         rxSummary    calculate summary stats on XDF file
                  rxCrossTabs  build contingency tables of factors
                  rxCube       build contingency tables of factors
                  rxHistogram  create histograms of numeric vars
                  rxQuantile   calculate quantiles of numeric vars
      MODELING    rxLogit      build logistic regression models
                  rxPredict    score data from XDF with specified model
                  rxRocCurve   evaluate false and true positives of models
                  rxDTree*     build classification and regression trees

      Run time for 30MM rows and 30 variables is approx 5 min
  • 19. Prediction: Current State
      How did we deliver?
      [Chart axis: Propensity Score (LOW → HIGH)]
      Other models only use one dimension to predict likelihood to purchase: PROPENSITY
  • 20. Prediction: DataSong Approach
      Incrementality Metric: Sensitivity Score
      ● Breakthrough results from adding customer sensitivity score: 14% increase in response rate
      ● Reallocated marketing circulation: identified best prospects to not mail that were likely to purchase without receiving catalog
      [Chart axes: Propensity Score (LOW → HIGH) against Sensitivity Score (LOW → HIGH)]
      Response modeling single channel: swap set usage
      INCREMENTALITY metric predicts sensitivity of the next marketing treatment
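
The difference between ranking on propensity alone and ranking on sensitivity can be illustrated with a toy example. The definitions here are assumptions, since the slides do not give formulas: propensity is taken to be the purchase probability if mailed, and sensitivity the lift in purchase probability caused by the mailing. The `Score` class, `pick_mailing` helper and all probabilities are hypothetical.

```python
from typing import NamedTuple


class Score(NamedTuple):
    customer: str
    p_if_mailed: float      # modeled purchase probability with the catalog
    p_if_not_mailed: float  # modeled purchase probability without it

    @property
    def propensity(self):
        # Assumed definition: likelihood to purchase when mailed.
        return self.p_if_mailed

    @property
    def sensitivity(self):
        # Assumed definition: incremental lift from the next treatment.
        return self.p_if_mailed - self.p_if_not_mailed


def pick_mailing(scores, budget):
    """Mail the `budget` customers with the highest sensitivity."""
    ranked = sorted(scores, key=lambda s: s.sensitivity, reverse=True)
    return [s.customer for s in ranked[:budget]]


SCORES = [
    Score("A", 0.30, 0.28),  # loyal: likely to buy with or without the catalog
    Score("B", 0.20, 0.05),
    Score("C", 0.10, 0.02),
    Score("D", 0.03, 0.02),
]

by_propensity = [
    s.customer
    for s in sorted(SCORES, key=lambda s: s.propensity, reverse=True)[:2]
]
by_sensitivity = pick_mailing(SCORES, budget=2)
```

Customer A tops the propensity ranking but drops out of the sensitivity ranking, which matches the slide's "best prospects to not mail": people likely to purchase without receiving the catalog.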
  • 21. Scoring Discussion
      Scoring systems are like picture frames: good art is never without one
      Your best model may never see the light of day
      - Sharing your parameter estimates isn’t enough
      Who should own scoring?
      - IT: Production support, high uptime mentality
      - Analytics: often missing the software engineering discipline
      Scale
      - Analytics teams should be able to manage dozens of models and score billions of records every day
  • 22. DataSong Architecture
      - ETL
      - N marketing channels
      - Behavioral variables
      - Promotional data
      - Overlay data
      - Functions to read Hadoop output; xdf creation
      - Exploratory data analysis
      - GAM survival models
      - Scoring for inference
      - Scoring for prediction
      - 5 billion scores per day per customer
      [Diagram labels: DataSong Data Format (DDF); Custom Variables (PMML)]
  • 23. DataSong Contact
      1. A regression modeling framework for prediction and inference
      2. Automation of modelsets in Hadoop
      3. Enterprise grade scoring in Hadoop
      LinkedIn: www.linkedin.com/company/datasong
      Facebook: www.facebook.com/datasong
      Twitter: www.twitter.com/datasong
      Phone: 877.540.5910
      Email: info@datasong.com
