Programmable Decision Tree @Scale for Programmatic Media Buying - Rohit Srivastava (MIQ)

Programmable Decision Tree is a logical decisioning framework for optimizations in programmatic media buying. This session outlines how the framework evolved over time, which big data tech stacks were used along the way, and why each was preferred over the alternatives.

Published in: Technology

  1. Programmable Decision Tree @Scale for Programmatic Media Buying. Rohit Srivastava, Engineering Lead - MiQ
  2. MiQ - Activating Marketing Intelligence via AiQ. ❑ Marketing intelligence and analytics partner to many of the world’s most prominent brands and media agencies: American Express, Avis, Lenovo, Unilever, Microsoft, GroupM, Publicis and IPG. ❑ AiQ is our technology that provides modular, API-based analytics services to rapidly build data solutions for successful real-time business outcomes. Data Scientists & Engineers; Data Scientists & Analysts; Solution Engineers & Traders.
  3. Daily Scale @ MiQ: 80 Billion Ad Impressions; 5000+ Strategies; 10+ TB Data; 900,000 CPU mins; 1000+ Campaigns; 750 million users. INSIGHTS; DATA MUTATION & COPIES; SECURITY; SCALE (#campaigns, #no. of people, varied experience levels); QUALITY; MULTIPLE LANGUAGES & TOOLS.
  4. Connecting DATASETS - Building the big picture
  5. DATA PROCESSING ECOSYSTEM
  6. AppNexus Programmable Bidder (APB)
  7. APB - Problem Statement
      Input: Ebay.com, London, Firefox
        Ebay.com                              1
        London                                1
        Ebay.com + Firefox                    1
        Ebay.com + Firefox + London           1
      Input: apple.com, Manchester, Chrome
        apple.com                             0
        Chrome                                0
        Manchester + Chrome                   0
        Manchester + apple.com + Chrome       0
  8. APB - Problem @ Scale with HIVE ….. ~5 TB per day * GZIP Compressed ~60 GB per day * TXT ** 7 days feed ** 10 days feed. Combination counts: 4C2 = 6; 20C10 = 184,756; 8C4 = 70; 6C3 = 20. Higher per-script run-time. Unstable workflows. EMR Autoscaling - Big PITFALL !! Random MR failures.
  9. Rock-climbing Journey from HOURS to MINUTES ….. Learning curve with the Big Data ecosystem. Big data tech choices. Key-Val store, HBase in action. Async IO programming - could be a curse sometimes !! Every CPU cycle matters. Garbage Collection - Debugging. Pitfall - Disk IO & Spills. Databricks in ACTION !! JOINS, JOINS & JOINS .. Broadcast JOIN to the rescue. “SCALA” - Optimized Functioned methods. Data Explosion - 1.5 hours to 20 min. Advanced Auto-Scaling. Overall cost savings.
  10. Data Science - AFTERMATH: OPTIMIZED PIPELINE, MODEL TRAINING, DECODED TREE
      SITE DOMAIN     BROWSER   CREATIVE-SIZE   CVR
      ebay.com        chrome    X               0.618
      ebay.com        X         120x120         0.623
      flipkart.com    chrome    120x120         0.655
      ebay.com        chrome    120x120         0.786
      Decoded tree: BROWSER (CHROME, MOZILLA, ……) > CREATIVE (120*150, 120*120, ……) > SITE DOMAIN (FLIPKART: 0.655, EBAY: 0.786)
  11. Q&A
  12. Thank You
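
The APB problem statement and the "Problem @ Scale" slide describe crossing impression attributes (site domain, city, browser) into combined features, and quote counts such as 4C2 = 6 and 20C10 = 184,756. The following is only a minimal Python sketch of that idea, not the pipeline from the talk. It assumes each impression is a small dict of attribute values; the helper name `feature_crosses` and the `max_size` parameter are hypothetical.

```python
from itertools import combinations
from math import comb  # available in Python 3.8+


def feature_crosses(impression, max_size=None):
    """Return every non-empty combination of the impression's attribute values.

    Hypothetical helper: the slides do not show how the combinations
    were actually generated.
    """
    # Sort the attribute names so the output order is deterministic.
    values = [impression[name] for name in sorted(impression)]
    limit = max_size or len(values)
    crosses = []
    for size in range(1, limit + 1):
        for combo in combinations(values, size):
            crosses.append(" + ".join(combo))
    return crosses


impression = {"site": "ebay.com", "city": "London", "browser": "Firefox"}
crosses = feature_crosses(impression)
for cross in crosses:
    print(cross)

# Three attributes give 2**3 - 1 non-empty combinations.
print(len(crosses))

# The binomial counts quoted on the slide.
for n, k in [(4, 2), (20, 10), (8, 4), (6, 3)]:
    print(f"{n}C{k} = {comb(n, k):,}")
```

Because the number of combinations grows so quickly with the number of attributes, a cap like the assumed `max_size` is one simple way to bound the output per impression; whether the original pipeline did anything similar is not stated in the slides.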

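The decoded tree on the "Data Science - AFTERMATH" slide can be read as a chain of lookups: browser, then creative size, then site domain, ending in a CVR value. The sketch below assumes a nested-dict representation and fills in only the two leaves visible on the slide (taken to sit under chrome and 120x120, as in the slide's table). The names `DECODED_TREE` and `lookup_cvr` are hypothetical, and branches the slide elides are left out rather than guessed.

```python
# Nested dict mirroring the slide: BROWSER -> CREATIVE size -> SITE DOMAIN -> CVR.
# Only the two leaf values shown on the slide are included.
DECODED_TREE = {
    "chrome": {
        "120x120": {
            "flipkart.com": 0.655,
            "ebay.com": 0.786,
        },
    },
}


def lookup_cvr(tree, browser, creative_size, site_domain):
    """Walk the tree one level per attribute; return None if the path is missing."""
    node = tree
    for key in (browser, creative_size, site_domain):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


print(lookup_cvr(DECODED_TREE, "chrome", "120x120", "ebay.com"))
print(lookup_cvr(DECODED_TREE, "mozilla", "120x120", "ebay.com"))
```

Returning `None` for an unknown path keeps the sketch honest about the parts of the tree the slide does not show; a real bidder would need an explicit default.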