Proprietary & Confidential. Copyright © 2014.
Behavioral Targeting @ Scale
- How did we know that this ad was relevant for you?
Savin Goyal
Sivasankaran Chandrasekar
Advertiser, Rocket Fuel
▪ 200+ RTB advertising supply partners
▪ 50+ Mn websites
▪ 50+ Bn daily impressions
▪ 3B worldwide consumers
▪ 100,000+ devices
Display Advertising Ecosystem
[Diagram: Agencies, Data Partners, and Exchanges (Ad Exchange) around the Rocket Fuel Platform, which is labeled with Auto Optimization and Real-Time Bidding.]
Programmatic Buying
[Diagram: numbered flow (steps 1 to 10) between the Web Browser, Publishers, Exchange Partners, Data Partners, and the Rocket Fuel Platform (Smart Ad Servers, Response Prediction Models, Campaign & Audience Data). Step labels: Page Request, Ad Request, Bid Request, Qualify Campaign, Calculate Propensity Score, Bid on Ad, Winning Ad, Ad Served to User, User Engages with Ad, User Engagement Recorded, Refresh learning. Other labels: User Data.]
Real Time Auction
[Figure: three example rows of dollar values (for example $2.11, $1.26, $2.78, down to $0.00) across the columns Site/Page, Geo/Weather, Time of Day, Brand Affinity, and User, combined with [ + ].]
Real Time Auction
Impression Scorecard
▪ Campaign goals: Leads & sales; Coupon downloads; Brand awareness
▪ Signals: Site/Page, Geo/Weather, Time of Day, Brand Affinity, Demo
▪ Scorecard rows: Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page, Ad Position, In-Market, Behavior, Response
[Figure: three impression scorecards with signed per-row scores (for example +100, +40, -20) and Response values of +9.7%, +0.7%, and +1.4%; scorecards are marked X or ✓.]
Scalable Predictive Models
▪ Features: Age/Gender, Occupation, Income, Ethnicity, Purchase Intent, Online Purchases, Offline Purchases, Browsing Behavior, Site Actions, Zip Code, City/DMA, Search Sites, Search Categories, Recency, Search Keywords, Web Site/Page, Referral URL, Site Category, Bizographics, Social Interests, Lifestyle
[Figure: grid of per-feature lift values (for example +17, -7, +28) and X marks; legend: Positive Lift, Marginal Impact, Negative Lift.]
Automated Feature Selection
▪ Infinite number of models
▪ Determine perfect model size
▪ Balance past data fit and future generalization
Learn-Test-Refine
▪ Automatically learn from each response
▪ Cross-validate: A/B testing infrastructure
▪ Training pipeline
Throughput
Rocket Fuel Scale
▪ 34,474 CPU processor cores
▪ 2,655 servers
▪ 187.4 teraflops of computing
▪ 188 terabytes of memory
▪ 13X the memory of Jeopardy-winning IBM Watson
▪ 42 petabytes of storage
▪ 106X the data volume of the entire Library of Congress
Data Warehouse Growth
▪ From 200 servers and 5 PB to 1,400 servers and 41 PB (8x)
Behavioral Targeting
▪ Leverage online activities on the web to learn about a user’s:
  ▪ Long-term interests
    ▪ User is interested in luxury cars
  ▪ Short-term interests
    ▪ User is looking for a pizza right now
▪ Expand the user set beyond retargeting
  ▪ Explore vs. exploit
  ▪ Identify relevant users even if they have never been targeted previously
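Behavioral Targeting @ Rocket Fuel
[Diagram: pipeline components]
▪ Inputs: Training Events, Pixel Stream, Ad Logs
▪ Feature Generation, with BT Features stored in HBase
▪ Modeling steps: Label Data, Train Model, Back Test, Calibrate
▪ Scoring steps: Score Profiles, Profile Generation
▪ Outputs: Model, profiles for the Ad Serving Data Centers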
Hadoop/HBase @ Rocket Fuel
▪ Cluster Highlights
▪ 650+ slaves (64 GB + 12 × 3 TB)
▪ 20 PB storage
▪ HA NameNode setup
▪ 9k map slots + 5.5k reduce slots
▪ Co-located to run HBase for offline processing
▪ HBase 0.94.15
▪ 5-node ZooKeeper quorum
▪ Monitoring with OpenTSDB
▪ Dual master setup
Behavioral Targeting @ Rocket Fuel
Example events (30 minutes):
▪ 11:23 bmw.com (category: Cars)
▪ 11:26 pizzahut.com (category: Food)
▪ 11:27 honda.com (category: Cars)
Read events of last N days and derive Recency, Frequency, and other features.
Behavioral Targeting Profile:
▪ honda.com, 11:27: Recent 6 hours: 5; Between 6 and 12 hours: 3; Between 12 hours and …
▪ Food, 11:26: Recent 6 hours: 2; Between 6 and 12 hours: 7; Between 12 hours and …
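The recency and frequency buckets on this slide could be computed roughly as follows. This is a minimal sketch, not the deck's implementation: the function names, the key format (`site:...`, `category:...`), and the sample data are hypothetical, and because the slide truncates the third bucket's upper bound, the sketch uses an open-ended catch-all bucket as a placeholder.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Bucket upper bounds in hours. The first two come from the slide; the
# open-ended last bucket is a placeholder because the slide truncates it.
BUCKETS = [(6, "recent_6h"), (12, "6h_to_12h"), (None, "older_than_12h")]


def bucket_for(age):
    """Return the label of the first bucket whose upper bound exceeds age."""
    hours = age.total_seconds() / 3600
    for upper, label in BUCKETS:
        if upper is None or hours < upper:
            return label
    raise AssertionError("unreachable: the last bucket is open-ended")


def build_profile(events, now):
    """events: iterable of (key, timestamp). Returns recency and frequency per key."""
    profile = defaultdict(lambda: {"last_seen": None, "counts": defaultdict(int)})
    for key, ts in events:
        entry = profile[key]
        entry["counts"][bucket_for(now - ts)] += 1  # frequency per recency bucket
        if entry["last_seen"] is None or ts > entry["last_seen"]:
            entry["last_seen"] = ts  # recency: most recent timestamp
    return profile


now = datetime(2014, 6, 4, 11, 30)
events = [
    ("site:bmw.com", datetime(2014, 6, 4, 11, 23)),
    ("category:cars", datetime(2014, 6, 4, 11, 23)),
    ("site:pizzahut.com", datetime(2014, 6, 4, 11, 26)),
    ("category:food", datetime(2014, 6, 4, 11, 26)),
    ("site:honda.com", datetime(2014, 6, 4, 11, 27)),
    ("category:cars", datetime(2014, 6, 4, 11, 27)),
    ("category:food", datetime(2014, 6, 4, 3, 0)),  # hypothetical older event
]
profile = build_profile(events, now)
print(dict(profile["category:cars"]["counts"]))
print(dict(profile["category:food"]["counts"]))
```

Keeping one counter per bucket, rather than the raw events, is what makes a profile compact enough to cover several days.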
HBase Data Model
▪ row_key: user_id
▪ Single Column Family “u”
▪ Column Qualifier: <date><hour>:<type>:<value>
▪ Cell Value: [Protobuf] most recent timestamp, event details relative to timestamp
Example row ABCD06EFG:
▪ 2014060416:site:bmw.com: event details relative to 11:23
▪ 2014060416:category:food: event details relative to 11:26
• Efficient look up for a given user
• Access range of events by event date, hour and type
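The qualifier layout above lends itself to range reads because HBase orders column qualifiers as byte strings. The sketch below illustrates this with an in-memory dictionary standing in for a table; the helper names are hypothetical and the cell values are placeholders rather than real Protobuf payloads. In HBase itself, a column range or column prefix filter could serve the same kind of query.

```python
from datetime import datetime

COLUMN_FAMILY = b"u"  # the single column family named on the slide


def qualifier(ts, event_type, value):
    """Build <date><hour>:<type>:<value>, e.g. 2014060416:site:bmw.com."""
    return f"{ts:%Y%m%d%H}:{event_type}:{value}".encode("utf-8")


def hour_type_range(ts, event_type):
    """Start (inclusive) and stop (exclusive) qualifiers for one type in one hour."""
    prefix = f"{ts:%Y%m%d%H}:{event_type}:".encode("utf-8")
    # ';' is the byte right after ':', so prefix[:-1] + b';' sorts after
    # every qualifier that starts with the prefix.
    return prefix, prefix[:-1] + b";"


ts = datetime(2014, 6, 4, 16)
cells = {
    qualifier(ts, "site", "bmw.com"): b"<protobuf>",
    qualifier(ts, "category", "food"): b"<protobuf>",
    qualifier(ts, "site", "honda.com"): b"<protobuf>",
}
# In-memory stand-in for a table: row key -> column family -> qualifier -> value.
table = {b"ABCD06EFG": {COLUMN_FAMILY: cells}}

start, stop = hour_type_range(ts, "site")
row = table[b"ABCD06EFG"][COLUMN_FAMILY]
sites = sorted(q for q in row if start <= q < stop)
print(sites)
```

Putting date and hour first in the qualifier keeps each user's events time-ordered within the row, so a single-row read can be narrowed to an hour and an event type.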
Key Challenges
▪ User Profile Freshness
▪ Scaling Issues
▪ Pipeline Failures
User Profile Freshness
▪ Strict latency requirements
▪ Recent activity is a much better predictor
Solutions:
▪ Staggered Pipelines
▪ Real Time Behavioral Targeting
Staggered Pipelines
[Diagram: Source Data feeds five staggered pipeline runs, each with the stages Extract → Score → Filter → Upload.]
Real Time Behavioral Targeting
▪ Batched Profile (BT Profile): built from Logs by the Offline BT Pipeline, refreshed every N hours
▪ Online Profile: records events for users in real time
▪ Blackbird: HBase instance tuned for 2 ms latencies
▪ Ad Servers merge the two profiles between Request and Response
Batched Updates vs. Real Time Updates
▪ Event Granularity. Batched Profile: aggregated over several hours/days. Real Time Profile: raw recorded events appended for recent N hours.
▪ Processing Load. Batched Profile: requires minimal CPU processing. Real Time Profile: needs aggregation on-the-fly.
▪ Disk Footprint. Batched Profile: compact representation captures several days. Real Time Profile: strict limits to ensure read times are acceptable.
▪ Coverage. Batched Profile: all interactions. Real Time Profile: only interactions at a data center.
▪ Real Time Profile updated in milliseconds
▪ Batched Profile refreshed every N hours
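One way an ad server might merge the two profiles at request time is sketched below. It assumes, hypothetically, that the batched profile is a map of pre-aggregated counts with a known build time, and that the real-time profile is a list of raw `(key, timestamp)` events; the deck does not describe the actual merge logic, so the de-duplication rule here (only count real-time events newer than the batch) is an assumption.

```python
from collections import Counter
from datetime import datetime, timedelta


def aggregate_recent(raw_events, now, window_hours):
    """On-the-fly aggregation of raw real-time events inside the window."""
    cutoff = now - timedelta(hours=window_hours)
    return Counter(key for key, ts in raw_events if ts >= cutoff)


def merge_profiles(batched_counts, batch_built_at, raw_events, now, window_hours):
    """Combine batch counts with real-time events the batch has not seen yet."""
    # Assumption: events at or before the batch build time are already counted.
    fresh = [(key, ts) for key, ts in raw_events if ts > batch_built_at]
    merged = Counter(batched_counts)
    merged.update(aggregate_recent(fresh, now, window_hours))
    return merged


now = datetime(2014, 6, 4, 11, 30)
batched = {"category:cars": 5, "category:food": 2}
batch_built_at = datetime(2014, 6, 4, 8, 0)
raw_events = [
    ("category:cars", datetime(2014, 6, 4, 7, 30)),   # already in the batch
    ("category:cars", datetime(2014, 6, 4, 11, 23)),
    ("category:food", datetime(2014, 6, 4, 11, 26)),
    ("category:cars", datetime(2014, 6, 4, 11, 27)),
]
merged = merge_profiles(batched, batch_built_at, raw_events, now, window_hours=6)
print(dict(merged))
```

The batched side needs no work at request time, while the real-time side pays a small aggregation cost per lookup, which matches the Processing Load row above.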
Scaling Issues
▪ 3X growth in events processed/year
  ▪ First Party Data
  ▪ App Interactions
  ▪ Geo-location Data
  ▪ …
▪ Case Studies
  ▪ HBase Region Hot-spotting
  ▪ Network Bandwidth Troubles
HBase Region Hot-spotting
▪ High write load concentrates on a few HBase regions
▪ Some users are more active than others
▪ No control over the user IDs generated
▪ Region split (painful!) is still problematic: the distribution stays non-uniform
HBase Region Hot-spotting
▪ Uneven write-load distribution
▪ Non-uniform row key distribution
▪ Salt row keys to ensure uniform distribution
  ▪ Fixed-length hashed prefix
  ▪ Key layout: [Murmur hash based prefix][Original User ID]
▪ Uniform pre-splits
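Salting and uniform pre-splitting as listed above can be sketched like this. The deck does not say which Murmur variant or prefix length was used; the sketch assumes MurmurHash3 (x86, 32-bit) and a 4-byte big-endian prefix, and the function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def _rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def murmur3_32(data, seed=0):
    """MurmurHash3 x86 32-bit of a bytes object (assumed variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK
    nblocks = len(data) // 4
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = _rotl32((k * c1) & MASK, 15)
        k = (k * c2) & MASK
        h = _rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK
    tail = data[4 * nblocks:]
    if tail:  # 1 to 3 leftover bytes
        k = int.from_bytes(tail, "little")
        k = _rotl32((k * c1) & MASK, 15)
        k = (k * c2) & MASK
        h ^= k
    h ^= len(data)
    # Final avalanche mix.
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return h


def salt_row_key(user_id):
    """Prepend a fixed-length hashed prefix to the original user ID."""
    raw = user_id.encode("utf-8")
    # Big-endian so that byte order of the prefix matches numeric order.
    return murmur3_32(raw).to_bytes(4, "big") + raw


def uniform_split_points(num_regions):
    """num_regions - 1 evenly spaced 4-byte split keys over the prefix space."""
    step = (1 << 32) // num_regions
    return [(i * step).to_bytes(4, "big") for i in range(1, num_regions)]


salted = salt_row_key("ABCD06EFG")
splits = uniform_split_points(4)
print(salted, splits)
```

Because the prefix is derived from the user ID itself, a reader can recompute it and still do a direct point lookup for a given user, which random salts would not allow.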
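HBase Region Hot-spotting
▪ Don’t stop at salting
▪ Map input splits configured for region boundaries
[Diagram: a Key Partitioner rewrites ‘k’ input splits of raw user IDs into ‘m’ splits of salted, sorted keys for a table with ‘m’ regions.]
▪ Raw user IDs in the input splits: 1234557, 1234568, 1234579, 1234583, 1234594, .., ZZAHT654, ZZZGT934, ZZZZNGA2, ZZZZKLO1
▪ Region boundary keys: Region 1: \x03\x85\x1E\xB8ZZZZZZ; Region 2: \x07\x5C\xF5\xC2928ZZ; Region m: \xFF\xAE\x14\xE1Z28ZZ
▪ Salted keys in the output splits:
  ▪ \x01\x85\x1E\xB811ZKL1, \x01\x86\x1E\xB8129542, .., \x03\x85\x1E\xB8ZZZKL1
  ▪ \x05\x35\x9E\x18087KL1, \x06\x86\x1E\xB8AHV24, .., \x07\x5C\xF5\xC16534Z
  ▪ \xEB\x27\x92\x1508RKL1, \xFE\x86\x1E\xB8AHV24, .., \xFF\xAE\x14\x126534Z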
HBase Key Partitioner
▪ As many splits as regions to maximize parallelism
▪ Key Partitioner (MR):
  ▪ Reads region boundaries of the HBase table
  ▪ Salts and sorts row keys accordingly
  ▪ Multiple Output Format to optimize the reduce phase
  ▪ Each generated split file corresponds to a single region
▪ Drastically reduces read latencies
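The core of such a partitioner, assigning each salted key to the region whose range contains it, can be sketched with a binary search over the region start keys. This is an in-memory illustration rather than a MapReduce job; the function name and the sample boundaries are hypothetical, and it assumes HBase's convention that a region covers keys from its start key (inclusive) to the next region's start key (exclusive).

```python
from bisect import bisect_right
from collections import defaultdict


def partition_by_region(salted_keys, region_start_keys):
    """Group salted keys into one sorted bucket per region.

    region_start_keys: sorted start keys of regions 1..m-1. Region 0 starts
    at the empty key, so it has no entry in the list.
    """
    buckets = defaultdict(list)
    for key in salted_keys:
        # Number of start keys <= key is the index of the containing region.
        buckets[bisect_right(region_start_keys, key)].append(key)
    return {region: sorted(keys) for region, keys in sorted(buckets.items())}


# Hypothetical boundaries for a table pre-split into four regions.
region_starts = [b"\x40\x00\x00\x00", b"\x80\x00\x00\x00", b"\xc0\x00\x00\x00"]
keys = [
    b"\xfe\x86\x1e\xb8AHV24",
    b"\x01\x85\x1e\xb811ZKL1",
    b"\x41\x00\x00\x00X",
    b"\x01\x86\x1e\xb8129542",
]
splits = partition_by_region(keys, region_starts)
for region, bucket in splits.items():
    print(region, bucket)
```

With one sorted bucket per region, each downstream task reads or writes against a single region, so no task fans out across the cluster.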
Network Bandwidth Troubles
Data Center Expansion
Network Bandwidth Constraints
▪ Consistently overshot bandwidth limit during uploads
▪ All sorts of delays (Redis, MySQL, Blackbird…)
▪ Bidding hampered
Solutions
▪ Intelligent storage – protobufs everywhere
▪ Throttle writes
▪ Geo-splitting
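The deck does not say how writes were throttled. A token bucket is one common way to keep uploads under a bandwidth limit, and the sketch below shows that technique as an illustration only; the class name, rate, and capacity are hypothetical, and the clock is injectable so the behavior can be checked without sleeping.

```python
import time


class TokenBucket:
    """Allow `rate` bytes per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def try_consume(self, nbytes):
        """Return True and deduct tokens if nbytes may be sent now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


fake_now = [0.0]
bucket = TokenBucket(rate=100, capacity=200, clock=lambda: fake_now[0])
first = bucket.try_consume(150)   # within the initial burst allowance
second = bucket.try_consume(100)  # only 50 tokens left, so this is refused
fake_now[0] = 1.0                 # one second later, 100 tokens are refilled
third = bucket.try_consume(100)
print(first, second, third)
```

An uploader would retry or sleep when `try_consume` returns False, which smooths bursts instead of overshooting the link during profile uploads.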
Geo-splitting
▪ Tag each user’s location history and predict future data center visits
  ▪ ⨍(dc, geo_history, bt_profile)
▪ A separate workflow periodically generates geo-split rules:
  ▪ Clusters users and analyzes migration patterns
  ▪ Ensures maximal look-up coverage of profiles
  ▪ Minimizes total number of profiles stored
  ▪ Ensures efficient use of resources, with minimal impact on performance
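The trade-off between look-up coverage and the number of stored profiles can be illustrated with a simple greedy rule: ship a user's profile to the fewest data centers that cover a target share of that user's past visits. This is only a sketch of the idea under stated assumptions, not the deck's ⨍(dc, geo_history, bt_profile); it uses the geo history alone, and the function name, data center names, and coverage target are hypothetical.

```python
from collections import Counter


def choose_data_centers(geo_history, coverage_target=0.9):
    """Pick the fewest data centers covering coverage_target of past visits.

    geo_history: list of data center names, one entry per past visit.
    """
    if not geo_history:
        return []
    visits = Counter(geo_history)
    total = sum(visits.values())
    chosen, covered = [], 0
    # Greedy: most-visited first; ties are broken by name for determinism.
    for dc, count in sorted(visits.items(), key=lambda kv: (-kv[1], kv[0])):
        if covered / total >= coverage_target:
            break
        chosen.append(dc)
        covered += count
    return chosen


history = ["us-west"] * 8 + ["us-east"] + ["eu"]
strict = choose_data_centers(history, coverage_target=0.9)
relaxed = choose_data_centers(history, coverage_target=0.8)
print(strict, relaxed)
```

Lowering the target stores fewer copies per user at the cost of more look-up misses when a user shows up at an unexpected data center.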
Geo-splitting
[Diagram: the Behavioral Targeting @ Rocket Fuel pipeline (Training Events, Pixel Stream, Ad Logs, Feature Generation, BT Features in HBase, Label Data, Train Model, Back Test, Calibrate, Scoring, Score Profiles, Profile Generation, Model, Ad Serving Data Centers) with an added Geo-split workflow: Cluster Users, Analyze Patterns, Generate Rules.]
Quick Recovery From Failures
▪ Break pipeline into short payloads
▪ Fail fast, recover fast!
▪ Actionable alerts, cut down noise
Quick Recovery From Failures
▪ Materialize data as frequently as possible
▪ Cross system fault tolerance
▪ Idempotency
▪ Backfill at EOD to plug holes if needed
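Idempotent stages and an end-of-day backfill fit together naturally: if re-running a stage for an already materialized partition is a no-op, the backfill can simply re-run every hour of the day. The sketch below assumes hourly partitions and uses a dictionary as a stand-in for the output store; the function names are hypothetical and the deck does not describe the actual mechanism.

```python
from datetime import datetime, timedelta


def run_hour(hour, store, compute):
    """Idempotent stage: materialize one hourly partition, skipping finished ones."""
    if hour in store:
        return False  # already materialized, so re-running is a no-op
    store[hour] = compute(hour)
    return True


def backfill_day(day, store, compute):
    """Re-run every hourly partition of `day`; return the hours actually filled."""
    hours = [day + timedelta(hours=h) for h in range(24)]
    return [h for h in hours if run_hour(h, store, compute)]


day = datetime(2014, 6, 4)
# Simulate a day in which the runs for 05:00 and 17:00 failed.
store = {
    day + timedelta(hours=h): f"profiles-{h:02d}"
    for h in range(24)
    if h not in (5, 17)
}
filled = backfill_day(day, store, lambda hour: f"profiles-{hour:%H}")
refilled = backfill_day(day, store, lambda hour: f"profiles-{hour:%H}")
print(filled, refilled)
```

Because the second backfill finds nothing to do, the job is safe to schedule unconditionally at the end of every day.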
Shout-outs!
We Are Hiring!
Questions?
Thank You!
Sivasankaran Chandrasekar
chandra@rocketfuel.com
Savin Goyal
savin@rocketfuel.com
We are hiring! (as always)
http://rocketfuel.com/careers

More Related Content

What's hot

How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesCloudera, Inc.
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingGreat Wide Open
 
Spark One Platform Webinar
Spark One Platform WebinarSpark One Platform Webinar
Spark One Platform WebinarCloudera, Inc.
 
Foundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS SummitFoundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS SummitAmazon Web Services
 
Scalable HiveServer2 as a Service
Scalable HiveServer2 as a ServiceScalable HiveServer2 as a Service
Scalable HiveServer2 as a ServiceDataWorks Summit
 
Using Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest AirlinesUsing Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest AirlinesVMware Tanzu
 
Building Effective Apache Geode Applications with Spring Data GemFire
Building Effective Apache Geode Applications with Spring Data GemFireBuilding Effective Apache Geode Applications with Spring Data GemFire
Building Effective Apache Geode Applications with Spring Data GemFireJohn Blum
 
Amazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS SummitAmazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS SummitAmazon Web Services
 
How to build leakproof stream processing pipelines with Apache Kafka and Apac...
How to build leakproof stream processing pipelines with Apache Kafka and Apac...How to build leakproof stream processing pipelines with Apache Kafka and Apac...
How to build leakproof stream processing pipelines with Apache Kafka and Apac...Cloudera, Inc.
 
The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop MapR Technologies
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoopmarkgrover
 
Challenges for running Hadoop on AWS - AdvancedAWS Meetup
Challenges for running Hadoop on AWS - AdvancedAWS MeetupChallenges for running Hadoop on AWS - AdvancedAWS Meetup
Challenges for running Hadoop on AWS - AdvancedAWS MeetupAndrei Savu
 
Best Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopBest Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopHortonworks
 
分散DB Apache Kuduのアーキテクチャ DBの性能と一貫性を両立させる仕組み 「HybridTime」とは
分散DB Apache KuduのアーキテクチャDBの性能と一貫性を両立させる仕組み「HybridTime」とは分散DB Apache KuduのアーキテクチャDBの性能と一貫性を両立させる仕組み「HybridTime」とは
分散DB Apache Kuduのアーキテクチャ DBの性能と一貫性を両立させる仕組み 「HybridTime」とはCloudera Japan
 
Distributed caching-computing v3.8
Distributed caching-computing v3.8Distributed caching-computing v3.8
Distributed caching-computing v3.8Rahul Gupta
 
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14iwrigley
 
Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018Rahul Gupta
 
Is Cloud a right Companion for Hadoop
Is Cloud a right Companion for HadoopIs Cloud a right Companion for Hadoop
Is Cloud a right Companion for HadoopDataWorks Summit
 

What's hot (20)

How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issues
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
Spark One Platform Webinar
Spark One Platform WebinarSpark One Platform Webinar
Spark One Platform Webinar
 
Foundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS SummitFoundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS Summit
 
Scalable HiveServer2 as a Service
Scalable HiveServer2 as a ServiceScalable HiveServer2 as a Service
Scalable HiveServer2 as a Service
 
Using Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest AirlinesUsing Apache Geode: Lessons Learned at Southwest Airlines
Using Apache Geode: Lessons Learned at Southwest Airlines
 
Building Effective Apache Geode Applications with Spring Data GemFire
Building Effective Apache Geode Applications with Spring Data GemFireBuilding Effective Apache Geode Applications with Spring Data GemFire
Building Effective Apache Geode Applications with Spring Data GemFire
 
Apache Hadoop 3
Apache Hadoop 3Apache Hadoop 3
Apache Hadoop 3
 
Amazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS SummitAmazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS Summit
 
How to build leakproof stream processing pipelines with Apache Kafka and Apac...
How to build leakproof stream processing pipelines with Apache Kafka and Apac...How to build leakproof stream processing pipelines with Apache Kafka and Apac...
How to build leakproof stream processing pipelines with Apache Kafka and Apac...
 
The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoop
 
Challenges for running Hadoop on AWS - AdvancedAWS Meetup
Challenges for running Hadoop on AWS - AdvancedAWS MeetupChallenges for running Hadoop on AWS - AdvancedAWS Meetup
Challenges for running Hadoop on AWS - AdvancedAWS Meetup
 
Best Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopBest Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache Hadoop
 
Hadoop and OpenStack
Hadoop and OpenStackHadoop and OpenStack
Hadoop and OpenStack
 
分散DB Apache Kuduのアーキテクチャ DBの性能と一貫性を両立させる仕組み 「HybridTime」とは
分散DB Apache KuduのアーキテクチャDBの性能と一貫性を両立させる仕組み「HybridTime」とは分散DB Apache KuduのアーキテクチャDBの性能と一貫性を両立させる仕組み「HybridTime」とは
分散DB Apache Kuduのアーキテクチャ DBの性能と一貫性を両立させる仕組み 「HybridTime」とは
 
Distributed caching-computing v3.8
Distributed caching-computing v3.8Distributed caching-computing v3.8
Distributed caching-computing v3.8
 
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
 
Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018
 
Is Cloud a right Companion for Hadoop
Is Cloud a right Companion for HadoopIs Cloud a right Companion for Hadoop
Is Cloud a right Companion for Hadoop
 

Viewers also liked

CES - C Space Storytelling Session - Programmatic TV Advertising
CES - C Space Storytelling Session - Programmatic TV AdvertisingCES - C Space Storytelling Session - Programmatic TV Advertising
CES - C Space Storytelling Session - Programmatic TV AdvertisingRocket Fuel Inc.
 
Rocket fuel cross device and ptv 12-9-15 sharedv2
Rocket fuel cross device and ptv   12-9-15 sharedv2Rocket fuel cross device and ptv   12-9-15 sharedv2
Rocket fuel cross device and ptv 12-9-15 sharedv2Rocket Fuel Inc.
 
Digital Media Buying Chart
Digital Media Buying ChartDigital Media Buying Chart
Digital Media Buying ChartRocket Fuel Inc.
 
Destination-Marketing Webinar Presentation
Destination-Marketing Webinar PresentationDestination-Marketing Webinar Presentation
Destination-Marketing Webinar PresentationRocket Fuel Inc.
 
What are the opportunities for video advertising on mobile?
What are the opportunities for video advertising on mobile?What are the opportunities for video advertising on mobile?
What are the opportunities for video advertising on mobile?Videoplaza
 
A Future for TV: The Publisher as Audience Architect
A Future for TV: The Publisher as Audience ArchitectA Future for TV: The Publisher as Audience Architect
A Future for TV: The Publisher as Audience ArchitectVideoplaza
 
Guide to Programmatic Marketing Webinar Deck
Guide to Programmatic Marketing Webinar DeckGuide to Programmatic Marketing Webinar Deck
Guide to Programmatic Marketing Webinar DeckRocket Fuel Inc.
 
Top-of-mind Insurance Webinar Presentation
Top-of-mind Insurance Webinar PresentationTop-of-mind Insurance Webinar Presentation
Top-of-mind Insurance Webinar PresentationRocket Fuel Inc.
 
Ooyala's 25 Stories of Video Success
Ooyala's 25 Stories of Video SuccessOoyala's 25 Stories of Video Success
Ooyala's 25 Stories of Video SuccessOoyala
 
Real time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackReal time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackDataWorks Summit/Hadoop Summit
 

Viewers also liked (13)

CES - C Space Storytelling Session - Programmatic TV Advertising
CES - C Space Storytelling Session - Programmatic TV AdvertisingCES - C Space Storytelling Session - Programmatic TV Advertising
CES - C Space Storytelling Session - Programmatic TV Advertising
 
Rocket fuel cross device and ptv 12-9-15 sharedv2
Rocket fuel cross device and ptv   12-9-15 sharedv2Rocket fuel cross device and ptv   12-9-15 sharedv2
Rocket fuel cross device and ptv 12-9-15 sharedv2
 
Digital Media Buying Chart
Digital Media Buying ChartDigital Media Buying Chart
Digital Media Buying Chart
 
Destination-Marketing Webinar Presentation
Destination-Marketing Webinar PresentationDestination-Marketing Webinar Presentation
Destination-Marketing Webinar Presentation
 
What are the opportunities for video advertising on mobile?
What are the opportunities for video advertising on mobile?What are the opportunities for video advertising on mobile?
What are the opportunities for video advertising on mobile?
 
A Future for TV: The Publisher as Audience Architect
A Future for TV: The Publisher as Audience ArchitectA Future for TV: The Publisher as Audience Architect
A Future for TV: The Publisher as Audience Architect
 
Traffic Quality Webinar
Traffic Quality WebinarTraffic Quality Webinar
Traffic Quality Webinar
 
OOYALA
OOYALAOOYALA
OOYALA
 
Guide to Programmatic Marketing Webinar Deck
Guide to Programmatic Marketing Webinar DeckGuide to Programmatic Marketing Webinar Deck
Guide to Programmatic Marketing Webinar Deck
 
Top-of-mind Insurance Webinar Presentation
Top-of-mind Insurance Webinar PresentationTop-of-mind Insurance Webinar Presentation
Top-of-mind Insurance Webinar Presentation
 
Ooyala's 25 Stories of Video Success
Ooyala's 25 Stories of Video SuccessOoyala's 25 Stories of Video Success
Ooyala's 25 Stories of Video Success
 
Real time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackReal time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stack
 
Big Data Application Architectures - Fraud Detection
Big Data Application Architectures - Fraud DetectionBig Data Application Architectures - Fraud Detection
Big Data Application Architectures - Fraud Detection
 

Similar to How did you know this Ad will be relevant for me?!

How did you know this ad would be relevant for me?
How did you know this ad would be relevant for me?How did you know this ad would be relevant for me?
How did you know this ad would be relevant for me?DataWorks Summit
 
HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...
HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...
HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...Cloudera, Inc.
 
AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...
AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...
AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...Amazon Web Services Korea
 
How we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBaseHow we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBaseDataWorks Summit
 
Dawn of YARN @ Rocket Fuel
Dawn of YARN @ Rocket FuelDawn of YARN @ Rocket Fuel
Dawn of YARN @ Rocket FuelDataWorks Summit
 
Drawbridge_MeetUp_June19_072414
Drawbridge_MeetUp_June19_072414Drawbridge_MeetUp_June19_072414
Drawbridge_MeetUp_June19_072414Nitin Panjwani
 
Webinar: LiveAction 4.0 single pane of glass visibility for large enterprise ...
Webinar: LiveAction 4.0 single pane of glass visibility for large enterprise ...Webinar: LiveAction 4.0 single pane of glass visibility for large enterprise ...
Webinar: LiveAction 4.0 single pane of glass visibility for large enterprise ...LiveAction IT
 
Serverless Applications at Global Scale with Multi-Regional Deployments - AWS...
Serverless Applications at Global Scale with Multi-Regional Deployments - AWS...Serverless Applications at Global Scale with Multi-Regional Deployments - AWS...
Serverless Applications at Global Scale with Multi-Regional Deployments - AWS...Amazon Web Services
 
RedisConf18 - The Versatility of Redis - Powering our critical business using...
RedisConf18 - The Versatility of Redis - Powering our critical business using...RedisConf18 - The Versatility of Redis - Powering our critical business using...
RedisConf18 - The Versatility of Redis - Powering our critical business using...Redis Labs
 
[Redis conf18] The Versatility of Redis
[Redis conf18] The Versatility of Redis[Redis conf18] The Versatility of Redis
[Redis conf18] The Versatility of RedisEiti Kimura
 
Virtual SAN: It’s a SAN, it’s Virtual, but what is it really?
Virtual SAN: It’s a SAN, it’s Virtual, but what is it really?Virtual SAN: It’s a SAN, it’s Virtual, but what is it really?
Virtual SAN: It’s a SAN, it’s Virtual, but what is it really?DataCore Software
 
Couchbase Cloud No Equal (Rick Jacobs, Couchbase) Kafka Summit 2020
Couchbase Cloud No Equal (Rick Jacobs, Couchbase) Kafka Summit 2020Couchbase Cloud No Equal (Rick Jacobs, Couchbase) Kafka Summit 2020
Couchbase Cloud No Equal (Rick Jacobs, Couchbase) Kafka Summit 2020HostedbyConfluent
 
Aditya - Hacking Client Side Insecurities - ClubHack2008
Aditya - Hacking Client Side Insecurities - ClubHack2008Aditya - Hacking Client Side Insecurities - ClubHack2008
Aditya - Hacking Client Side Insecurities - ClubHack2008ClubHack
 
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...
How to Effectively Plan for Disaster Recovery on AWS (CMP204-S) - AWS re:Inve...Amazon Web Services
 
Cloud native Microservices using Spring Boot
Cloud native Microservices using Spring BootCloud native Microservices using Spring Boot
Cloud native Microservices using Spring BootSufyaan Kazi
 
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...Amazon Web Services
 
NEW LAUNCH! Learn how Fubo is monetizing their content with server side ad in...
NEW LAUNCH! Learn how Fubo is monetizing their content with server side ad in...NEW LAUNCH! Learn how Fubo is monetizing their content with server side ad in...
NEW LAUNCH! Learn how Fubo is monetizing their content with server side ad in...Amazon Web Services
 
Pivotal CenturyLink Cloud Platform Seminar Presentations: Architecture & Oper...
Pivotal CenturyLink Cloud Platform Seminar Presentations: Architecture & Oper...Pivotal CenturyLink Cloud Platform Seminar Presentations: Architecture & Oper...
Pivotal CenturyLink Cloud Platform Seminar Presentations: Architecture & Oper...VMware Tanzu
 

Similar to How did you know this Ad will be relevant for me?! (20)

How did you know this ad would be relevant for me?
How did you know this ad would be relevant for me?How did you know this ad would be relevant for me?
How did you know this ad would be relevant for me?
 
HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...
HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...
HBaseCon 2013: Realtime User Segmentation using Apache HBase -- Architectural...
 
Hado"ops" or Had"oops"
Hado"ops" or Had"oops"Hado"ops" or Had"oops"
Hado"ops" or Had"oops"
 
AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...
AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...
AWS Media Day- AWS Media Tailor를 사용한 서버 사이드 광고 삽입으로 컨텐츠 수익화 (Mark Cousins통합 시...
 
How we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBaseHow we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBase
 
Dawn of YARN @ Rocket Fuel
Dawn of YARN @ Rocket FuelDawn of YARN @ Rocket Fuel
Dawn of YARN @ Rocket Fuel
 
Drawbridge_MeetUp_June19_072414
Drawbridge_MeetUp_June19_072414Drawbridge_MeetUp_June19_072414
Drawbridge_MeetUp_June19_072414
 
Webinar: LiveAction 4.0 single pane of glass visibility for large enterprise ...
Webinar: LiveAction 4.0 single pane of glass visibility for large enterprise ...Webinar: LiveAction 4.0 single pane of glass visibility for large enterprise ...
How did you know this Ad will be relevant for me?!

  • 1. Proprietary & Confidential. Copyright © 2014. Behavioral Targeting @ Scale - How did we know that this Ad was relevant for you? Savin Goyal, Sivasankaran Chandrasekar
  • 2. Advertiser; Rocket Fuel. 200+ RTB advertising supply partners; 50+ Mn websites; 50+ Bn daily impressions; 3B WW consumers; 100,000+ devices
  • 3. Display Advertising Ecosystem (diagram). Labels: Exchanges, Ad Exchange, Rocket Fuel Platform, Auto Optimization, Real-Time Bidding, Agencies, Data Partners
  • 4. Programmatic Buying (diagram with ten numbered steps). Labels: Web Browser, Page Request, Ad Request, Bid Request, User Data, Rocket Fuel Platform, Smart Ad Servers, Response Prediction Models, Campaign & Audience Data, Qualify Campaign, Calculate Propensity Score, Bid on Ad, Rocket Fuel Winning Ad, Ad Served to User, User Engages with Ad, User Engagement Recorded, Refresh learning, Publishers, Data Partners, Exchange Partners
  • 5. Real Time Auction (diagram). Factors: User, Brand Affinity, Time of Day, Geo/Weather, Site/Page. Three example sets of values are shown:
      ▪ 1.25, $2.11, $1.26, $2.78, $1.256, $1.809, $2.42
      ▪ 1.25, $2.11, $1.26, $2.78, $0.586, $2.009
      ▪ 1.25, $2.11, $1.26, $2.78, $1.56, $0.00
  • 6. Real Time Auction: Impression Scorecard (diagram). Three campaign goals (Leads & sales; Coupon downloads; Brand awareness), each with a scorecard over Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page, Ad Position, In-Market, Behavior, and a Response figure. Scores as shown, one scorecard per line, with marks X, X, ✓:
      ▪ +100, +40, -20, +20, +15, +10, +40, +35; Response +9.7%
      ▪ +40, -70, -20, +10, +15, -25, -40, -18; Response +0.7%
      ▪ +10, -10, -20, +20, +10, -35, -25, +10; Response +1.4%
  • 7. Scalable Predictive Models
      Features (diagram): Age/Gender, Occupation, Income, Ethnicity, Purchase Intent, Online Purchases, Offline Purchases, Browsing Behavior, Site Actions, Zip Code, City/DMA, Search Sites, Search Categories, Recency, Search Keywords, Web Site/Page, Referral URL, Site Category, Bizographics, Social Interests, Lifestyle. Grid legend: Positive Lift, Marginal Impact, Negative Lift.
      Automated Feature Selection
      ▪ Infinite number of models
      ▪ Determine perfect model size
      ▪ Balance past data fit and future generalization
      Learn-Test-Refine
      ▪ Automatically learn from each response
      ▪ Cross-validate: A/B testing infrastructure
      ▪ Training pipeline
  • 8. Throughput
  • 9. Rocket Fuel Scale
      ▪ 34,474 CPU processor cores
      ▪ 2,655 servers
      ▪ 187.4 teraflops of computing
      ▪ 188 terabytes of memory
      ▪ 13X the memory of Jeopardy-winning IBM Watson
      ▪ 42 petabytes of storage
      ▪ 106X the data volume of the entire Library of Congress
  • 10. Data Warehouse Growth (8x): 200 servers to 1400 servers; 5 PB to 41 PB
  • 11. Behavioral Targeting
  • 12. Behavioral Targeting
      ▪ Leverage online activities on the web to learn about a user's:
          ▪ Long-term interests, e.g. the user is interested in luxury cars
          ▪ Short-term interests, e.g. the user is looking for a pizza right now
      ▪ Expand the user set beyond retargeting
          ▪ Explore vs. exploit
          ▪ Identify relevant users even if they have never been targeted previously
  • 13. Behavioral Targeting @ Rocket Fuel (pipeline diagram). Labels: Pixel Stream, Ad Logs, Training Events, Label Data, Train Model, Back Test, Calibrate, Model, Feature Generation, BT Features (HBase), Profile Generation, Score Profiles, Scoring, Ad Serving Data Centers
  • 14. Hadoop/HBase @ Rocket Fuel
      Cluster Highlights
      ▪ 650+ slaves (64 GB + 12 × 3 TB)
      ▪ 20 PB storage
      ▪ HA NameNode setup
      ▪ 9k map slots + 5.5k reduce slots
      ▪ Co-located to run HBase for offline processing
      ▪ HBase 0.94.15
      ▪ 5-node ZooKeeper quorum
      ▪ Monitoring with OpenTSDB
      ▪ Dual master setup
  • 15. Behavioral Targeting @ Rocket Fuel (diagram)
      ▪ Event stream: bmw.com 11:23, Cars 11:23; pizzahut.com 11:26, Food 11:26; honda.com 11:27, Cars 11:27 (timeline 11:23, 11:26, 11:27; span labeled 30 minutes)
      ▪ Read events of last N days
      ▪ Behavioral Targeting Profile: Recency, Frequency, Others..
          ▪ honda.com 11:27. Recent 6 hours: 5; between 6 and 12 hours: 3; between 12 hours and …
          ▪ Food 11:26. Recent 6 hours: 2; between 6 and 12 hours: 7; between 12 hours and …
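The recency and frequency aggregation on this slide can be sketched as follows. This is a minimal illustration, not Rocket Fuel's implementation: the bucket edges beyond 12 hours are elided on the slide, so `BUCKET_EDGES_HOURS`, `bucket_label`, and `build_profile` are hypothetical names, and the `site:`/`category:` key format is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical bucket edges in hours. The slide shows "recent 6 hours" and
# "between 6 and 12 hours"; later buckets are elided there and collapsed here.
BUCKET_EDGES_HOURS = [6, 12]


def bucket_label(age: timedelta) -> str:
    """Map an event's age to a recency bucket label."""
    hours = age.total_seconds() / 3600
    lower = 0
    for upper in BUCKET_EDGES_HOURS:
        if hours < upper:
            return f"{lower}-{upper}h"
        lower = upper
    return f"{lower}h+"


def build_profile(events, now):
    """Aggregate (key, timestamp) events into last-seen time and bucketed counts."""
    profile = defaultdict(lambda: {"last_seen": None, "counts": defaultdict(int)})
    for key, ts in events:
        entry = profile[key]
        if entry["last_seen"] is None or ts > entry["last_seen"]:
            entry["last_seen"] = ts  # recency: most recent occurrence
        entry["counts"][bucket_label(now - ts)] += 1  # frequency per bucket
    return profile


# The three page visits from the slide, each recorded as a site and a category.
day = datetime(2014, 6, 4)
events = [
    ("site:bmw.com", day.replace(hour=11, minute=23)),
    ("category:cars", day.replace(hour=11, minute=23)),
    ("site:pizzahut.com", day.replace(hour=11, minute=26)),
    ("category:food", day.replace(hour=11, minute=26)),
    ("site:honda.com", day.replace(hour=11, minute=27)),
    ("category:cars", day.replace(hour=11, minute=27)),
]
now = day.replace(hour=11, minute=57)
profile = build_profile(events, now)
```

Keeping counts per time bucket rather than raw events is what lets a profile stay compact while still distinguishing recent from older interest.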
  • 16. HBase Data Model
      ▪ Row key: user_id (example: ABCD06EFG)
      ▪ Single column family: "u"
      ▪ Column qualifier: <date><hour>:<type>:<value> (examples: 2014060416:site:bmw.com, 2014060416:category:food)
      ▪ Cell value: [Protobuf] most recent timestamp, plus event details relative to that timestamp (examples: 11:23 with event details relative to 11:23; 11:26 with event details relative to 11:26)
      ▪ Efficient lookup for a given user
      ▪ Access a range of events by event date, hour and type
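The qualifier layout above can be illustrated with a small in-memory sketch. It models one user's row as a plain dictionary rather than calling HBase, the cell values are placeholder bytes standing in for the Protobuf payload, and `column_qualifier` and `scan_columns` are hypothetical helper names.

```python
from datetime import datetime
from typing import Optional

COLUMN_FAMILY = "u"  # single column family, as on the slide


def column_qualifier(ts: datetime, event_type: str, value: str) -> str:
    """Build '<date><hour>:<type>:<value>', e.g. '2014060416:site:bmw.com'."""
    return f"{ts:%Y%m%d%H}:{event_type}:{value}"


def scan_columns(row: dict, start_hour: str, end_hour: str,
                 event_type: Optional[str] = None) -> list:
    """Return qualifiers whose <date><hour> prefix lies in [start_hour, end_hour].

    Because the prefix is fixed-width and zero-padded, string order matches
    chronological order, which is what makes range access by date and hour cheap.
    """
    selected = []
    for qualifier in sorted(row):
        hour, qtype, _value = qualifier.split(":", 2)
        if start_hour <= hour <= end_hour and event_type in (None, qtype):
            selected.append(qualifier)
    return selected


# One user's row, keyed by user_id; values are placeholders for Protobuf bytes.
ts = datetime(2014, 6, 4, 16, 23)
table = {
    "ABCD06EFG": {
        column_qualifier(ts, "site", "bmw.com"): b"<protobuf>",
        column_qualifier(ts, "category", "food"): b"<protobuf>",
    }
}
site_events = scan_columns(table["ABCD06EFG"], "2014060416", "2014060416", "site")
```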
  • 18. Key Challenges
      ▪ User Profile Freshness
      ▪ Scaling Issues
      ▪ Pipeline Failures
  • 19. User Profile Freshness
      ▪ Strict latency requirements
      ▪ Recent activity is a much better predictor
      Solutions:
      ▪ Staggered Pipelines
      ▪ Real Time Behavioral Targeting
  • 20. Staggered Pipelines (diagram). Source Data feeds five staggered runs of the same stages: Extract, Score, Filter, Upload
  • 21. Real Time Behavioral Targeting
  • 22. Real Time Behavioral Targeting (diagram)
      ▪ Batched Profile: refreshed every N hours
      ▪ Online Profile: record events for users in real time
      ▪ Blackbird: HBase instance tuned for 2 ms latencies
      ▪ Labels: Logs, Offline BT Pipeline, BT Profile, Blackbird, Online Profile, Merge Profiles, Ad Servers, Request, Response
  • 23. Batched Updates vs. Real Time Updates
      ▪ Event granularity. Batched Profile: aggregated over several hours/days. Real Time Profile: raw recorded events appended for the recent N hours.
      ▪ Processing load. Batched Profile: requires minimal CPU processing. Real Time Profile: needs aggregation on the fly.
      ▪ Disk footprint. Batched Profile: compact representation captures several days. Real Time Profile: strict limits to ensure read times are acceptable.
      ▪ Coverage. Batched Profile: all interactions. Real Time Profile: only interactions at a data center.
      ▪ Real Time Profile updated in milliseconds
      ▪ Batched Profile refreshed every N hours
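A minimal sketch of merging the two profiles at request time, under stated assumptions: the batched profile is a mapping of pre-aggregated counts, the real-time profile is a list of raw `(key, timestamp)` events, and events at or before the batch cutoff are assumed to be already reflected in the batched counts. The function names and the integer timestamps are hypothetical; the slides do not describe the actual merge logic.

```python
from collections import Counter


def aggregate_recent(raw_events):
    """Aggregate raw real-time events on the fly into per-key counts."""
    return Counter(key for key, _timestamp in raw_events)


def merge_profiles(batched_counts, raw_events, batch_cutoff):
    """Combine the batched profile with events recorded after its cutoff.

    Events at or before batch_cutoff are skipped so that interactions already
    captured by the last batch refresh are not counted twice.
    """
    fresh = [(key, ts) for key, ts in raw_events if ts > batch_cutoff]
    merged = Counter(batched_counts)
    merged.update(aggregate_recent(fresh))
    return dict(merged)


# Batched profile refreshed at t=100; three raw events, one of them older.
merged = merge_profiles(
    {"category:cars": 5},
    [("category:cars", 90), ("category:cars", 120), ("category:food", 130)],
    batch_cutoff=100,
)
```

The aggregation cost is paid on every read, which is one reason the raw event list has to stay short.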
  • 24. Scaling Issues
      ▪ 3X growth in events processed per year
          ▪ First-party data
          ▪ App interactions
          ▪ Geo-location data
          ▪ …
      ▪ Case studies
          ▪ HBase region hot-spotting
          ▪ Network bandwidth troubles
  • 25. HBase Region Hot Spotting
  • 26. HBase Region Hot-spotting (diagram)
      ▪ High write load on an HBase region
      ▪ Region split (painful!)
      ▪ Still problematic
      ▪ Some users are more active than others
      ▪ No control over the user IDs generated
      ▪ Non-uniform distribution!
  • 27. HBase Region Hot-spotting
      ▪ Uneven write-load distribution
      ▪ Non-uniform row key distribution
      ▪ Salt row keys to ensure uniform distribution
          ▪ Fixed-length hashed prefix
          ▪ Key layout: [Murmur-hash-based prefix][original user ID]
      ▪ Uniform pre-splits
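A hedged sketch of the salting and pre-splitting idea. The slide names a Murmur hash; Python's standard library has none, so SHA-256 from `hashlib` is swapped in here purely as a stand-in. The 4-byte prefix length and the helper names `salted_row_key` and `uniform_split_points` are assumptions for illustration.

```python
import hashlib

PREFIX_BYTES = 4  # assumed fixed prefix length


def salted_row_key(user_id: str) -> bytes:
    """Prefix the user ID with a fixed-length hash of itself.

    The prefix is derived from the ID, so the same user always maps to the
    same row key and point lookups still work without a scan.
    """
    raw = user_id.encode("utf-8")
    digest = hashlib.sha256(raw).digest()  # stand-in for the Murmur hash
    return digest[:PREFIX_BYTES] + raw


def uniform_split_points(num_regions: int) -> list:
    """Evenly spaced split keys over the prefix space, for pre-splitting."""
    space = 256 ** PREFIX_BYTES
    return [
        (i * space // num_regions).to_bytes(PREFIX_BYTES, "big")
        for i in range(1, num_regions)
    ]


key = salted_row_key("1234557")
splits = uniform_split_points(4)
```

Because hashed prefixes are close to uniformly distributed, evenly spaced split points should give each region a similar share of users regardless of how the original IDs were generated.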
  • 28. HBase Region Hot-spotting
      ▪ Don't stop at salting
      ▪ Map input splits configured for region boundaries
      Diagram: a Key Partitioner turns 'k' splits of user IDs into 'm' splits for 'm' regions.
      ▪ Input user IDs: 1234557, 1234568, 1234579, 1234583, 1234594, .., ZZAHT654, ZZZGT934, ZZZZNGA2, ZZZZKLO1
      ▪ Partitioned keys: x01x85x1ExB811ZKL1, x01x86x1ExB8129542, .., x03x85x1ExB8ZZZKL1; x05x35x9Ex18087KL1, x06x86x1ExB8AHV24, .., x07x5CxF5xC16534Z; xEBx27x92x1508RKL1, xFEx86x1ExB8AHV24, .., xFFxAEx14x126534Z
      ▪ Region boundaries: Region 1 x03x85x1ExB8ZZZZZZ; Region 2 x07x5CxF5xC2928ZZ; Region m xFFxAEx14xE1Z28ZZ
  • 29. HBase Key Partitioner
      ▪ As many splits as regions to maximize parallelism
      ▪ Key Partitioner (MR):
          ▪ Reads region boundaries of the HBase table
          ▪ Salts and sorts row keys accordingly
          ▪ Multiple Output Format to optimize the reduce phase
          ▪ Each generated split file corresponds to a single region
      ▪ Drastically reduces read latencies
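The partitioning step can be sketched in memory as below. This is not the MapReduce job from the talk: it only shows how already-salted row keys could be grouped into one sorted split per region given the regions' start keys. The function names are hypothetical, and the first region is assumed to have an empty start key, as in HBase.

```python
import bisect
from collections import defaultdict


def region_index(row_key: bytes, region_start_keys: list) -> int:
    """Index of the region whose [start, next_start) range holds row_key.

    region_start_keys must be sorted, with b"" as the first region's start.
    """
    return bisect.bisect_right(region_start_keys, row_key) - 1


def partition_by_region(row_keys, region_start_keys):
    """Group salted row keys into one sorted split per region."""
    splits = defaultdict(list)
    for key in row_keys:
        splits[region_index(key, region_start_keys)].append(key)
    return {region: sorted(keys) for region, keys in sorted(splits.items())}


# Four regions with made-up one-byte boundaries, and a few salted keys.
starts = [b"", b"\x40", b"\x80", b"\xc0"]
keys = [b"\x05a", b"\x41b", b"\xffc", b"\x01d"]
by_region = partition_by_region(keys, starts)
```

With one split per region, each task reads from a single region, so no task fans out across the whole table.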
  • 30. Network Bandwidth Troubles
  • 31. Data Center Expansion
  • 32. Network Bandwidth Constraints
      ▪ Consistently overshot the bandwidth limit during uploads
      ▪ All sorts of delays (Redis, MySQL, Blackbird…)
      ▪ Bidding hampered
  • 33. Solutions
      ▪ Intelligent storage: protobufs everywhere
      ▪ Throttle writes
      ▪ Geo-splitting
  • 34. Geo Splitting
  • 35. Geo-splitting
      ▪ Tag a user's location history and predict future data center visits
          ▪ f(dc, geo_history, bt_profile)
      ▪ A separate workflow periodically generates geo-split rules:
          ▪ Clusters users and analyzes migration patterns
          ▪ Ensures maximal lookup coverage of profiles
          ▪ Minimizes the total number of profiles stored
      ▪ Ensures efficient use of resources, with minimal impact on performance
  • 36. Geo-splitting (pipeline diagram). Labels: Pixel Stream, Ad Logs, Training Events, Label Data, Train Model, Back Test, Calibrate, Model, Feature Generation, BT Features (HBase), Profile Generation, Score Profiles, Scoring, Cluster Users, Analyze Patterns, Generate Rules, Geo-split, Ad Serving Data Centers
  • 38. Quick Recovery From Failures
      ▪ Break the pipeline into short payloads
      ▪ Fail fast, recover fast!
      ▪ Actionable alerts, cut down noise
  • 39. Quick Recovery From Failures
      ▪ Materialize data as frequently as possible
      ▪ Cross-system fault tolerance
      ▪ Idempotency
      ▪ Backfill at EOD to plug holes if needed
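As one illustration of why idempotency matters for retries and end-of-day backfills, the sketch below applies a batch of profile writes at most once per batch ID. The in-memory `store` and `applied` set, the `upload_batch` name, and the hour-stamped batch ID are all assumptions made for the example; the slides do not say how idempotency was achieved.

```python
def upload_batch(store: dict, applied: set, batch_id: str, profiles: dict) -> bool:
    """Apply a batch of profile writes at most once per batch_id.

    Returns True if the batch was applied, False if it was already applied.
    """
    if batch_id in applied:
        return False  # retry or backfill of an applied batch is a no-op
    store.update(profiles)  # overwrite by user ID, so re-running cannot duplicate
    applied.add(batch_id)
    return True


store, applied = {}, set()
first = upload_batch(store, applied, "2014060416", {"ABCD06EFG": 0.7})
retry = upload_batch(store, applied, "2014060416", {"ABCD06EFG": 0.7})
```

A failed stage can then simply be re-run, and a backfill can replay every batch of the day without checking which ones succeeded.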
  • 40. Shout-outs!
  • 44. We Are Hiring!
  • 46. Questions? Thank you! Sivasankaran Chandrasekar (chandra@rocketfuel.com), Savin Goyal (savin@rocketfuel.com)
  • 47. We are hiring! (as always) http://rocketfuel.com/careers; savin@rocketfuel.com; chandra@rocketfuel.com