Isaac Mosquera, Socialize CTO SplunkLive! presentation

Using Splunk To Evaluate 20 Billion Ad Impressions Monthly
Isaac Mosquera, CTO
@imosquera • isaac.mosquera@getsocialize.com

A Little Bit About Real-Time Bidding

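[Diagram: an ad request arrives at the RTB service, which sends a bid request to the Socialize bidder and receives a bid response; the winning bidder's ad is returned, followed by the ad impression and the ad click.]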

All this needs to happen in less than 100 milliseconds!

So what are some of our problems?
Operational
● Evaluating more than 10,000 bid requests per second
● Identifying which bids take longer than 100 ms
● Quickly finding any errors within the system
● Problems tracking clicks and impressions mean lost
revenue.

Decision Making & Bid Algorithms
● Merging RTB data with our Social data
● Campaign spending
● Campaign efficiency
● Dissect data by:
○ apps
○ users
○ devices

Analyzing Big Data Efficiently

1. Collection
2. Storage
3. Analysis/Aggregation
4. Retrieval

Some Options
● RDBMS: SQL aggregate functions like count() present
problems at scale

● RDBMS: Write volume is too high for a single DB,
which would also be a single point of failure.

● NoSQL: Would work well for high insert and query
volumes, but we would lose the simple
alerting, charting, and reporting dashboards.

● Hadoop: Simple querying using Hive, but it's
a new environment to manage, and again we would lose
alerting, charting, and reporting.

Splunk Fits the Bill
● Operational Reporting: Easily identify problems
and prevent erroneous spending. When an alert
goes off we hit a script which shuts off the bidder.

● Ad Hoc Queries: Allow us to find patterns in the
data to improve our bid algorithms.

● Application Reporting: Instantly know campaign
metrics for us and our clients.
"This has got to be the most thorough mobile campaign report I've
ever received, so major props to all of you." - Hipmunk Marketing

● Scalability: Adding new RTB Service providers
means billions of new ad requests. Scaling
horizontally is key.

Data Collection
● Although Splunk works great with unstructured data, we
need some structure to make querying easy.

● Created a small client to push events to Splunk indexers.

● Very simple: it accepts only 2 fields, an event name and
metadata (a dictionary)

● Events are application data like bid requests, clicks,
impressions, and application installs
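
The client described above might look roughly like the following sketch. It is not the actual Socialize client: the class name EventClient, the helper format_event, the JSON-lines wire format, and the host and port are all assumptions made for illustration. It also assumes a raw TCP data input has been configured on the indexer.

```python
import json
import socket
import time


def format_event(name, metadata, now=None):
    """Serialize one event as a single JSON line.

    Metadata keys are flattened into the top level so they can be
    extracted as fields at search time (assumed format, not Socialize's).
    """
    record = dict(metadata)
    record["event"] = name
    record["timestamp"] = time.time() if now is None else now
    return json.dumps(record, sort_keys=True) + "\n"


class EventClient:
    """Hypothetical client that pushes events to an indexer over TCP."""

    def __init__(self, host, port, timeout=1.0):
        self.address = (host, port)
        self.timeout = timeout

    def send(self, name, metadata):
        line = format_event(name, metadata)
        # One short-lived connection per event keeps the sketch simple;
        # a production client would likely reuse connections and batch.
        with socket.create_connection(self.address, timeout=self.timeout) as conn:
            conn.sendall(line.encode("utf-8"))
        return line


# Example usage (host and port are placeholders):
# client = EventClient("splunk-indexer.example.internal", 9998)
# client.send("displayed_ad", {"campaign_id": 42, "price": 1.25})
```

Keeping the interface to an event name plus a dictionary means every application event shares one shape, which is what makes the later queries simple.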

Storage
● Performance and redundancy using new Provisioned IOPS
for high I/O

● Nightly snapshots to S3

● Logs are gzipped by Splunk before being snapshotted for
70% compression gains.

● Continuously indexed by Splunk so reports can even
be done in real-time

[Diagram: the Socialize Bidder sends events to two Splunk Indexers, each backed by an EBS volume, with S3 backups.]

Using Splunk to Analyze Operational Data
Splunk lets you write MapReduce-style jobs in a SQL-like
query language:
source="nginx-prod.log"
| stats avg(ResponseTime) as avg_rtime,
        p95(ResponseTime) as p95_rtime,
        stdev(ResponseTime) as stdev_rtime

Easily digest information through charts
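
For readers less familiar with Splunk, the response-time query above computes three summary statistics over a field. A rough Python equivalent is sketched below. The function name summarize is hypothetical, and the nearest-rank percentile used here is an assumption that may differ slightly from how Splunk computes p95.

```python
import math
import statistics


def summarize(response_times):
    """Return the average, 95th percentile, and sample standard deviation.

    Requires at least two values, because the sample standard
    deviation is undefined for a single data point.
    """
    ordered = sorted(response_times)
    # Nearest-rank percentile: the smallest value with at least 95%
    # of the observations at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "avg_rtime": statistics.fmean(ordered),
        "p95_rtime": ordered[rank - 1],
        "stdev_rtime": statistics.stdev(ordered),
    }
```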

Analysis/Aggregation
index=ad_events displayed_ad
| spath
| bin _time span=1m
| stats count(displayed_ad) as displays
sum(price/1000) as dollars_spent
avg(price) as avg_cpm_price
by campaign_id _time
| mysqloutput spec=ads-prod table=ads_analytics
insert="campaign_id, stat_date, displays, dollars_spent, avg_cpm_price"
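
To make the aggregation above concrete, here is a small Python sketch of the same roll-up: events are bucketed per minute and per campaign, then counted and summed. The field names come from the query; the event layout (a dict with _time in epoch seconds and price as a CPM in dollars) and the function name aggregate are assumptions for illustration.

```python
from collections import defaultdict


def aggregate(events):
    """Roll displayed_ad events up into one row per campaign per minute."""
    buckets = defaultdict(lambda: {"displays": 0, "price_total": 0.0})
    for event in events:
        # Equivalent of `bin _time span=1m`: floor to the start of the minute.
        minute = int(event["_time"]) // 60 * 60
        bucket = buckets[(event["campaign_id"], minute)]
        bucket["displays"] += 1
        bucket["price_total"] += event["price"]

    rows = []
    for (campaign_id, minute), bucket in sorted(buckets.items()):
        rows.append({
            "campaign_id": campaign_id,
            "stat_date": minute,
            "displays": bucket["displays"],
            # price is a cost per thousand impressions, so one
            # impression costs price / 1000.
            "dollars_spent": bucket["price_total"] / 1000,
            "avg_cpm_price": bucket["price_total"] / bucket["displays"],
        })
    return rows
```

Each returned row corresponds to one record written to the ads_analytics table in the query above.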

[Diagram: three Splunk indexers feed a search head, which writes generated reports to the RDBMS.]

Retrieval
● MySQL and Memcache allow for super fast retrieval of
aggregated reports

● Use aggregated information to make smarter bids

[Diagram: the Socialize Bidder reads from a cache cluster of three Memcache nodes, which sits in front of the RDBMS.]
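
One common way to combine a cache with a database for this kind of lookup is the cache-aside pattern, sketched below. The slides do not say which pattern Socialize uses, so this is an assumption. To keep the example self-contained, a plain dict stands in for Memcache and sqlite3 stands in for MySQL; the class name ReportStore and the cache key format are hypothetical, while the ads_analytics table name comes from the aggregation query.

```python
import sqlite3


class ReportStore:
    """Cache-aside lookup of aggregated campaign reports."""

    def __init__(self, db, cache):
        self.db = db        # sqlite3 connection standing in for MySQL
        self.cache = cache  # dict standing in for a Memcache client

    def campaign_report(self, campaign_id):
        key = f"report:{campaign_id}"
        if key in self.cache:
            # Cache hit: no database round trip on the bidding path.
            return self.cache[key]
        row = self.db.execute(
            "SELECT SUM(displays), SUM(dollars_spent) "
            "FROM ads_analytics WHERE campaign_id = ?",
            (campaign_id,),
        ).fetchone()
        # SUM returns NULL when no rows match, so fall back to zero.
        report = {"displays": row[0] or 0, "dollars_spent": row[1] or 0.0}
        self.cache[key] = report
        return report
```

A real deployment would also set an expiry on each cached entry so that reports pick up newly aggregated rows; that is omitted here for brevity.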

Final Architecture
[Diagram: the Socialize Bidder sends events to three Splunk indexers, which are snapshotted to S3; a search head writes generated reports to the RDBMS; the bidder reads those reports through a cache cluster of three Memcache nodes.]

Thank you!
isaac.mosquera@getsocialize.com | @imosquera
