Isaac Mosquera, Socialize CTO SplunkLive! presentation
1. Using Splunk To Evaluate 20 Billion Ad Impressions Monthly
Isaac Mosquera, CTO
@imosquera • isaac.mosquera@getsocialize.com
2. A Little Bit About Real-Time Bidding
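[Diagram: an Ad Request reaches the RTB service, which sends a Bid Request to the Socialize Bidder and receives a Bid Response; the Winning Bidder's Ad is returned, followed by the Ad Impression and Ad Click.]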
All this needs to happen in less than 100 milliseconds!
3. So what are some of our problems?
Operational
● Evaluating more than 10,000 bid requests per second
● Which bids are > 100ms
● Quickly finding any errors within the system
● Problems tracking clicks and impressions mean lost
revenue.
Decision Making & Bid Algorithms
● Merging RTB data with our Social data
● Campaign spending
● Campaign efficiency
● Dissect data by:
○ apps
○ users
○ devices
4. Analyzing Big Data Efficiently
1. Collection
2. Storage
3. Analysis/Aggregation
4. Retrieval
5. Some Options
● RDBMS: SQL aggregate functions like count() present
problems at scale
● RDBMS: Write volume is too high for a single DB,
which would also be a single point of failure.
● NoSQL: Would work well for high insert and query
volume; however, we would lose the simple
alerting, charting and reporting dashboards.
● Hadoop: Simple querying using Hive; however, it's
a new environment to manage... and again we lose
alerting, charting and reporting.
6. Splunk Fits the Bill
● Operational Reporting: Easily identify problems
and prevent erroneous spending. When an alert
goes off we hit a script which shuts off the bidder.
● AdHoc Queries: Allows us to find patterns in the
data to improve our bid algorithms
● Application Reporting: Instantly know campaign
metrics for us and our clients.
"This has got to be the most thorough mobile campaign report I've
ever received, so major props to all of you." - Hipmunk Marketing
● Scalability: Adding new RTB Service providers
means billions of new ad requests. Scaling
horizontally is key.
7. Data Collection
● Although Splunk works great with unstructured data, we
need some structure to make querying easy.
● Created a small client to push events to Splunk indexers
● Very simple: accepts only 2 fields, an event name and
metadata (a dictionary)
● Events are application data like bid requests, clicks,
impressions, and application installs
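The client itself is not shown in the deck. A minimal sketch of what such a two-field client could look like follows, assuming events are serialized as one JSON object per line and sent over TCP to a listener on the indexer. The function names, the JSON layout, the host and the port are all hypothetical.

```python
import json
import socket
import time


def format_event(name, metadata, timestamp=None):
    """Serialize one event (name + metadata dictionary) as a single JSON line."""
    if not isinstance(metadata, dict):
        raise TypeError("metadata must be a dictionary")
    record = {
        "time": time.time() if timestamp is None else timestamp,
        "event": name,
        "metadata": metadata,
    }
    # One line per event keeps event boundaries unambiguous for the indexer.
    return json.dumps(record, sort_keys=True) + "\n"


def send_event(name, metadata, host="localhost", port=9999):
    """Send one event to a TCP listener (host and port are placeholders)."""
    line = format_event(name, metadata)
    with socket.create_connection((host, port), timeout=1.0) as conn:
        conn.sendall(line.encode("utf-8"))
```

A call such as `send_event("click", {"campaign_id": 7})` would then emit one structured line per application event, which keeps later field extraction simple.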
9. Storage
● Performance and redundancy using new Provisioned IOPS
for high I/O
● Nightly snapshots to S3
● Logs are gzipped by Splunk before being snapshotted for
70% compression gains.
● Continuously indexed by Splunk so reports can even
be done in real-time
[Diagram: Socialize Bidder → two Splunk Indexers, each on EBS → S3 Backups]
10. Using Splunk to Analyze Operational Data
Allows you to write MapReduce-style jobs in a SQL-like
query language:
source="nginx-prod.log"
| stats avg(ResponseTime) as avg_rtime,
p95(ResponseTime) as p95_rtime,
stdev(ResponseTime) as stdev_rtime
Easily digest information through charts
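To make the three statistics concrete, here is a small Python sketch of the same summary over a list of response times in milliseconds. It assumes a nearest-rank 95th percentile and the sample standard deviation; Splunk's own percentile calculation may differ slightly. The `summarize` name and the `over_budget` count against the 100 ms limit are illustrative additions.

```python
import math
import statistics


def summarize(response_times_ms, budget_ms=100):
    """Return avg, p95 and stdev of response times, plus a count over budget.

    Needs at least two samples, because the sample standard deviation
    is undefined for a single value.
    """
    ordered = sorted(response_times_ms)
    # Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "avg_rtime": statistics.mean(ordered),
        "p95_rtime": ordered[rank - 1],
        "stdev_rtime": statistics.stdev(ordered),
        "over_budget": sum(1 for t in ordered if t > budget_ms),
    }
```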
11. Analysis/Aggregation
index=ad_events displayed_ad
| spath
| bin _time span=1m
| stats count(displayed_ad) as displays
sum(eval(price/1000)) as dollars_spent
avg(price) as avg_cpm_price
by campaign_id _time
| mysqloutput spec=ads-prod table=ads_analytics
insert="campaign_id, stat_date, displays, dollars_spent, avg_cpm_price"
[Diagram: three Splunk Indexers → Search Head → RDBMS (Generated Reports)]
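The arithmetic in the query above can be sketched in plain Python. This assumes `price` is a CPM (cost per thousand impressions) in dollars, so one displayed ad costs `price / 1000`, and that `stat_date` holds the start of the one-minute bucket. The `aggregate` function and the event layout are illustrative, not the deck's actual code.

```python
from collections import defaultdict


def aggregate(events):
    """Group displayed-ad events by campaign and minute, then summarize spend.

    Each event is a dict with campaign_id, _time (epoch seconds) and
    price (assumed to be a CPM in dollars).
    """
    buckets = defaultdict(list)
    for event in events:
        minute = int(event["_time"]) // 60 * 60  # floor to the start of the minute
        buckets[(event["campaign_id"], minute)].append(event["price"])

    rows = []
    for (campaign_id, minute), prices in sorted(buckets.items()):
        rows.append({
            "campaign_id": campaign_id,
            "stat_date": minute,
            "displays": len(prices),
            "dollars_spent": sum(p / 1000 for p in prices),  # CPM -> cost per impression
            "avg_cpm_price": sum(prices) / len(prices),
        })
    return rows
```

Each returned row has the same five columns that the query writes to the `ads_analytics` table.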
12. Retrieval
● MySQL and Memcache allow for super fast retrieval of
aggregated reports
● Use aggregated information to make smarter bids
[Diagram: Socialize Bidder → Cache Cluster (three Memcache nodes) → RDBMS]
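One common way to combine a cache cluster with a database is the cache-aside pattern; the deck does not say which pattern the bidder uses, so the following is only a sketch under that assumption. A plain dictionary stands in for the Memcache cluster, a caller-supplied function stands in for the MySQL query, and the `ReportCache` name and key format are hypothetical. Expiry is omitted for brevity.

```python
class ReportCache:
    """Cache-aside lookup: try the cache first, fall back to the database."""

    def __init__(self, load_from_db, cache=None):
        self._load = load_from_db                        # stand-in for a MySQL query
        self._cache = {} if cache is None else cache     # stand-in for Memcache

    def get(self, campaign_id):
        key = "report:%s" % campaign_id
        report = self._cache.get(key)
        if report is None:
            # Cache miss: read the aggregated report from the database
            # and store it so later bids skip the database entirely.
            report = self._load(campaign_id)
            if report is not None:
                self._cache[key] = report
        return report
```

With this shape, only the first lookup for a campaign pays the database round trip, which matters when every bid has to fit inside the 100 ms budget.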