Isaac Mosquera, Socialize CTO SplunkLive! presentation
Using Splunk to Evaluate 20 Billion Ad Impressions Monthly
Isaac Mosquera, CTO @imosquera • email@example.com
A Little Bit About Real-Time Bidding
[Diagram labels: Ad Request, Bid Request, Bid Response, Winning Bidder's Ad, RTB, Socialize Bidder, Ad Impression, Ad Click]
All this needs to happen in less than 100 milliseconds!
So what are some of our problems?
Operational
● Evaluating more than 10,000 bid requests per second
● Identifying which bids take longer than 100 ms
● Quickly finding any errors within the system
● Problems tracking clicks and impressions mean lost revenue
Decision Making & Bid Algorithms
● Merging RTB data with our social data
● Campaign spending
● Campaign efficiency
● Dissecting data by:
  ○ apps
  ○ users
  ○ devices
Analyzing Big Data Efficiently
1. Collection
2. Storage
3. Analysis/Aggregation
4. Retrieval
Some Options
● RDBMS: SQL aggregate functions like count() present problems at scale.
● RDBMS: The write volume is too high for a single DB, which is also a single point of failure.
● NoSQL: Would work well for high insert and query volumes; however, we would lose the simple alerting, charting and reporting dashboards.
● Hadoop: Simple querying using Hive; however, it's a new environment to manage, and again we would lose alerting, charting and reporting.
Splunk Fits the Bill
● Operational Reporting: Easily identify problems and prevent erroneous spending. When an alert goes off, it triggers a script that shuts off the bidder.
● Ad Hoc Queries: Allow us to find patterns in the data to improve our bid algorithms.
● Application Reporting: Instantly know campaign metrics for us and our clients. "This has got to be the most thorough mobile campaign report I've ever received, so major props to all of you." - Hipmunk Marketing
● Scalability: Adding new RTB service providers means billions of new ad requests. Scaling horizontally is key.
Data Collection
● Although Splunk works great with unstructured data, we need some structure to make querying easy.
● Created a small client to push events to Splunk indexers:
● Very simple; accepts only 2 fields: event name and metadata (dictionary)
● Events are application data like bid requests, clicks, impressions, and application installs
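A client like the one described could look roughly like the sketch below. This is not Socialize's actual client: the function names `build_event` and `send_event`, the JSON-lines wire format, the `timestamp` key, and the host and port are all assumptions made for illustration. It only assumes that a Splunk indexer is listening on a raw TCP input and that one JSON object per line is enough structure to query later.

```python
import json
import socket
import time


def build_event(name, metadata, now=None):
    """Serialize one event as a single JSON line.

    Only two fields are accepted, as on the slide: an event name and a
    metadata dictionary. The timestamp key is an assumption of this sketch.
    """
    if not isinstance(metadata, dict):
        raise TypeError("metadata must be a dictionary")
    record = {
        "timestamp": now if now is not None else time.time(),
        "event": name,
        "metadata": metadata,
    }
    return json.dumps(record, sort_keys=True) + "\n"


def send_event(name, metadata, host="localhost", port=5140):
    """Push one event to an indexer over TCP.

    The host and port are hypothetical; they must match a TCP input
    configured on the receiving side.
    """
    payload = build_event(name, metadata).encode("utf-8")
    with socket.create_connection((host, port), timeout=1.0) as conn:
        conn.sendall(payload)
```

A call such as `send_event("click", {"campaign_id": 7, "device": "ios"})` would then emit one structured event per line. Opening a connection per event keeps the sketch short; a real high-volume client would more likely reuse connections or batch events.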
Storage
● Performance and redundancy using new Provisioned IOPS for high I/O
● Nightly snapshots to S3
● Logs are gzipped by Splunk before being snapshotted, for 70% compression gains.
● Continuously indexed by Splunk so reports can even be done in real time
[Diagram labels: Socialize Bidder, Splunk Indexer (x2), EBS (x2), S3 Backups]
Using Splunk to Analyze Operational Data
Allows you to write MapReduce jobs with a SQL-style query language:
    source="nginx-prod.log" | stats avg(ResponseTime) as avg_rtime, p95(ResponseTime) as p95_rtime, stdev(ResponseTime) as stdev_rtime
Easily digest information through charts
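To make the three statistics in the query concrete, here is a small Python sketch of what they compute over a list of response times. The function name `response_time_stats` is hypothetical, and the nearest-rank percentile used here is an assumption; Splunk's own percentile calculation may differ, especially on large data sets where it approximates.

```python
import statistics


def response_time_stats(times):
    """Return average, 95th percentile and sample standard deviation.

    Requires at least two values, because the sample standard deviation
    is undefined for a single observation.
    """
    ordered = sorted(times)
    # Nearest-rank percentile: the smallest value with at least 95% of
    # observations at or below it. Integer math avoids float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "avg_rtime": statistics.mean(ordered),
        "p95_rtime": ordered[rank - 1],
        "stdev_rtime": statistics.stdev(ordered),
    }
```

The p95 value is usually the more useful one against a hard 100 ms deadline, since an average can look healthy while a meaningful share of requests still miss the cutoff.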
Analysis/Aggregation
    index=ad_events displayed_ad
    | spath
    | bin _time span=1m
    | stats count(displayed_ad) as displays sum(eval(price/1000)) as dollars_spent avg(price) as avg_cpm_price by campaign_id _time
    | mysqloutput spec=ads-prod table=ads_analytics insert="campaign_id, stat_date, displays, dollars_spent, avg_cpm_price"
[Diagram labels: Splunk Indexer (x3), Search Head, RDBMS (Generated Reports)]
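In plain Python, the per-minute aggregation in the query above amounts to roughly the following. This is an illustrative sketch only: the function name `aggregate_displays` and the representation of events as dictionaries are assumptions, while the field names (`_time`, `campaign_id`, `price`) and output columns are taken from the query. Prices are treated as CPM (cost per thousand impressions), so one impression costs `price / 1000`.

```python
from collections import defaultdict


def aggregate_displays(events, span_seconds=60):
    """Group displayed-ad events by campaign and time bucket.

    Each event is a dict with "_time" (epoch seconds), "campaign_id"
    and "price" (CPM). Returns one row per (campaign, bucket).
    """
    buckets = defaultdict(list)
    for event in events:
        # Round the timestamp down to the start of its bucket, which is
        # what "bin _time span=1m" does for one-minute spans.
        bucket_start = int(event["_time"]) // span_seconds * span_seconds
        buckets[(event["campaign_id"], bucket_start)].append(event["price"])

    rows = []
    for (campaign_id, bucket_start), prices in sorted(buckets.items()):
        rows.append({
            "campaign_id": campaign_id,
            "stat_date": bucket_start,
            "displays": len(prices),
            "dollars_spent": sum(prices) / 1000,
            "avg_cpm_price": sum(prices) / len(prices),
        })
    return rows
```

Each returned row matches one row that would be inserted into the analytics table, which is what keeps later report retrieval cheap: the expensive scan over raw events happens once per minute rather than once per report request.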
Retrieval
● MySQL and Memcache allow for super fast retrieval of aggregated reports
● Use aggregated information to make smarter bids
[Diagram labels: Socialize Bidder, Cache Cluster (Memcache x3), RDBMS]
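One common way to combine a cache with a database for this kind of lookup is the cache-aside pattern: check the cache first, fall back to the database on a miss, then store the result with an expiry. The slides do not say which pattern Socialize used, so the sketch below is an assumption. The class name `ReportCache` is hypothetical, an in-memory dictionary stands in for Memcache, and a plain callable stands in for the MySQL query.

```python
import time


class ReportCache:
    """Cache-aside lookup of aggregated campaign reports."""

    def __init__(self, load_from_db, ttl_seconds=60, clock=time.monotonic):
        self._load_from_db = load_from_db  # stand-in for a MySQL query
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # stand-in for Memcache: key -> (expires_at, value)

    def get(self, campaign_id):
        key = f"report:{campaign_id}"
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        # Miss or expired: read from the database and repopulate the cache.
        value = self._load_from_db(campaign_id)
        self._store[key] = (now + self._ttl, value)
        return value
```

A short expiry fits data that is re-aggregated every minute: the bidder reads slightly stale numbers at cache speed instead of querying the database on every bid.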
Final Architecture
[Diagram labels: Socialize Bidder, Splunk (Indexer x3, Search Head), Cache Cluster (Memcache x3), RDBMS (Generated Reports), S3 Snapshots]