SplunkLive! New York Dec 2012 - SNAP Interactive
 

  • SNAP is located down the street. Our flagship product is areyouinterested.com, one of the largest social discovery sites on the web, with more than 5M MAU. We are a publicly traded company, and most of our revenue comes from subscriptions.
  • I lead the architecture team. We build the core infrastructure of the site, explore new technologies, and work closely with ops. This includes delivering analytics platforms for product and monitoring/debugging tools for developers.
  • We use Splunk for a lot of stuff; today I only have time to highlight a few of the more interesting things. Specifically, I’ll explain what we give Splunk, and how; the monitoring capabilities that it gives us; and, most interesting, the analytics we can now perform.
  • So what do we index? Most importantly, our custom application log, which contains structured... data. We also profile parts of our application and log that. And of course we index our error_logs.
  • When writing logs from our application, we use a centralized logging class that gives us some simple but highly beneficial functionality.
  • Anytime the structured data contains a uid, we integrate key information about that user. We considered lookup tables, but this is much more performant.
  • This information gives us tremendous power. We can...
  • As I mentioned, we also log performance metrics. We simply wrap parts of our code with timers and log the time spent working.
  • So that’s pretty much what we give Splunk. So what does it give us?
  • With regard to monitoring, we are a continuous deployment shop, deploying site changes 15-50 times a day.
  • Our goal with monitoring is to know immediately if we break something. For performance, we use realtime background searches that each power multiple dashboard views.
  • We also have a frontend error monitoring dashboard.
  • We also monitor email and performance very closely.
  • In an attempt to maximize engagement, we recently started scheduling all of our email sends. Now, in addition to monitoring sends, we need to monitor the health of the schedule queue.
  • When a server gets hot, one of the first things we try to do is correlate
  • We make data-driven product decisions. The product team does lots of analysis, and we use summary indexing for performance. We recently upgraded to 5.0: smooth, with 6 min of downtime. We will continue to use summary indexing.
  • These mysterious unlabeled lines are an example of how we can monitor open and click rates for our emails, targeting specific templates, countries, genders, ISPs, or outbound IP addresses.
  • This is one of the most interesting dashboards we have. It allows us to perform cohort analysis on our users, tracking a group that joined on a particular day and measuring their lifetime behavior.

SplunkLive! New York Dec 2012 - SNAP Interactive Presentation Transcript

  • High Velocity Intelligence Application Monitoring with Splunk • SNAP Interactive, Inc. • Presented by: Nicholas DiSanto, Architecture Team Lead
  • Company Overview • SNAP Interactive, Inc. • www.AreYouInterested.com • Believes it is one of the largest social discovery platforms on the web (based on monthly active users) • More than 5 million monthly active users • Over 1 billion total pieces of structured data from its users • Synced to millions of Facebook profiles • Receives over 1,000 real-time updates per minute on like actions from Facebook www.snap-interactive.com
  • About Nick • Developing on LAMP stack at tech startups for 10 years • Leading a team of core engineers • Passionate about experimentation & data-driven iteration • Striving to eliminate all technical blockers to speed and innovation • @NicholasDiSanto
  • Summary • We use Splunk for many, many things! • Today, I will share some of our more interesting applications • How we get data into Splunk • What we do with that data • Various types of monitoring • Extensive user behavior analysis
  • What We Give Splunk • Custom application logs • Structured, minified event data • De-normalized user demographics • Application profiling data • Error logs
  • Sending Splunk Data • Centralize logging functions and: • Format arbitrary structured data into Splunk-extractable field/value pairs: field="value" • Normalize and minify field names • Detect user_id and augment logs • Optionally log a percent of events • Target different log files (error, info, analytics)
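The steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SNAP's actual logging class: the alias table, the in-memory `USER_DEMOGRAPHICS` store, and the function names are all hypothetical, and routing to different log files is left out.

```python
import random
import re

# Hypothetical short names for verbose field names (the "minify" step).
FIELD_ALIASES = {"user_id": "uid", "event_name": "ev", "country": "cc"}

# Hypothetical stand-in for a fast user store, keyed by user id.
USER_DEMOGRAPHICS = {42: {"gender": "f", "country": "US"}}


def normalize_field(name):
    """Lowercase, replace non-alphanumerics with underscores, then minify."""
    clean = re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")
    return FIELD_ALIASES.get(clean, clean)


def format_event(data, sample_rate=1.0, rng=random.random):
    """Return a field="value" log line, or None if the event is sampled out."""
    if rng() >= sample_rate:
        return None
    fields = {normalize_field(k): v for k, v in data.items()}
    # Detect a user id and augment the event with that user's demographics.
    demo = USER_DEMOGRAPHICS.get(fields.get("uid"), {})
    for key, value in demo.items():
        fields.setdefault(normalize_field(key), value)
    parts = []
    for key, value in fields.items():
        # Escape backslashes and quotes so the value stays one quoted token.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return " ".join(parts)
```

Calling `format_event({"User ID": 42, "Event Name": "like"})` yields a single line of quoted key/value pairs with the user's demographics appended, which is the shape the slide describes as extractable by Splunk.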
  • User Demographics • Our analytics log contains application events, triggered by real users • We augment these event logs with useful demographic data to classify the events ✦ Gender ✦ Seeking gender ✦ Country ✦ Ethnicity ✦ Date of birth
  • Demographic Power • By augmenting event logs with user demographics we can perform powerful and detailed analysis of user behavior • Target analysis at countries, genders, or age ranges • Classify events by days since: registration, login, email open, etc. • ...and much more
  • Performance Metrics • We time key algorithms in our application, and log: • server name • query name • time spent working • This lets us graph the average, min and max times of these algorithms per server • We also dark launch features, benchmarking performance prior to official launch.
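Wrapping code in a timer and logging the three fields named above might look like the following sketch. The `timed` helper, the `sink` list standing in for a log file, and the field names `server`, `query`, and `ms` are assumptions made for illustration.

```python
import socket
import time
from contextlib import contextmanager


@contextmanager
def timed(query_name, sink, server=None, clock=time.perf_counter):
    """Time the wrapped block and append one field="value" line to sink."""
    server = server or socket.gethostname()
    start = clock()
    try:
        yield
    finally:
        # Log even if the wrapped block raises, so slow failures are visible.
        elapsed_ms = (clock() - start) * 1000.0
        sink.append(
            f'server="{server}" query="{query_name}" ms="{elapsed_ms:.1f}"'
        )
```

With lines in this shape, a Splunk search along the lines of `... | stats avg(ms) min(ms) max(ms) by server` could produce the per-server average, min, and max described on the slide (the field name `ms` being this sketch's choice).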
  • Performance • Average query time for key algorithms by server
  • What Splunk Gives Us • Monitoring - to measure application health • Analysis - to drive future product decisions • AB test evaluation - to validate hypotheses • Detection - to find patterns & classify users
  • Monitoring • With continuous deployment, detailed monitoring is absolutely essential. • Each deploy we watch changes in: • Realtime classified error graphs • Core event stat graphs • We also monitor email deliverability, revenue, and performance (although not every deploy)
  • Error & Event Monitoring • Alert us immediately after deploy if something has gone wrong • Use realtime background searches • Single dashboard with multiple graphs and tables • We are exploring realtime SMS alerts to the ‘developer on call’ • Use historical data to identify min/max expected thresholds (weighted averages: same time of day, same day of week) • Detect consistent deviations and alert
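One way to read the threshold bullet is sketched below: derive an expected range from past samples taken at the same hour on the same weekday, weighting recent weeks more heavily. The exponential `decay` weighting, the `tolerance` band, and the function names are assumptions; the slide does not say how the weights or limits were actually chosen.

```python
from datetime import datetime


def expected_range(history, now, decay=0.5, tolerance=0.5):
    """Return (low, high) bounds for the count expected at `now`, or None.

    history: list of (datetime, count) samples.
    Only samples from earlier weeks with the same weekday and hour as `now`
    are used; a sample n weeks old gets weight decay**n.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for ts, count in history:
        weeks_old = (now - ts).days // 7
        if weeks_old < 1:
            continue  # skip the current week and anything in the future
        if ts.weekday() != now.weekday() or ts.hour != now.hour:
            continue
        weight = decay ** weeks_old
        weighted_sum += weight * count
        total_weight += weight
    if total_weight == 0:
        return None  # no comparable history yet
    mean = weighted_sum / total_weight
    return mean * (1 - tolerance), mean * (1 + tolerance)


def is_anomalous(count, bounds):
    """True when count falls outside the (low, high) bounds."""
    low, high = bounds
    return count < low or count > high
```

To alert only on consistent deviations, as the slide puts it, a caller could require `is_anomalous` to hold for several consecutive intervals before notifying anyone.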
  • Error Monitoring • Count of all errors: past 30 seconds
  • Error Monitoring • All errors: past 5 minutes w/deploys
  • Error Monitoring • Rolled up errors: past 5 minutes
  • Error Monitoring • Rolled up filtered errors: past 5 minutes
  • Error Monitoring • Rolled up JS errors: past 3 hours
  • Event Monitoring • We monitor ~20 event stats, in realtime, each deploy
  • Event Monitoring • Overview and detail views, powered by a single realtime background search
  • Monitor Email & Performance • Email • Deliverability is essential to business • Need to maximize engagement • Performance • What async jobs may be contributing to high DB load? • What performance are end users experiencing? • Are particular servers overloaded?
  • Email Deliverability • Overview of key metrics
  • Email Monitoring • Inserts into email scheduled send queue
  • Performance • Asynchronous process timers • Can correlate spikes with site issues
  • Analysis • We heavily leverage summary indexing for performance gains • Daily rollups are grouped judiciously, giving us fast, flexible analysis over long periods • We summarize: revenue, email deliverability, core KPI, and general stats data • Custom dashboards facilitate easy searching • Lots of ad hoc searching by product team
  • Email Analysis • Sends, opens, clicks, bounces & FBL rates by email type
  • Email Analysis • Monitor changes in open & click rates by email, ISP, country, etc.
  • Email Analysis • Analysis dashboard *Sample data
  • Core KPI Dashboard • Powerful targeted cohort analysis *Sample data
  • AB Test Results • We are constantly running a variety of AB experiments on our live users • We divide our user population into nine 10% segments and ten 1% segments • Each segment can be targeted with an experiment • All event logs are annotated with the appropriate AB experiment name • This allows us to measure behavior changes between experiment and control groups
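The segment split described here (nine 10% segments plus ten 1% segments, covering 100% of users) could be implemented roughly as follows. Hashing the user id into 100 buckets, the segment names, and treating untargeted segments as `ctrl` are assumptions for this sketch; the slides do not describe how users are actually assigned.

```python
import hashlib


def segment_for(uid):
    """Map a user id to one of nine 10% segments or ten 1% segments."""
    digest = hashlib.sha256(str(uid).encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    if bucket < 90:
        return f"seg10_{bucket // 10}"  # seg10_0 .. seg10_8, 10 buckets each
    return f"seg1_{bucket - 90}"  # seg1_0 .. seg1_9, 1 bucket each


def ab_label(uid, experiments):
    """Return the experiment name targeting the user's segment, else 'ctrl'."""
    return experiments.get(segment_for(uid), "ctrl")
```

The value returned by `ab_label` is what a logging layer would write into each event as the AB annotation, so that experiment and control groups can be compared later.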
  • Easy AB Analysis • All event logs contain an AB field • This identifies the experiment group of the user at that point in time • Fully integrated into core analysis dashboards • Ad hoc analysis becomes simple `my_search` (AB=my_test OR AB=ctrl) | `my_reporting_command` by AB
  • AB Dashboards • Key metrics: experiment vs control group
  • Latest Splunking - Detection • The way our users interact with one another is insightful • We can use this data to classify users: • Identifying “attractive” users • Identifying spammers & scammers • We test hypotheses with ad hoc searches • Find reliable patterns, then set up scheduled searches that interface with MySQL • This data then feeds into our application in various ways
  • Contact Us • SNAP Interactive, Inc. www.snap-interactive.com • Nicholas DiSanto Architecture Team Lead ndisanto@snap-interactive.com 301-BIG-TREE @NicholasDiSanto • Lindsay Bubbico www.snap-interactive.com