SplunkLive! Salt Lake City June 2013 - Ancestry.com
 

  • Ancestry.com Inc. is the world's largest online family history resource, with more than 2 million paying subscribers. More than 11 billion records have been added to the site in the past 15 years. Ancestry users have created more than 40 million family trees containing approximately 4 billion profiles. In addition to its flagship site, Ancestry.com offers several localized Web sites designed to empower people to discover, preserve and share their family history.
  • Exception logs, primarily IIS logs, custom application logs, Windows Event Logs

Presentation Transcript

  • Copyright © 2013 Splunk Inc.
    Ancestry.com
    Pragnya Dash, Development Engineer, Webops
    Ross Asay, Principal Engineer, Software Architect, Commerce & Data Division
  • Pragnya Dash
    • Development Engineer, Web Operations
    • Splunk administrator at Ancestry.com
    • Set up, maintain and support Splunk environments
    “Our developers are very excited about using Splunk. The product is very easy to use.”
  • Ross Asay
    • Principal Engineer
    • Software Architect of Commerce & Data Division
    • Development lead over Splunk proof of concept project
  • Ancestry.com
    • Founded in 1983
    • World’s largest online family history resource
    • >10B records
    • >38M family trees
    • >2M paying subscribers
  • Before Splunk: Limited Operational Visibility
    • Disparate Systems
      - No central area to gather and analyze the logs
      - Difficult to collect data
    • Problems Troubleshooting
      - Cumbersome root cause analysis
      - Long time to resolve issues
    • No Enterprise-wide Solution
      - No single tool to identify functional and performance issues
      - Custom, siloed tools have high support costs
  • Why Splunk?
    • Need for Operational Excellence
    • Business risk of system downtime and slow issue identification and resolution
    • Enterprise-grade: standard platform across all teams and implementations
    • Troubleshoot both functionality and performance
    • Proactive performance monitoring
  • Where We Are Today
    • 20 stacks
    • 200 hosts
    • 750+ sourcetypes
    • Started with Splunk in late 2012
  • How Do We Use Splunk?
    • Network Operations
    • Application troubleshooting
    • Development
    • DevOps
  • Finding Value During the POC
  • Adding User Accounts During Thundering Herd
  • What’s All This Traffic to our CDN?
  • Is an Edge Server Misbehaving?
  • Is One URI Requested More Than it Should be?
  • 60% Decrease in Traffic and Cost
  • User Session Limit Problem
  • Looking for User Session Limit Problems
  • Exceeding User Session Limit
  • Auto-complete
  • Standardizing App Logs - Semantic Logging
    • Custom appender for logging, compatible with Java and .NET apps
    Example Pseudo-Code:
        void submitPurchase(purchaseId) {
            log.info("action=submitPurchaseStart, purchaseId=%d", purchaseId)
            // these calls throw an exception on error
            submitToCreditCard(...)
            generateInvoice(...)
            generateFullfillmentOrder(...)
            log.info("action=submitPurchaseCompleted, purchaseId=%d", purchaseId)
        }
    • Human-readable events
    • Properly formatted timestamps
    • Key-Value pairs (JSON Logging)
    • Separate out multi-value events
    • Unique transaction identifiers
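The semantic-logging slide can be made concrete with a small runnable sketch. This is not Ancestry.com's custom appender (the deck does not show it); it is a minimal Python illustration of the same idea, using only the standard `logging` module. The helper names `kv`, `make_logger`, and the three stand-in purchase steps are hypothetical.

```python
import logging


def kv(**fields):
    """Render fields as 'key=value' pairs, quoting values that contain spaces or commas."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        if " " in text or "," in text:
            text = '"' + text + '"'
        parts.append(f"{key}={text}")
    return ", ".join(parts)


def make_logger(stream):
    """Return a logger that writes timestamped key=value events to `stream`."""
    logger = logging.getLogger("purchase")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()          # avoid duplicate handlers on repeated calls
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s level=%(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Hypothetical stand-ins for the real steps; each would raise on error.
def submit_to_credit_card(purchase_id):
    pass


def generate_invoice(purchase_id):
    pass


def generate_fulfillment_order(purchase_id):
    pass


def submit_purchase(purchase_id, log):
    """Log a start and a completion event that share the same purchaseId."""
    log.info(kv(action="submitPurchaseStart", purchaseId=purchase_id))
    submit_to_credit_card(purchase_id)
    generate_invoice(purchase_id)
    generate_fulfillment_order(purchase_id)
    log.info(kv(action="submitPurchaseCompleted", purchaseId=purchase_id))
```

Because both events carry the same `purchaseId`, a start event with no matching completion event points to a purchase that failed partway through.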
  • Splunk for Application Development
    • Identifying Defects
      - Find and fix quickly to reduce time-to-market
    • Complying with SLAs
      - Benchmarking application performance
      - Monitoring API endpoints
    • Extracting data via the Splunk REST API
      - In JSON format to put into HDFS for long-term storage and machine learning
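The REST extraction mentioned above might look roughly like the following sketch. It assumes the `services/search/jobs/export` endpoint with `output_mode=json`, and it assumes the response is a stream of one JSON object per line with the event under a `result` key. The host name, the token, and the `Authorization` scheme are placeholders that depend on the Splunk version and setup; the deck does not show the team's actual client. The request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_export_request(base_url, search, token):
    """Build (but do not send) a POST request to the search export endpoint."""
    body = urllib.parse.urlencode(
        {"search": search, "output_mode": "json"}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/services/search/jobs/export",
        data=body,
        # Assumption: token-based auth; older setups use a session key instead.
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def parse_export_lines(lines):
    """Yield the 'result' dict from each JSON line, skipping blanks and status rows."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "result" in record:
            yield record["result"]
```

Each yielded dict could then be written out as newline-delimited JSON, a format that batch jobs reading from HDFS can split line by line.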
  • Reaching Operational Excellence
    • Transaction visibility: Track transactions across multiple components using common IDs
      - Stitching transactions together across various machines by tracing common user and session IDs in the logs
    • Network monitoring: Real-time and historical view into network activity
      - When network switches reboot, they do not retain their logs locally, but Splunk captures all the data for fast issue identification and resolution
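The stitching idea above can be illustrated outside Splunk with a short sketch: group key=value events from several hosts by a shared session ID, then order each group by timestamp. The field names `ts`, `sessionId`, and `action`, and the host names in the usage note, are assumptions made for the example, not the fields Ancestry.com actually logs.

```python
import re
from collections import defaultdict

# Matches key=value pairs; values may be double-quoted or bare.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def parse_event(line):
    """Parse 'key=value' pairs from one log line into a dict."""
    return {key: value.strip('"') for key, value in KV_PATTERN.findall(line)}


def stitch(events, id_field="sessionId"):
    """Group (host, line) events by id_field and sort each group by timestamp.

    Assumes ISO 8601 timestamps in a 'ts' field, which sort correctly as strings.
    """
    grouped = defaultdict(list)
    for host, line in events:
        fields = parse_event(line)
        if id_field in fields:
            fields["host"] = host
            grouped[fields[id_field]].append(fields)
    for items in grouped.values():
        items.sort(key=lambda f: f.get("ts", ""))
    return dict(grouped)
```

For instance, given one event from a hypothetical `auth01` host and a later one from `web01` that share `sessionId=s1`, `stitch` returns both under the key `s1` in time order, regardless of which machine logged them.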
  • Future Goals
    • Security use cases (just beginning work)
    • Modifying development process to maximize value of Splunk
    • Integration with Hadoop
      - Hadoop team is currently leveraging the Splunk API
    • Dashboards for product management and marketing for insights into customer behavior and feature usage
  • Best Practice Recommendations
    • Communicate the value of Splunk to the executive team
      - Enterprise-wide solution
      - Opportunity cost of lost revenue brought on by downtime
      - Reducing operational support resources
  • Thank You