SplunkLive! Salt Lake City June 2013 - Ancestry.com


Published in: Technology, Business
  • Ancestry.com Inc. is the world's largest online family history resource, with more than 2 million paying subscribers. More than 11 billion records have been added to the site in the past 15 years. Ancestry users have created more than 40 million family trees containing approximately 4 billion profiles. In addition to its flagship site, Ancestry.com offers several localized Web sites designed to empower people to discover, preserve and share their family history.
  • Exception logs; primarily IIS logs; custom application logs; Windows Event Logs

    1. Copyright © 2013 Splunk Inc. Ancestry.com. Pragnya Dash, Development Engineer, Webops; Ross Asay, Principal Engineer, Software Architect, Commerce & Data Division
    2. Pragnya Dash
        • Development Engineer, Web Operations
        • Splunk administrator at Ancestry.com
        • Set up, maintain and support Splunk environments
        “Our developers are very excited about using Splunk. The product is very easy to use.”
    3. Ross Asay
        • Principal Engineer
        • Software Architect of Commerce & Data Division
        • Development lead over Splunk proof of concept project
    4. Ancestry.com
        • Founded in 1983
        • World’s largest online family history resource
        • >10B records
        • >38M family trees
        • >2M paying subscribers
    5. Before Splunk: Limited Operational Visibility
        • No central area to gather and analyze the logs
        • Difficult to collect data
        • Cumbersome root cause analysis
        • Long time to resolve issues
        • No single tool to identify functional and performance issues
        • Custom, siloed tools have high support costs
        Disparate Systems; Problems Troubleshooting; No Enterprise-wide Solution
    6. Why Splunk?
        • Business risk of system downtime and slow issue identification and resolution
        • Enterprise-grade – standard platform across all teams and implementations
        • Troubleshoot both functionality and performance
        • Proactive performance monitoring
        Need for Operational Excellence
    7. Where We Are Today
        • 20 stacks
        • 200 hosts
        • 750+ sourcetypes
        Started with Splunk in late 2012
    8. How Do We Use Splunk?
        • Network Operations
        • Application troubleshooting
        • Development
        • DevOps
    9. Finding Value During the POC
    10. Adding User Accounts During Thundering Herd
    11. What’s All This Traffic to our CDN?
    12. Is an Edge Server Misbehaving?
    13. Is One URI Requested More Than it Should be?
    14. 60% Decrease in Traffic and Cost
    15. User Session Limit Problem
    16. Looking for User Session Limit Problems
    17. Exceeding User Session Limit
    18. Auto-complete
    19. Standardizing App Logs - Semantic Logging
        • Human-readable events
        • Properly formatted timestamps
        • Key-Value pairs (JSON Logging)
        • Separate out multi-value events
        • Unique transaction identifiers
        • Custom appender for logging – compatible with Java and .NET apps
        Example Pseudo-Code:
            void submitPurchase(purchaseId) {
                log.info("action=submitPurchaseStart, purchaseId=%d", purchaseId)
                // these calls throw an exception on error
                submitToCreditCard(...)
                generateInvoice(...)
                generateFulfillmentOrder(...)
                log.info("action=submitPurchaseCompleted, purchaseId=%d", purchaseId)
            }
    20. Splunk for Application Development
        • Identifying Defects
            – Find and fix quickly to reduce time-to-market
        • Complying with SLAs
            – Benchmarking application performance
            – Monitoring API endpoints
        • Extracting data via the Splunk REST API
            – In JSON format to put into HDFS for long-term storage and machine learning
    21. Reaching Operational Excellence
        • Transaction visibility: Track transactions across multiple components using common IDs
            – Stitching transactions together across various machines by tracing common user and session IDs in the logs
        • Network monitoring: Real-time and historical view into network activity
            – When network switches reboot, they don't retain the logs locally, but Splunk captures all the data for fast issue identification and resolution
    22. Future Goals
        • Security use cases (just beginning work)
        • Modifying Development process to maximize value of Splunk
        • Integration with Hadoop
            – Hadoop team is currently leveraging the Splunk API
        • Dashboards for product management and marketing for insights into customer behavior and feature usage
    23. Best Practice Recommendations
        • Communicate the value of Splunk to the executive team
            – Enterprise-wide solution
            – Opportunity cost of lost revenue brought on by downtime
            – Reducing operational support resources
    24. Thank You
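
The semantic-logging slide (19) and the transaction-visibility slide (21) describe a pattern that can be illustrated with a small sketch: emit events as key=value pairs carrying a unique transaction identifier, then group events by that identifier to find transactions that started but never completed. The Python below is only an illustration under stated assumptions. The helper names `format_event`, `parse_event` and `incomplete_purchases`, the `host` field, and the sample values are hypothetical; this is not Ancestry.com's custom appender, and it does not use any Splunk API.

```python
import re
from collections import defaultdict

# Matches key=value pairs such as "action=submitPurchaseStart".
# Assumption: values contain no commas or whitespace.
PAIR = re.compile(r"(\w+)=([^,\s]+)")


def format_event(action, **fields):
    """Render one event as comma-separated key=value pairs, action first."""
    pairs = [("action", action)] + sorted(fields.items())
    return ", ".join(f"{key}={value}" for key, value in pairs)


def parse_event(line):
    """Turn a key=value log line back into a dict of strings."""
    return dict(PAIR.findall(line))


def incomplete_purchases(lines):
    """Return purchase IDs that logged a Start event but no Completed event."""
    actions_by_id = defaultdict(set)
    for line in lines:
        event = parse_event(line)
        if "purchaseId" in event:
            actions_by_id[event["purchaseId"]].add(event.get("action"))
    return sorted(
        purchase_id
        for purchase_id, seen in actions_by_id.items()
        if "submitPurchaseStart" in seen
        and "submitPurchaseCompleted" not in seen
    )


if __name__ == "__main__":
    # Hypothetical events from two machines; purchase 102 never completes.
    logs = [
        format_event("submitPurchaseStart", purchaseId=101, host="web01"),
        format_event("submitPurchaseStart", purchaseId=102, host="web02"),
        format_event("submitPurchaseCompleted", purchaseId=101, host="web01"),
    ]
    print(incomplete_purchases(logs))  # prints ['102']
```

Because every event carries the same `purchaseId` key, the grouping step does not need to know which machine or component wrote the line, which is the point of the common-ID approach described on slide 21.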