Big Data at the Speed of Business: Lessons Learned from Leading at the Edge


How do you make big data accessible, usable and valuable for everyone? How do you mine your data for intelligence in minutes and hours, not weeks and months? What about getting real-time insights from your data, even before you persist and replicate it? In this talk, we’ll examine compelling, real-world examples that offer a blueprint for integrating big data technologies (Splunk, Hadoop, RDBMS, Cassandra, HBase), delivering rapid visibility and insights to IT professionals, data analysts and business users, and accelerating the adoption of big data in the enterprise.

Published in: Technology
  • Splunk. $186 million. Turns machine data into valuable insights. Splunk now has more than 600 employees worldwide, with headquarters in San Francisco and 14 offices around the world. Since first shipping its software in 2006, Splunk has gained over 4,400 customers in 80+ countries. These organizations are using Splunk software to improve service levels, reduce operations costs, mitigate security risks, enable compliance, enhance DevOps collaboration and create new product and service offerings. Please always refer to latest company data found here:
  • Talk specifically about how Splunk supports: Volume (scalable real-time architecture); Velocity (horizontal scalability); Variety (universal forwarding and indexing for highly diverse data from thousands of heterogeneous sources); Variability (late-binding schema for maximum search-time analysis).
  • When we look more closely at the data we see that it contains critical information: customer ID, order ID, time waiting on hold, Twitter ID, what was tweeted. What’s important is first of all the ability to actually see across all these disparate data sources, and then to correlate related events across them. If you can correlate and visualize related events across these disparate sources, you can build a picture of activity, behavior and experience. And what if you can do all of this in real time? You can respond more quickly to events that matter. You can extrapolate this example to a wide range of use cases: security and fraud, transaction monitoring and analysis, web analytics, IT operations and so on.
  • Splunk turns raw machine data into new visibility, insights and analytics for IT and business professionals. Intelligence from operational data can help organizations meaningfully improve performance in a wide range of areas, e.g. meet service levels, reduce costs, mitigate security risks, maintain compliance and gain insights. It also provides analysis of the real-time activity and behavior of products, users, services and servers. Example users of Splunk today include: customer support, operations teams, sysadmins, app developers, security analysts, auditors, IT execs, web/biz analysts, LOB owners / execs.
  • API -> Notification Server -> either Apple or Google. At some later time, they respond with whether there were any real problems. With Splunk I can look at the individual pieces as a whole and see how the message traversed the system. Without Splunk, I would not know how to do it.
  • This is why we are announcing a new product from Splunk. It’s in beta, it’s called “Hunk” and it’s Splunk Analytics for Hadoop. This is a new product from Splunk that delivers interactive data exploration, analysis and visualizations for Hadoop.
  • Because it’s based on proven Splunk technology, deployed at thousands of organizations, we’ve naturally made it easy to deploy. Simply point it at your Hadoop cluster and start interacting with and analyzing data immediately.
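
The correlation idea in the notes above (finding related events across order processing, the care IVR and Twitter through shared IDs, with structure applied at search time) can be illustrated with a minimal Python sketch. This is not Splunk code or Splunk's search language. The event lines, field names such as customer_id and time_on_hold, and the helpers extract_fields and correlate are all hypothetical, invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical raw events as (source, raw line) pairs.
# All sources, field names and values are invented for illustration.
RAW_EVENTS = [
    ("order_processing", "2012-12-05 07:04:44 order_id=A100 customer_id=C42 status=error"),
    ("care_ivr", "2012-12-05 07:06:10 customer_id=C42 time_on_hold=540"),
    ("twitter", '2012-12-05 07:09:02 twitter_id=john_t_doe customer_id=C42 tweet="still on hold"'),
    ("care_ivr", "2012-12-05 07:10:00 customer_id=C77 time_on_hold=15"),
]

# Matches key=value pairs where the value is either quoted or a bare token.
FIELD_RE = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def extract_fields(raw):
    """Pull key=value fields out of a raw line at read time (no upfront schema)."""
    fields = {}
    for match in FIELD_RE.finditer(raw):
        quoted = match.group(3)
        fields[match.group(1)] = quoted if quoted is not None else match.group(2)
    return fields


def correlate(events, key):
    """Group events from every source by the value of a shared field."""
    grouped = defaultdict(list)
    for source, raw in events:
        fields = extract_fields(raw)
        if key in fields:
            grouped[fields[key]].append((source, fields))
    return dict(grouped)


by_customer = correlate(RAW_EVENTS, "customer_id")
for source, fields in by_customer["C42"]:
    print(source, fields)
```

Because fields are extracted only when the events are read, a new source with different fields can be added without changing a schema; the only requirement in this sketch is that it carries the shared key.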

    1. Big Data at the Speed of Business. Isaac Mosquera, Director of Mobile, ShareThis. Clint Sharp, Principal Big Data Product Manager, Splunk. Copyright © 2013 Splunk Inc.
    2. What We’ll Talk About • Our quest for visibility • Analyzing at scale • Splunk and Big Data • Where do you start? • Q&A
    3. About Splunk. Company (NASDAQ: SPLK): founded 2004, first software release in 2006; HQ: San Francisco. Business Model / Products: industry-leading machine data platform; on-premise, in the cloud and SaaS. 5,600+ customers; 63 of the Fortune 100. Largest license: 100 terabytes per day. #1 Big Data Innovator* (* Fast Company's Most Innovative Companies Issue, March 2013)
    4. About ShareThis and Socialize. ShareThis makes the world more connected, trusted and valuable through sharing. Powers the social web, touching the lives of 95 percent of U.S. Acquires Socialize, which makes mobile and social more engaging. Socialize integrated into thousands of iOS and Android apps. Installed on 80M+ devices.
    5. Evaluating 20 Billion Ad Impressions Monthly
    13. Final Architecture. Diagram labels: RDBMS (Generated Reports), S3 Snapshots, Search Head, Socialize Bidder, Splunk, Indexer (x3), Cache Cluster, Memcache (x3)
    14. So, What is Splunk?
    15. Expanding Universe of Data Sources. Machine-generated Data; Business Application Data; Human-generated Data. Highly Structured; Arbitrarily Structured. Sample: 2012-12-05 07:04:44 Id=00Q000000Rd910EAJ City=New York Country=US CreatedDate=“2012-12-05 07:06:44” Email_Opt_In_c Customer_Street_Address_c=“123 Main St.” purchased_product_id= product_i BD-01 twitter_username john_t_doe
    16. Industry Leading Platform for Machine Data. Any Machine Data; Operational Intelligence; HA Indexes and Storage; Commodity Servers; Developer Platform; Custom dashboards; Monitor and alert; Ad hoc search; Report and analyze
    17. Analyzing Heterogeneous Data. Universal Index; Schema-on-the-fly; Flexibility and Fast Time to Value • No data normalization • Automatically handles timestamps • Parsers not required • Index every term & pattern “blindly” • No attempt to “understand” up front • Structure applied at search-time • No brittle schema to work around • Automatically find transactions, patterns and trends • Normalization as it’s needed • Faster implementation • Easy search language • Multiple views into the same data
    18. Gain Critical Insights … in Real-time. Sources: Order Processing, Middleware Error, Care IVR, Twitter. Fields: Order ID, Customer ID, Product ID, Time Waiting On Hold, Twitter ID, Customer’s Tweet, Company’s Name
    19. Deep Visibility and Insight for IT and Business. IT Operations Management; Web Intelligence; Business Analytics; Application Management; Security and Compliance; Industrial Data / Internet of Things. Over 5,600 organizations using Splunk across IT and business users
    20. Driving Insights from Big Data
    21. The ShareThis Insights Platform. Hadoop. On Father’s Day: “Who were the most shared about topics?” “What type of beers do people drink?” Diagram labels: API, ETL, Pre-aggregation, Analytics, ?
    22. Finding the Optimal Approach. Hadoop and MapReduce are great for complex data science on data at rest; the previous architecture took 9 months with a team of engineers, data architects, etc. The Splunk platform delivers real-time, interactive analysis; we can build many of the same insights within 1 hour. What should be the core focus or competency of your team? Conclusion: find the optimal approach for the business
    23. What About Ad Hoc Analysis?
    24. PR Insights Example. What was the situation? (e.g. fast moving business, needed real-time insights) What was the PR team struggling with? Difficult to find useful data to build interesting use-cases. What did they want? They wanted a flexible real-time reporting environment to extract insights useful for the market. How did my team help? Delivered a single dashboard that contained real-time data on the sharing behaviors across our network
    25. PR Insights Dashboard
    26. Let’s not forget the low-hanging fruit
    27. Operational Analytics for an Online World. Notifications Systems Driving Superior Customer Experience. Diagram labels: website, API, Notification, Google (GCM), Apple (APNS), Feedback Processor, Online Device Notifications. How many 500 errors have I had over time? Look for anomalies and spikes! Zone in directly to the customer!!
    28. One More Thing …
    29. Announcing Hunk Beta: Splunk Analytics for Hadoop. New product from Splunk delivers interactive data exploration, analysis and visualizations for Hadoop
    30. Derive Actionable Insights from Raw Data. Hadoop Storage. 1: Point Splunk at Hadoop Cluster. 2: Immediately start exploring, analyzing and visualizing raw data in Hadoop. Explore; Analyze; Visualize; Dashboards; Share
    31. Learn More
    32. Questions?
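
Slide 27 asks how many 500 errors occurred over time and suggests looking for anomalies and spikes. A minimal Python sketch of that kind of check follows, under stated assumptions: the request records, the one-minute bucketing and the "more than three times the median" spike rule are invented for illustration and are not how Splunk or ShareThis implement it.

```python
from collections import Counter
from statistics import median

# Hypothetical request log as (timestamp, HTTP status) pairs; values are invented.
REQUESTS = [
    ("2013-06-16 10:00:05", 500),
    ("2013-06-16 10:00:40", 200),
    ("2013-06-16 10:01:10", 500),
    ("2013-06-16 10:03:20", 500),
    ("2013-06-16 10:03:50", 200),
] + [("2013-06-16 10:02:%02d" % second, 500) for second in range(0, 60, 10)]


def errors_per_minute(requests, status=500):
    """Count responses with the given status, bucketed by minute."""
    counts = Counter()
    for timestamp, code in requests:
        if code == status:
            counts[timestamp[:16]] += 1  # keep "YYYY-MM-DD HH:MM"
    return counts


def find_spikes(counts, factor=3.0):
    """Return minutes whose count exceeds factor times the median count.

    Assumption: the median over minutes that had at least one error is an
    acceptable baseline. Minutes with zero errors are not represented.
    """
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(minute for minute, n in counts.items() if n > factor * baseline)


counts = errors_per_minute(REQUESTS)
for minute in sorted(counts):
    print(minute, counts[minute])
print("spikes:", find_spikes(counts))
```

The median is used rather than the mean so that a single large burst does not raise the baseline it is compared against; other baselines (a rolling window, for instance) would be equally reasonable.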