
Machine Learning in Action



Machine Learning enables you to generate valuable insights from your machine data. In this talk, we focus in particular on various methods for detecting anomalies. The presented methods can be used for applications in security and IT operations as well as in IoT and business analytics. Because nothing is more unusual than the usual.

Published in: Technology


  1. © 2019 SPLUNK INC. Machine Learning in Action: How to derive meaningful and actionable business insights from your data. Philipp Drieger | Staff Machine Learning Architect; Tony Read | Staff Sales Engineer; Greg Ainslie-Malik | Senior Sales Engineer. London | June 13, 2019
  2. Forward-Looking Statements. During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release. Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2019 Splunk Inc. All rights reserved.
  7. Agenda
  8. Agenda
       1. Quick Intro to Machine Learning and a bit of theory about Anomaly Detection
       2. Anomaly Detection Use Case: How TalkTalk detects anomalies in broadband access
       3. Predictive Analytics Use Case: Predicting Student Outcomes
       4. Wrap Up, Q&A
  9. A Bit of Theory First
  10. Splunk Customers Want Answers from their Data
      Anomaly detection
        ► Deviation from past behavior
        ► Deviation from peers (aka Multivariate AD or Cohesive AD)
        ► Unusual change in features
        ► ITSI MAD Anomaly Detection
      Predictive Analytics
        ► Predict Service Health Score
        ► Predicting Churn
        ► Predicting Events
        ► Trend Forecasting
        ► Detecting influencing entities
        ► Early warning of failure – predictive maintenance
      Clustering
        ► Identify peer groups
        ► Event Correlation
        ► Reduce alert noise
        ► Behavioral Analytics
        ► ITSI Event Analytics
  11. Perspectives on Anomalies
      ▶ From Latin anomalia, from Ancient Greek ἀνωμαλία (anōmalía, “irregularity, anomaly”), from ἀνώμαλος (anṓmalos, “irregular, uneven”), negating the meaning of ὁμαλός (homalós, “even”), from ὁμός (homós, “same”).
      ▶ A deviation from a rule or from what is regarded as normal; an outlier. Synonyms: abnormality, deviance, deviation, exception, inconsistency, irregularity, phenomenon
      ▶ In the natural sciences, especially in atmospheric and Earth sciences involving applied statistics, an anomaly is the deviation in a quantity from its expected value, e.g., the difference between a measurement and a mean or a model prediction. […]
  12. ▶ Only 72 pages
      ▶ A comprehensive report of the most common classic methodologies and algorithmic approaches
  13. Why Anomalies Matter
  14. Interesting Anomalies Across Your Business
      Security
        • Network traffic
        • Access pattern
        • …
      IT Operations
        • Service outages
        • Infrastructure problems
        • …
      IoT/OT
        • Equipment degradation
        • Preventative Maintenance
        • …
      Business Analytics
        • Fraud Detection
        • Insider Threats
        • …
  15. How to Spot Anomalies
  16. Questions… there are so many questions…
      ▶ “Can Splunk detect anomalies in my data?”
      ▶ “Can Splunk help me identify unknown things?”
      ▶ “Can Splunk find answers for questions that I don’t know?”
      ▶ Ask yourself what questions you are asking!
  17. Cheat Sheet for Anomaly Detection in Splunk
      Search Processing Language (SPL)
        Command | Description
        analyzefields, af | Analyze numerical fields for their ability to predict another discrete field.
        anomalies | Computes an "unexpectedness" score for an event.
        anomalousvalue | Finds and summarizes irregular, or uncommon, search results.
        anomalydetection | Identifies anomalous events by computing a probability for each event and then detecting unusually small probabilities.
        cluster | Clusters similar events together.
        kmeans | Performs k-means clustering on selected fields.
        outlier | Removes outlying numerical values.
        rare | Displays the least common values of a field.
      Machine Learning Toolkit (MLTK)
        Method / Algorithm | Description
        DensityFunction | The DensityFunction algorithm provides a consistent and streamlined workflow to create and store density functions and utilize them for anomaly detection…
        LocalOutlierFactor | The LocalOutlierFactor algorithm measures the local deviation of density of a given sample with respect to its neighbors…
        OneClassSVM | The OneClassSVM algorithm fits a model from a set of features or fields for detecting anomalies and outliers…
        Clustering Algorithms | Spot point anomalies or anomalous clusters. Inspect e.g. cluster_distance with KMeans, cluster=-1 with DBSCAN…
        Classifiers and Regressors | Inspect strong residuals when applying your well-fitted model to new incoming data points.
        ML SPL API | Wrap your own algorithms of choice
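As a rough illustration of the density-based idea in the cheat sheet (compute a probability for each value and flag the unusually improbable ones), here is a minimal Python sketch. It is not the MLTK DensityFunction implementation: the normal-distribution assumption, the function names `fit_density` and `is_outlier`, the `tail` parameter, and the sample data are all assumptions made for illustration.

```python
from statistics import NormalDist


def fit_density(values):
    """Fit a normal distribution to historical values (assumed model)."""
    return NormalDist.from_samples(values)


def is_outlier(dist, value, tail=0.01):
    """Flag a value lying in the outer `tail` share of the fitted density.

    `tail` is a hypothetical threshold: half of it is spent on each side.
    """
    p = dist.cdf(value)
    return p < tail / 2 or p > 1 - tail / 2


# Hypothetical historical measurements (e.g. events per minute).
history = [102, 98, 101, 97, 100, 103, 99, 100, 101, 99]
dist = fit_density(history)

# Check a typical value, a slightly high value, and an extreme value.
flags = {v: is_outlier(dist, v) for v in (100, 104, 130)}
print(flags)
```

A fixed parametric density is the simplest choice; real data is often skewed or multimodal, in which case a different density estimate would be a better fit.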
  18. Customer Use Case: TalkTalk
  19. TalkTalk: Circa 100,000 Access Nodes connect millions of broadband customers to the internet. Monitoring is extensive, but customers still experience broadband issues. The call centre experience often culminates in dispatching a new router or an engineer. That is expensive, both financially and in NPS, and offers no chance of fixing the issue. The Access Nodes continuously emit START, STOP, and INTERIM_UPDATE events (RADIUS data). Hypothesis: “Each of those Access Nodes should emit a similar number of each event at any time of day.” We want to know which ones are behaving uncharacteristically.
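The hypothesis on this slide (every Access Node should emit a similar number of each event) amounts to a deviation-from-peers check. The sketch below is one possible way to express it in Python, not TalkTalk's actual implementation: it scores each node's count against the peer median using the modified z-score (median absolute deviation). The node names, the counts, and the function name `peer_outliers` are hypothetical.

```python
from statistics import median


def peer_outliers(counts, threshold=3.5):
    """Return the nodes whose count deviates strongly from the peer median.

    counts: mapping of node name -> event count for one interval.
    Uses the modified z-score 0.6745 * |x - median| / MAD.
    """
    values = list(counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All peers agree exactly; any differing node stands out.
        return [node for node, v in counts.items() if v != med]
    return [node for node, v in counts.items()
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical START event counts per Access Node for one interval.
start_counts = {"node-a": 120, "node-b": 118, "node-c": 125,
                "node-d": 122, "node-e": 12}
print(peer_outliers(start_counts))
```

The median and MAD are used instead of the mean and standard deviation so that a few misbehaving nodes do not distort the reference they are compared against.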
  26. TalkTalk: two-phase approach
       1. Use historic data to establish a baseline for the upcoming week.
       2. As the upcoming week progresses, compare each interval with the baseline.
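The two phases above can be sketched in Python as follows. This is an illustrative assumption about how such a baseline might be built, not the approach shown in the talk: the hour-of-week slots, the mean and standard deviation baseline, the factor `k`, and the names `build_baseline` and `check_interval` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history):
    """Phase 1: derive a per-slot baseline from historic data.

    history: iterable of (weekday, hour, count) tuples from past weeks.
    Returns {(weekday, hour): (mean, stdev)} for slots with >= 2 samples.
    """
    slots = defaultdict(list)
    for weekday, hour, count in history:
        slots[(weekday, hour)].append(count)
    return {slot: (mean(c), stdev(c))
            for slot, c in slots.items() if len(c) >= 2}


def check_interval(baseline, weekday, hour, count, k=3.0):
    """Phase 2: compare one new interval with the baseline.

    Returns True when the count lies more than k standard deviations
    from the baseline mean; slots without a baseline are never flagged.
    """
    if (weekday, hour) not in baseline:
        return False
    mu, sigma = baseline[(weekday, hour)]
    return abs(count - mu) > k * max(sigma, 1e-9)


# Hypothetical counts for Mondays (weekday 0) at 09:00 over four weeks.
history = [(0, 9, 100), (0, 9, 104), (0, 9, 96), (0, 9, 100)]
baseline = build_baseline(history)
print(check_interval(baseline, 0, 9, 103))  # close to the baseline
print(check_interval(baseline, 0, 9, 40))   # far below the baseline
```

Keying the baseline by weekday and hour lets regular daily and weekly patterns count as normal, so only departures from the usual level for that slot are flagged.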
  35. Predictive Analytics for Student Success
  36. Classification: So what is it anyway? Duck? …or Rabbit?
  37. Prediction in Action
      Security
        • Predicting the presence of a botnet
        • Identifying potential DGAs/malware
        • …
      IT Operations
        • Predicting outage conditions
        • Predicting root cause of IT incidents
        • …
      IoT/OT
        • Identifying potential part failures
        • Assuring quality in manufacturing
        • …
      Business Analytics
        • Predicting customer churn
        • Grouping customers by attribute and activity
        • …
  38. Student Success: Predicting Student Outcomes
  39. Student Success: Analysing and Predicting Dropouts
  40. Student Success: Tracking Progress
  41. Wrap up
  42. Recommendation Matrix: consider your ML dataset’s dimensional and computational complexity.
      [Figure: matrix with axes “computational complexity” and “dimensional complexity”]
      In general, for most common ML tasks: use Machine Learning Toolkit (MLTK) + MLSPL API extensibility
      Extensibility, Case #1: need for specific algo / framework
      Extensibility, Case #2: need for distributed / GPU compute
  43. I want to learn more!
  44. Where Can I Learn More About Anomaly Detection? 4 must-read blog posts – don’t miss them!
  45. Where to Find Ready Made Apps… for my business area of interest?
      • DGA App for Splunk
      • Sec. Essentials
      • UBA
      • MLTK
      • ITSI
      • Splunk Essentials for Predictive Maintenance
      • Splunk Security Essentials for Fraud Detection
  46. 4 Days of Innovation | 350 Education Sessions | 20 Hours of Networking. “Hands down the most beneficial and attendee focused conference I have attended!” – Michael Mills, Senior Consultant, Booz Allen Hamilton. sign up for notifications @ .conf19 October 21-24, 2019 | Splunk University October 19-21, 2019 | Las Vegas, NV | The Venetian | Sands Expo
  47. Splunk Machine Learning Advisory Program
  48. Consider the ML Advisory Program: Get started on your specific use case with the guidance of Splunk Data Scientists. Your Logo Here?
  49. What is the ML Advisory Program? Complimentary support of Splunk data science resources to help build an ML use case resulting in a public reference
      ▶ Early access to new and enhanced Machine Learning features
      ▶ Opportunity to shape the development of the product
      ▶ Complimentary assistance in operationalizing a production quality ML model
  50. Thank You.