Splunk AI & Machine Learning Roundtable 2019 - Zurich

Splunk Artificial Intelligence and Machine Learning Roundtable held in Zurich on November 6th 2019. Presented by Philipp Drieger, Staff Machine Learning Architect.

Published in: Technology

  1. © 2019 SPLUNK INC. Splunk Artificial Intelligence & Machine Learning Roundtable. Zurich, November 6, 2019. Philipp Drieger | Staff Machine Learning Architect
  2. Forward-Looking Statements: During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release. Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2019 Splunk Inc. All rights reserved.
  3. Agenda
      1) Roundtable quick intros
      2) Introduction to AI and ML features in Splunk
      3) Customer use cases
      4) Live demo of Machine Learning Toolkit, with examples: methods for anomaly detection; predictive analytics and forecasting; clustering
      5) Custom machine learning, including: expansion with ML-SPL API; advanced containerization
      6) Panel and Q&A
      7) Networking lunch
  4. Philipp Drieger
      • | where _time @ Splunk > 4.5y
      • Previous: +15y in research, software development, visual arts; +3y SE across portfolio & domains in CEMEA & EE
      • Specializations: Anomaly Detection, Data Mining, NLP, Advanced Analytics and Visualizations; Applied Data Science, Machine Learning, Graph Theory and Network Science; GPU Computing, Deep Learning
      • Role @ Splunk: Staff Machine Learning Architect (Central EMEA); Author of DGA App for Splunk; Author of MLTK Container for Splunk; Author of Deep Learning Toolkit for Splunk; blog posts, conf talks, hackathons etc.; ensure customer and partner success with ML
  5. Intro
  6. Our World Never Stops Evolving. New Ideas. New Devices. New Processes.
  7. Every Company Has a Universe of Real-time Data, Creating More Opportunities and Threats than Ever Before. Diagram labels: New Data Streams & Devices, New Apps & App Logs, Financial Account & Operating Systems, Database Logs, Network Logs, New Technology, ATM Sensor Data, Transaction Data, Proxy Data, Firewall Logs. (* IDC, Data Age 2025: The Digitization of the World, November 2018)
  8. Turning Real-time Data Into Action is Hard: Data Lakes, Master Data Management, ETL, Point Data Management Solutions, Data Silos
  9. The Data-to-Everything Platform: IT, Security, IoT, Biz Analytics
  10. Any Structure, Any Source, Any Time Scale. ACT, INVESTIGATE, ANALYZE, MONITOR. IT, Security, IoT, Biz Analytics
  11. Splunk: The Data-to-Everything Platform. Bring data to every question, decision and action. Use cases across IT, Security, IoT and Biz Analytics: Cloud Monitoring, Application Lifecycle Analytics, Application Release Analytics, Container Monitoring, Infrastructure Monitoring, Advanced Threat Detection, Insider Threats, Incident Investigation and Forensics, SOC Automation, Compliance, Real-Time Monitoring and Diagnostics, ICS Security, Predictive Analytics, Facilities Management, Business Process Mining, Customer Experience Optimization, Incident Management, Digital Marketing Optimization
  12. Intro: AI | ML | DL
  13. “Humans are good at Learning… but we get lost in volume and detail.”
  14. AI, ML, DL. “A Function that maps features to an output” = AI. “A Function that learns patterns in your data without being explicitly programmed” = ML. Types of ML: Supervised, Unsupervised, Reinforcement. Lots of opinions exist. Myths as well…
  15. What ML & AI are not: Machine Learning is not Magic. AI Buzz. Garbage Data = Useless Predictions.
      • Data Scientists spend 80% of their time cleaning, munging and collecting data
      • Throwing more data at an algorithm will not solve all of your SOC issues
      • Machine Learning requires a solid understanding of statistics and the scientific method
      ML & AI require you to understand the fundamental business problem you want to solve.
  16. What ML & AI are not: Machine Learning is not Magic. AI Buzz. ML is not a replacement for expert analysts or engineers. ML requires Subject Matter Experts to enhance security & IT operations. Analysts are required to provide feedback to the models to adjust thresholding rules and reduce false positives.
  17. Machine Learning & AI: What does the scientific method look like in the IT & Security Space?
      Problem: DGA domains are computer-generated pseudo-random character strings used by attackers. Blacklisting an infinite number of domains is not feasible.
      Hypothesis: “Are there patterns in domain generation algorithms that can be exploited to identify newly generated domains as threats in real time?”
      Example Domains:
      http://87hfdredwertyfdvvlkgdrsadm.net/af/GHFbfsalku65
      http://87hfdredwertyfdvvlkgdrsadm.net/af/sdgLKJvgh
      http://wszystkodokuchni.pl/34f43
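One way to start testing the hypothesis on slide 17 is to compute simple lexical features of each domain name. The Python sketch below measures the length and Shannon character entropy of the leftmost host label for two of the slide's example URLs. The helper names `char_entropy` and `domain_label` are illustrative assumptions, and the talk does not say which features the DGA App for Splunk actually uses. Entropy alone is a weak signal and would normally be only one of several inputs to a classifier.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def char_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of a string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_label(url: str) -> str:
    """Return the leftmost label of the host name, e.g. 'example' for example.com."""
    host = urlparse(url).hostname or ""
    return host.split(".")[0]


urls = [
    "http://87hfdredwertyfdvvlkgdrsadm.net/af/GHFbfsalku65",
    "http://wszystkodokuchni.pl/34f43",
]
for url in urls:
    label = domain_label(url)
    # Print the label, its length, and its entropy as candidate features.
    print(label, len(label), round(char_entropy(label), 2))
```

Features like these could then be attached to labeled domains and used to train a supervised model, which is the kind of experiment the scientific-method framing on the slide suggests.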
  18. Why Use Machine Learning? MTTR. [Chart: cost of impact ($) over time until return to business. Labels: Existing Events, Reactively Alerted, MTTR; Proactive (add logs and metrics), Splunk ML Alert; Predictive, Predict 30 Minutes in Advance, Automated Resolution, NEGATIVE MTTR!!; Effective $ Impact. Layers: Basic Value prop of Splunk; One layer of ML, finding anomalies in real time; A 2nd Layer of ML on anomalies.]
  19. Machine Learning Tour
  20. What Data Scientists Really Do: Data preparation accounts for about 80% of the work of data scientists. Source: “Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says”, Forbes, Mar 23, 2016
  21. Splunk Customers Want Answers from their Data
      Anomaly detection: ► Deviation from past behavior ► Deviation from peers (aka Multivariate AD or Cohesive AD) ► Unusual change in features
      Predictive Analytics: ► Predict Service Health Score/Churn ► Predicting Events ► Trend Forecasting ► Detecting influencing entities ► Early warning of failure
      Clustering: ► Identify peer groups ► Event Correlation ► Reduce alert noise ► Behavioral Analytics
  22. Skill Areas for Machine Learning @ Splunk
      Domain Expertise (IT, Security…): • Identify use cases • Drive decisions • Understanding of business impact
      Data Science Expertise: • Statistics/math background • Algorithm selection • Model building
      Splunk Expertise: • Searching • Reporting • Alerting • Workflow
      MLTK: Splunk ML Toolkit facilitates and simplifies via examples & guidance
      Premium solutions (ITSI, UBA) provide out-of-the-box ML capabilities
  23. Overview of Machine Learning at Splunk: Core Platform Search, Packaged Premium Solutions, Machine Learning Toolkit. + Smarter Splunk. Platform for Operational Intelligence
  24. Machine Learning in ITSI (IT Service Intelligence): Adaptive Thresholds, Anomaly Detection, Cohesion Detection, Predictive Analytics, Clustered Notable Events, Automated Actions, Assisted Deep Dive Investigation. Data: Application logs, Network logs, Metrics, Server logs (Time Series in Splunk). Diagram labels: INTELLIGENCE, KPIs, MLTK Customization, Machine Learning
  25. Finding Outliers (IT Service Intelligence)
      Adaptive Thresholding:
      • Learn baselines & dynamic thresholds
      • Alert & act on deviations
      • Manage for 1000s of KPIs & entities
      • Stdev/Avg, Quartile/Median, Range
      Trending/Cohesive Anomaly Detection:
      • Find “hiccups” in expected patterns
      • Catches deviations beyond thresholds
      • Advanced proprietary algorithms
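The "Stdev/Avg" and "Quartile/Median" methods named on slide 25 can be illustrated with a small Python sketch that learns bounds from a baseline window and flags new values outside them. This is a generic illustration, not ITSI's implementation: the function names, the multipliers `k=3.0` and `k=1.5`, and the sample data are all assumptions made for the example.

```python
import statistics


def stdev_threshold(values, k=3.0):
    """Lower/upper bounds as mean -/+ k * (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return mean - k * sd, mean + k * sd


def quartile_threshold(values, k=1.5):
    """Bounds from the interquartile range: Q1 - k*IQR .. Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(values, bounds):
    """Return the values that fall outside the (low, high) bounds."""
    low, high = bounds
    return [v for v in values if v < low or v > high]


# Baseline window of a hypothetical KPI, then one new reading of 50.
history = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
print(outliers(history + [50], quartile_threshold(history)))
print(outliers(history + [50], stdev_threshold(history)))
```

Quartile-based bounds are less sensitive to extreme values in the baseline than mean and standard deviation, which is one reason to offer both methods.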
  26. Event Analytics (IT Service Intelligence)
      • Prioritize event insights with service context, logs & metrics
      • Group related events to highlight the most meaningful ones
      • Reduce noise and alert on root causes of issues
      • Use ML algorithms to group similar events (Smart Mode)
  27. Machine Learning in Splunk UBA. Data sources: Endpoint logs, Server logs, Identity logs. Machine Learning produces 60+ anomaly classifications and 20+ threat classifications. Examples shown: Suspicious Data Movement, Unusual Machine Access, Flight Risk User, Unusual Network Activity, Machine Generated Beacon, Lateral Movement, Suspicious Behavior, Compromised User Account, Data Exfiltration, Malware Activity
  28. Sophisticated Security Modeling in UBA: How does it look? Diagram labels: Raw Events, 15+ Streaming Models, Aggregated Events, 60+ Batch Models, 165+ Detections, 60+ Anomaly Types, IOCs, Contextual Intelligence, Entity Scoring, Specialized Threat Models, 20+ Threat Types, Kill-chain Analysis, Graph Analysis, Custom Threats
  29. Splunk Machine Learning Toolkit (MLTK): Extends Splunk to operationalize Machine Learning
      Built for the Citizen Data Scientist
      • Experiments and Assistants: Guided model building, testing, and deployment for common objectives
      • Algorithms: 80+ standard algorithms (supervised & unsupervised)
      Extensible to operationalize any use case
      • Python for Scientific Computing Library: Access to 300+ open source algorithms
      • Deep Learning Toolkit: Supports NN and GPU accelerated machine learning
      • ML-SPL API: Import any open-source or proprietary algorithm
  30. Custom ML with the Splunk Platform: Operationalized Data Science Pipeline. Platform for Operational Intelligence
      Pipeline steps (each labeled Splunk, MLTK or Ecosystem in the diagram): Collect Data; Clean & Munge; Search & Explore; Pre-processing; Feature Selection; Choose Algorithm; Build, Test, Improve Models; Visualize & Share; Operationalize; Monitor; Alert
      Ecosystem: Splunk’s App Ecosystem contains 1000s of free add-ons for getting data in, applying structure and visualizing your data, giving you faster time to value.
      MLTK: The Machine Learning Toolkit delivers new SPL commands, custom visualizations, assistants, and examples to explore a variety of ML concepts.
      Splunk: Splunk Enterprise is the mission-critical platform for indexing, searching, analyzing, alerting and visualizing machine data.
  31. Customer Success Stories
  32. Recent Customer Success Stories @ .conf19
      • T-Mobile (US): Enhanced Anomaly Detection: Join T-Mobile and Splunk as we Deep Dive an Enterprise-IT Operational Use Case
      • Ministry of Energy, State of Israel: Add value to your SIEM: how Israel's Ministry of Energy applies Machine Learning to protect their Critical Infrastructure and OT Operations
      • SIEMENS AG: Augment Your Security Monitoring Use Cases with Splunk's Machine Learning Toolkit
      Learn more at conf.splunk.com with 900+ presentations available online!
  33. Splunk Machine Learning Advisory Program
      1) Get help from the Splunk Data Scientists to solve your business use case with Machine Learning Toolkit
      2) Complimentary support with your Enterprise or Cloud license
      3) Early access to new Machine Learning features
      4) Results in opportunity to tell your success story with Splunk
      5) Contact mlprogram@splunk.com or your Splunk account team for more information
  34. Splunk ML Advisory Customers
  35. What’s new in MLTK 5.0
  36. Machine Learning Toolkit 5.0: New capabilities continue to make machine learning easily accessible by more users and extensible with connectors
      • Easier to navigate with a new, modern showcase layout
      • Smarter with the introduction of the new Smart Outlier Detection Assistant for anomaly detection
      • Migration to Python 3
      • Applicable to more use cases with the Smart Forecasting Assistant with Multivariate Forecasts and Special Days Effects
  37. Deploying and Applying ML with Splunk
  38. Continuous Data Ingest at Scale. Real-time data from OT (Industrial Assets) and IT (Consumer and Mobile Devices) arrives via:
      • Native Inputs: TCP, UDP, Logs, Scripts, Wire, Mobile
      • Industrial Data: SCADA, AMI, Meter Reads
      • Modular Inputs: MQTT, AMQP, COAP, REST, JMS
      • HTTP Event Collector: Token Authenticated Events
      • Technology Partnerships: Kepware, AWS IoT, Cisco, Palo Alto
      External Lookups/Enrichment: Maintenance Info, Asset Info, Data Stores
      Capabilities: Develop, Visualize, Predict, Alert, Search
      Users: Engineers, Data Analysts, Security Analysts, Business Users
  39. Every Search Can Use Machine Learning. Real-time data from OT (Industrial Assets) and IT (Consumer and Mobile Devices) feeds Search and Alert. Alert actions: Send an email, File a ticket, Send a text, Flash lights, Trigger process flow. Targets: Email, Tickets, Smartphones and Devices, Third-Party Applications
  40. MLTK + Python for Scientific Computing. Real-time data from OT (Industrial Assets) and IT (Consumer and Mobile Devices) feeds Search. | fit y from x* into "model" writes a persisted model, and | apply "model" … uses it to drive Visualize and Alert.
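The split between fitting a model once and applying the persisted model to new data, as slide 40 shows with the fit and apply commands, can be sketched in plain Python. This is an analogy only and not how MLTK stores or scores models: the functions `fit` and `apply`, the JSON file format, and the file name `example_model.json` are all assumptions for illustration, and the model is a one-variable least-squares line.

```python
import json
import os
import tempfile


def fit(xs, ys, path):
    """Fit y = a*x + b by ordinary least squares and persist the coefficients."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    with open(path, "w") as fh:
        json.dump({"a": a, "b": b}, fh)


def apply(xs, path):
    """Load the persisted coefficients and score new inputs."""
    with open(path) as fh:
        model = json.load(fh)
    return [model["a"] * x + model["b"] for x in xs]


# Hypothetical model location in the system temp directory.
model_path = os.path.join(tempfile.gettempdir(), "example_model.json")
fit([1, 2, 3, 4], [3, 5, 7, 9], model_path)  # training step, run occasionally
print(apply([5, 6], model_path))             # scoring step, run on new data
```

Keeping the two steps separate means the comparatively expensive training can run on a schedule while scoring stays cheap enough for frequent searches and alerts.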
  41. Deep Learning Toolkit for Splunk. Real-time data from OT (Industrial Assets) and IT (Consumer and Mobile Devices) feeds Search. | fit y from x* into "model" writes a persisted model, and | apply "model" … uses it to drive Visualize and Alert.
  42. Live Demo: Splunk Machine Learning Toolkit (MLTK)
  43. Announcing the Deep Learning Toolkit for Splunk with TensorFlow 2.0, PyTorch, NLP and Jupyter Lab Notebooks. Philipp Drieger, Staff Machine Learning Architect, Splunk
  44. Key Benefits of the MLTK Container
      • Seamlessly Integrate with Splunk Enterprise and Machine Learning Toolkit Workflows
      • Freedom of Code within Jupyter Lab Notebooks for Advanced Modelling with TensorFlow and PyTorch
      • GPU accelerated Deep Learning for Compute Intensive Training Workloads
  55. Deep Learning Toolkit for Splunk: Key Takeaways
      1. Extend your Splunk platform with the Deep Learning Toolkit for Splunk
      2. Integrate custom advanced deep learning and NLP models into Splunk using a predefined Jupyter Notebook workflow for rapid model development
      3. Leverage GPUs for compute-intensive training tasks
  56. Outlook: new products announced at .conf19. Data Stream Processor (DSP)
  57. Splunk Data Stream Processor: A real-time stream processing solution that collects, processes and delivers data to Splunk and other destinations in milliseconds
      Sources: Log Files, Online Shopping Cart, Cell Phones and Devices, RFID, Messaging, Patient Generated Data, Servers, Web Services, Call Detail Records
      Goals: Protect sensitive data; Take action on data in motion; Turn raw data into high-value information; Distribute data to Splunk or other destinations
      Functions: Filter, Format, Enrich, Mask Sensitive Data, Detect data patterns or conditions, Aggregate, Normalize, Transform, Track and monitor pipeline health
      Other diagram labels: Data Warehouse, Public Cloud, Message Bus
  58. Use Cases (DATA IN MOTION)
      • Filtering/Noise Removal: Filter out or route noisy data to specific destinations
      • Data Routing: Guarantee delivery of high-volume, high-velocity data to multiple destinations
      • Data Formatting: Format or organize data using various functions based on specified conditions
      • Data Aggregation: Aggregate data based on specific conditions and identify abnormal patterns in data
  59. Introducing Unbounded ML in DSP: Derive insights on data in motion
      Streaming Analytics: Derive insights while data is still in motion
      ● Automatic detection of patterns and anomalies in raw logs
      ● Advanced pattern matching
      ● Sequential outlier detection
      ● Multi-source correlation
      Continuous Intelligence
      ● Algorithms that learn continuously
      ● No-downtime machine learning systems
      ● Unbounded in cardinality of models and data volume
      Advanced Analytics
      ● Online classification, clustering, time series forecasting, changepoint detection etc. baked in
      ● Self-tuning algorithms, no manual hyperparameter tuning needed
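To make "algorithms that learn continuously" and "sequential outlier detection" from slide 59 concrete, the Python sketch below uses Welford's online algorithm to keep a running mean and variance in constant memory and flags values far from the running mean. It is a generic textbook technique swapped in for illustration; the talk does not describe which algorithms DSP uses. The class name `RunningStats`, the function `detect`, the `k` and `warmup` parameters, and the sample stream are assumptions.

```python
import math


class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stdev(self) -> float:
        """Population standard deviation of the values seen so far."""
        return math.sqrt(self.m2 / self.n) if self.n > 0 else 0.0


def detect(stream, k=3.0, warmup=5):
    """Yield values lying more than k standard deviations from the running mean."""
    stats = RunningStats()
    for x in stream:
        # Score each value against the statistics of the values seen before it.
        if stats.n >= warmup and abs(x - stats.mean) > k * stats.stdev():
            yield x
        # The value is then folded into the statistics, outlier or not.
        stats.update(x)


print(list(detect([10, 11, 9, 10, 12, 10, 50, 11, 10])))
```

Because each event updates the model in place, there is no separate retraining step. A production system would also need to decide whether flagged values should be excluded from the running statistics, which this sketch does not do.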
  60. Anomaly Detection on Stream. General questions: DSP-SplunkNext@splunk.com