SplunkLive! Prelert Session - Extending Splunk with Machine Learning

Published in Technology, Education
  • Probability distributions of data come in all shapes and sizes – rarely do they fit a nice bell curve

Transcript

  • 1. Extending Splunk with Machine-learning Predictive Analytics Rich Collier Solutions Architect rich@prelert.com
  • 2. Why Machine Learning? • Overcome limitations of human analysis • Auto-learn baseline behavior using proper modeling • Detect anomalous behavior
  • 3. Why Machine Learning? • Overcome limitations of human analysis • Auto-learn baseline behavior using proper modeling • Detect anomalous behavior
  • 4. Overcoming limitations of Human Analysis • Judging what’s “normal” is not always easy • Humans don’t always choose the right techniques
  • 5. IPTables (firewall) • How to find most anomalous users (aggressive brute force attackers)? • Here is a typical (manual) process
  • 6. Step 1) Search Questions: What’s normal? What about that spike? Probably should try to visualize counts by SRC over time…
  • 7. Step 2) stats command, sort by count Question: How to show as a function of time, not just overall?
  • 8. Step 3) add bucketing for breakdown by time Question: What is an anomalous count per bucket? 100? 1000? 10,000? Maybe we should try to use some more stats?
  • 9. Step 4) add some “basic” statistical analysis: avg +/- 2 standard deviations Question: How to show the individual “outliers” (and not lose the concept of time)?
  • 10. Step 5) use eventstats to repair the time problem and add a “where” clause to show only the counts outside avg +/- 2 standard deviations Question: Are these 161 results accurate? (I hope you didn’t build an alert and get 161 of them!)
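Steps 2 through 5 can be sketched outside Splunk to make the logic concrete. This is a minimal Python illustration, not the searches shown on the slides: the function names, the one-hour bucket span, the per-source statistics, and the sample data are all assumptions.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

def bucket_counts(events, span=3600):
    """Count events per (src, bucket start); events are (epoch_seconds, src) pairs."""
    counts = Counter()
    for ts, src in events:
        counts[(src, ts - ts % span)] += 1
    return counts

def two_sigma_outliers(counts):
    """Keep buckets whose count lies outside avg +/- 2 stdev for that src."""
    per_src = defaultdict(list)
    for (src, _bucket), n in counts.items():
        per_src[src].append(n)
    rows = []
    for (src, bucket), n in sorted(counts.items()):
        avg, sd = mean(per_src[src]), pstdev(per_src[src])
        if n > avg + 2 * sd or n < avg - 2 * sd:
            rows.append((src, bucket, n))
    return rows

# Made-up sample: one source, ten hourly buckets, a burst in the last hour.
hourly = [2, 3, 2, 3, 2, 3, 2, 3, 2, 40]
events = [(hour * 3600 + k, "10.0.0.1")
          for hour, c in enumerate(hourly) for k in range(c)]
outliers = two_sigma_outliers(bucket_counts(events))
```

Because each row keeps its bucket start, the outliers stay tied to time, which is the problem Step 5 sets out to repair.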
  • 11. Problem: Statistical modeling is INCORRECT for this data – a count of (-75) events doesn’t make sense for avg - 2 standard deviations – how much confidence do you have in avg + 2 standard deviations? Result: • Wrong model = false positives/negatives
  • 12. The Problem: avg +/- 2 standard deviations assumes data is Gaussian (Bell Curve). Clearly, this data is better fit by a Poisson curve
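A small Python sketch shows why the bell-curve assumption misleads on count data. The sample counts are invented, and fitting the Poisson rate to the sample mean is a simplifying assumption, not the method used in the talk.

```python
import math
from statistics import mean, pstdev

def two_sigma_band(counts):
    """Return (avg - 2*stdev, avg + 2*stdev), the band a Gaussian model implies."""
    avg, sd = mean(counts), pstdev(counts)
    return avg - 2 * sd, avg + 2 * sd

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), clamped at 0 against rounding error."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)

# Invented per-bucket event counts: mostly quiet, one large burst.
counts = [0, 0, 1, 0, 2, 0, 1, 0, 0, 30]
low, high = two_sigma_band(counts)   # low is negative: an impossible event count
lam = mean(counts)                   # assumed Poisson rate
burst_probability = poisson_tail(30, lam)
```

The lower edge of the band falls below zero, as with the (-75) events on the slide, while a Poisson model can never assign probability to a negative count.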
  • 13. Examples of Non-Gaussian Data: status=503, Memory Utilization, CPU load, status=404, Revenue Transactions
  • 14. One More Problem… • Even if the demonstrated technique was accurate: – Still need to persist what you’ve learned “so far” so that you don’t have to keep re-inspecting historical data as new data comes in – This requires you to manually write/read information into a summary index
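The persistence problem can be illustrated with running statistics that are updated one value at a time and saved between runs. This sketch uses Welford's online algorithm and a JSON string as a stand-in for a summary index; the class and method names are hypothetical.

```python
import json
import math

class RunningStats:
    """Mean and stdev updated incrementally, so history need not be re-read."""

    def __init__(self, n=0, mean=0.0, m2=0.0):
        self.n, self.mean, self.m2 = n, mean, m2

    def update(self, x):
        # Welford's online update for mean and sum of squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def stdev(self):
        return math.sqrt(self.m2 / self.n) if self.n else 0.0

    def to_json(self):
        # Stand-in for writing the learned state to a summary index.
        return json.dumps({"n": self.n, "mean": self.mean, "m2": self.m2})

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))

# First run sees some data and saves its state.
stats = RunningStats()
for x in [2, 4, 4, 4]:
    stats.update(x)
saved = stats.to_json()

# A later run restores the state and continues with new data only.
resumed = RunningStats.from_json(saved)
for x in [5, 5, 7, 9]:
    resumed.update(x)
```

The resumed object ends with the same mean and standard deviation as a single pass over all eight values.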
  • 15. Why Machine Learning? • Overcome limitations of human analysis • Auto-learn baseline behavior using proper modeling • Detect anomalous behavior
  • 16. First, an Analogy • How could I accurately predict how much postal mail is likely to be delivered to your home tomorrow?
  • 17. I Would… • Watch your mail delivery for a while – 1 day? – 1 week? – 1 month? – 1 year? • Use my observations to create a…
  • 18. Average? Std. Deviation? Probability Distribution Function?
  • 19. A Probability Distribution Function! Best for my house. [Chart: % likelihood (probability) vs. pieces of mail per day]
  • 20. A Probability Distribution Function! College Student? [Chart: % likelihood (probability) vs. pieces of mail per day]
  • 21. A Probability Distribution Function! My Mom. [Chart: % likelihood (probability) vs. pieces of mail per day]
  • 22. Using Machine Learning to build a Probability Distribution Function • PDF must be built specifically for each “instance” • PDF should be constructed automatically merely by watching the data
  • 23. Using Machine Learning to build a Probability Distribution Function
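The two requirements on slide 22 (one PDF per instance, built merely by watching the data) can be sketched with a plain histogram per instance. This is a deliberately simple Python stand-in, not Prelert's modeling; the class name, instance labels, and mail counts are made up.

```python
from collections import Counter, defaultdict

class EmpiricalPDF:
    """Histogram-based probability estimate learned by observing values."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def observe(self, value):
        self.counts[value] += 1
        self.total += 1

    def probability(self, value):
        # Relative frequency of `value` among everything observed so far.
        return self.counts[value] / self.total if self.total else 0.0

# One model per instance, created automatically on first sight.
models = defaultdict(EmpiricalPDF)

observations = {
    "my_house": [3, 4, 3, 5, 3, 4, 3, 2],   # pieces of mail per day
    "college_student": [0, 0, 1, 0],
}
for instance, values in observations.items():
    for v in values:
        models[instance].observe(v)
```

The same value can be ordinary for one instance and unseen for another, which is why a single shared model would not do.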
  • 24. Now what?
  • 25. Why Machine Learning? • Overcome limitations of human analysis • Auto-learn baseline behavior using proper modeling • Detect anomalous behavior
  • 26. Finding “what’s unexpected”… Your job is often looking for unexpected change in your environment, either proactively through monitoring or reactively through diagnostics/troubleshooting
  • 27. Using the PDF to Find What is Unexpected: zero pieces of mail? fifteen pieces of mail? [Chart: % likelihood (probability) vs. pieces of mail per day]
  • 28. Relate back to data in Splunk • # Pieces of mail = # events of a certain type – number of failed logins – number of errors of different types – number of events with certain status codes – etc. • Or, performance metrics – response time – utilization %
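Once a distribution exists, an observation can be scored by how improbable it is. The sketch below assumes a Poisson model with a rate of 4 items per day and uses the negative log of the smaller tail probability as the score; both choices are illustrative assumptions, not taken from the slides.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def unlikelihood(k, lam):
    """-log10 of the smaller tail probability; higher means more unexpected."""
    lower = sum(poisson_pmf(i, lam) for i in range(k + 1))    # P(X <= k)
    upper = 1.0 - sum(poisson_pmf(i, lam) for i in range(k))  # P(X >= k)
    p = max(min(lower, upper), 1e-300)  # guard against log10(0)
    return -math.log10(p)

RATE = 4.0  # assumed average pieces of mail (or events) per day
scores = {k: unlikelihood(k, RATE) for k in (0, 4, 15)}
```

Under this assumed rate, fifteen items scores as more unexpected than zero, and a typical day of four scores lowest. The same scoring applies whether the count is pieces of mail, failed logins, or events with a given status code.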
  • 29. Back to our Example!
  • 30. • Prelert Anomaly Detective – Automatically and correctly models data via self-learning – Applies sophisticated Bayesian techniques – Persists “on-going” analysis to allow real-time alerting – Makes it easy to use • 3 significant alerts, not 161!
  • 31. • Results are: – Accurate outliers – Automatically clustered and scored by their probabilistic “unlikelihood” – Relevant in time, easy to make alerts – Clickable for drill-down
  • 32. • Drill-downs: – Automatically construct useful search syntax and time selection – Show anomalies in context of the original data – Serve as a possible jumping-off point for subsequent manual mining
  • 33. Automated Anomaly Detection • Less time searching & troubleshooting • Proactive trustworthy alerts without thresholds • Auto-discovers the previously unknown
  • 34. Automated Anomaly Detection for splunk> Additional Use Cases
  • 35. Use Case: Correlating Anomalies Across Data Types • Data sources: – App logs – Network performance – SQL-Server metrics • Prelert identifies network discards that cause app to disconnect from DB
  • 36. Use Case: Servers making unusual TCP connections • Data source: Netstat • Prelert finds a rare FTP connection from a server that doesn’t normally use FTP
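The rare-connection idea can be illustrated with a simple frequency count: how much of a server's past traffic used a given port. This Python sketch is only an illustration of rarity, not how Prelert works; the server names, ports, and counts are invented.

```python
from collections import Counter

def port_share(history, server, port):
    """Fraction of a server's past connections that used `port`."""
    per_server = Counter(s for s, _ in history)
    per_pair = Counter(history)
    total = per_server[server]
    return per_pair[(server, port)] / total if total else 0.0

# Invented connection history as (server, destination port) pairs.
history = (
    [("web01", 443)] * 50
    + [("web01", 80)] * 30
    + [("db01", 1433)] * 40
)

usual = port_share(history, "web01", 443)  # a port this server uses often
rare = port_share(history, "web01", 21)    # FTP, never seen from this server
```

A share at or near zero marks the connection as unusual for that particular server, even if the port is common elsewhere.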
  • 37. Use Case: Revenue Transactions • Data source: Custom logs • Prelert identifies unusual $0.60 transaction – traced to bug in currency conversion
  • 38. Use Case: Clients pervasively visiting rare URLs • Data source: BlueCoat proxy • Prelert identifies users abusing Internet privileges (gambling sites, porn sites)
  • 39. Use Case: Monitoring Performance w/o Thresholds • Response time of online bank website • Prelert alerts on spikes without the need to create a single threshold
  • 40. Use Case: Unusual outbound traffic rates • Data source: BlueCoat proxy • Prelert identifies client attempting to exploit an outside IIS webserver
  • 41. Automated Anomaly Detection for splunk>