Bibu Labs: Problems with ML based Attack Detection in Enterprise Security (2018)

The following are segments of a presentation we gave at the UofT AI Squared Forum in 2018: http://www.aisquaredforum.ca/

Please Note: The following content is based on our own independent research work and is not intended to reflect any particular client environment. The intention of our talk was to demystify ML-based solutions in Enterprise Security. We have decided not to reveal any specific exploit.

References:
Slide 3: Numbers were obtained from Cisco:
https://www.cnbc.com/video/2017/05/11/there-are-20-billion-cyber-attacks-every-day-cisco-.html

Slide 4: Numbers were obtained from MIT Technology Review and Forbes:
https://www.technologyreview.com/f/610043/hackers-stole-172-billion-from-people-in-2017/
https://www.forbes.com/sites/stevemorgan/2016/01/17/cyber-crime-costs-projected-to-reach-2-trillion-by-2019/

Slide 7: We drew baseline numbers from a few industry papers (mostly surveys sponsored by cybersecurity vendors). Feedback from our conversations with large SOC shops reaffirmed the trend.

Published in: Technology

  1. The “Needle in a Haystack” Problem of Attack Detection in Cybersecurity and How AI Can Solve It. Tahseen Shabab, Founder, Bibu Labs
  2. The Problem
  3. 20 billion cyber attacks per day globally; 3.3 billion searches per day globally
  4. $172 B lost globally to hackers in 2017; $2 T projected loss to hackers in 2019
  5. $172 B lost globally to hackers in 2017; $2 T projected loss to hackers in 2019 (10X)
  6. Bob
  7. 300+ alerts per day + 60-75% noise (false positives) = 2-7 days to detect attacks
  8. 300+ alerts per day + 60-75% noise (false positives) = 2-7 days to detect attacks; 10X devices over next 5 yrs
  9. 300+ alerts per day + 60-75% noise (false positives) = 2-7 days to detect attacks; 10X devices over next 5 yrs; complexity of sensor data
  10. Hackers: Organized; Sophisticated; Targeted
  11. The Defense
  12. Enterprise Security (architecture diagram). Labels: HR, Data Lake, Router, IPS/IDS, End Point, Server, Threat Intel, FW, Decoy, Sensors, SIEM Tool, Anomaly Detection, Orchestration, IDS, NAC, Antivirus, FW, Controls, Analysts, APIs
  13. SOC Analysts
      • Tier 1: Monitoring; Open Ticket; Close False Positives; Basic Investigation/Mitigation
      • Tier 2: Deep Investigation; Mitigation/Recommends changes
      • Tier 3: Advanced Investigation; Prevention; Counterintelligence; Malware Reverse Engineering
  14. SOC Analysts (Illustration)
      • Tier 1: Monitoring; Open Ticket; Close False Positives; Basic Investigation/Mitigation
      • Tier 2: Deep Investigation; Mitigation/Recommends changes
      • Tier 3: Advanced Investigation; Prevention; Counterintelligence; Malware Reverse Engineering
      • Escalation: Tier 1 → Tier 2 → Tier 3
  15. The Enemy
  16. Hackers using AI
      • Situation-Aware Malware
      • Adaptive
      • Understand environment, make calculated decisions (clock cycles/web.config)
      • Success-Based Learning
      • Adversarial Learning
  17. Adaptive Nature of Hackers (Cat and Mouse Game)
      • Hackers take the path of least resistance
      • If a patch has been deployed, hackers will try another route
      • Diagram labels: Vulnerability 1, Vulnerability 2, Vulnerability 3
  18. Problems with ML Solutions
  19. Problems Specific to Cybersecurity
      • Imbalanced dataset: ~0.001% of the dataset correlates to a hack
      • Dynamic environment: traffic, user behavior, attacker behavior
      • Attack patterns are not necessarily carried forward between organizations
      • Threat Intelligence
  20. User Behavior Analytics
      • Aims to detect insider threats, targeted attacks, fraud
      • Understands patterns of real users, looks for anomalies
      • Algorithms and statistical analysis
  21. Problems with User Behavior Analytics
      • Analysts waste time on False Positives
      • Illustration*: Event Data Sources (ex. Server, Endpoint, etc.) and State Data Sources (ex. HR System); Time Series Analysis; Aggregation; User Behavior of Sales Executives
  22. Problems with User Behavior Analytics
      • Analysts waste time on False Positives
      • Illustration*: User Behavior of Sales Executives, showing a legitimate deviation from norm and a sophisticated lateral movement
      • Priority: 1. False Positives; 2. False Positives; 3. False Positives; 4. False Positives; 5. Sophisticated Attack
  23. Problems with User Behavior Analytics
      • Analysts waste time on False Positives
      • Illustration*: User Behavior of Sales Executives, showing a legitimate deviation from norm and a sophisticated lateral movement
      • Priority: 1. False Positives; 2. False Positives; 3. False Positives; 4. False Positives; 5. Sophisticated Attack
  24. Bayesian Statistics in Network Traffic Analysis
      • Bayesian Model
      • Users, Devices, Time, Activity, Network Traffic
      • Conditional Probability: P(D|T), P(D|U,T), …
      • False positives
      • “Known to cause many false positives”, The Antivirus Hacker's Handbook
      • “Bypassing Bayesian Networks is simple, write malware as similar as possible to goodware”, The Antivirus Hacker's Handbook
  25. Decoys (diagram). Labels: web.config decoy; Lateral movement; Attacker behavior; Supervised Learning; Learn pattern of attack specific to organization
  26. Prioritization due to existing controls
      • Priority of similar alerts is based on context
      • Diagram labels: Attacker; Malicious Payload <script>alert(123)</script>; FW; Web App without XSS library; Web App with XSS library; Log; Alert
  27. Cyber AI Platform-as-a-Service: AI modules to address the problems of analyst stress, accuracy, and task optimization.
  28. Following slides removed. Please contact me if you wish to learn more: tahseen@bibulabs.com
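
Slide 19's imbalance figure (~0.001% of the dataset correlates to a hack) goes a long way toward explaining the false-positive load on slides 21 to 23. The sketch below applies Bayes' rule to that base rate. The detector's true-positive and false-positive rates are hypothetical values chosen for illustration; they do not come from the talk.

```python
def alert_precision(prevalence: float, tpr: float, fpr: float) -> float:
    """Return P(attack | alert) for a detector, using Bayes' rule.

    prevalence: fraction of events that are truly malicious
    tpr: true-positive rate, P(alert | attack)
    fpr: false-positive rate, P(alert | benign)
    """
    true_alerts = prevalence * tpr
    false_alerts = (1.0 - prevalence) * fpr
    return true_alerts / (true_alerts + false_alerts)


# ~0.001% of events are attack-related (figure from slide 19).
PREVALENCE = 0.001 / 100

# Hypothetical detector: catches 99% of attacks, misfires on 1% of benign events.
precision = alert_precision(PREVALENCE, tpr=0.99, fpr=0.01)
print(f"Share of alerts that are real attacks: {precision:.4%}")
```

Under these assumed rates, roughly one alert in a thousand corresponds to a real attack, even though the detector looks accurate on paper.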
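
The conditional probabilities on slide 24, such as P(D|U,T), can be estimated from event counts. The sketch below is one minimal way to do that, assuming D, U, and T stand for device, user, and time (hour of day); the slide does not spell these out. The class name, the smoothing constant, and the sample events are all hypothetical.

```python
from collections import Counter


class DeviceGivenUserTime:
    """Count-based, Laplace-smoothed estimate of P(D | U, T)."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha              # smoothing strength (assumed value)
        self.joint = Counter()          # (user, hour, device) -> count
        self.context = Counter()        # (user, hour) -> count
        self.devices = set()            # devices seen so far

    def observe(self, user: str, hour: int, device: str) -> None:
        self.joint[(user, hour, device)] += 1
        self.context[(user, hour)] += 1
        self.devices.add(device)

    def prob(self, device: str, user: str, hour: int) -> float:
        # Reserve one extra outcome for "any device not seen before".
        outcomes = len(self.devices) + 1
        numerator = self.joint[(user, hour, device)] + self.alpha
        denominator = self.context[(user, hour)] + self.alpha * outcomes
        return numerator / denominator


model = DeviceGivenUserTime()
for _ in range(8):
    model.observe("alice", 9, "laptop")
for _ in range(2):
    model.observe("alice", 9, "phone")

usual = model.prob("laptop", "alice", 9)    # device alice normally uses at 09:00
unusual = model.prob("server", "alice", 9)  # device never seen for alice
print(usual, unusual)
```

A low probability flags an anomaly, but rare legitimate behavior scores low too, and an attacker acting from the usual device at the usual hour scores as normal. Both effects are in the spirit of the weaknesses quoted on slide 24.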
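
Slide 26's point, that similar alerts deserve different priority depending on the controls already in place, can be sketched as a small triage rule. This is an illustrative assumption about how such a rule might look, not the method used in the talk: the `Alert` type, the `PROTECTED_APPS` inventory, and the substring check for script tags are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    payload: str   # request content that triggered the alert
    target: str    # name of the application that received it


# Hypothetical inventory of apps that already sanitize XSS payloads.
PROTECTED_APPS = {"Web App with XSS library"}


def triage(alert: Alert) -> str:
    """Return "log" when an existing control neutralizes the payload, else "alert"."""
    looks_like_xss = "<script" in alert.payload.lower()
    if looks_like_xss and alert.target in PROTECTED_APPS:
        return "log"
    # Unprotected or unknown targets, and any other payload, go to an analyst.
    return "alert"


payload = "<script>alert(123)</script>"
print(triage(Alert(payload, "Web App with XSS library")))
print(triage(Alert(payload, "Web App without XSS library")))
```

The same payload is only logged for the protected app and raised as an alert for the unprotected one, which is the context-dependent prioritization the slide describes.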
