We are writing the year 2017. Cyber security has been a discipline for many years and thousands of security companies are offering solutions to deter and block malicious actors in order to keep our businesses operating and our data confidential. But fundamentally, cyber security has not changed during the last two decades. We are still running Snort and Bro. Firewalls are fundamentally still the same. People get hacked for their poor passwords and we collect logs that we don't know what to do with. In this talk I will paint a slightly provocative and dark picture of security. Fundamentally, nothing has really changed. We'll have a look at machine learning and artificial intelligence and see how those techniques are used today. Do they have the potential to change anything? How will the future look with those technologies? I will show some practical examples of machine learning and motivate that simpler approaches generally win. Maybe we find some hope in visualization? Or maybe Augmented reality? We still have a ways to go.
• IBM Research
• Big Data
My Provocative Premise
• Cyber Defense / Monitoring / Analytics is still at the level of 1999
• We can’t predict the weather and we have done it since 1 August 1861
o “The weather predicted by the BBC for four days time was just 30-40% accurate”
• Predicting election results anyone?
o “80% chance Clinton will win.”
• Nothing Has Changed in Security (Defense)
• Machine Learning & Artificial Intelligence
• Now What?
Summary of Technologies
• Firewalls – policy management, auditing a challenge
• IDS/IPS – false positives
• Threat Intelligence – really the same as IDS signatures
• DLP – just an IDS engine
• Vulnerability Scanners – what’s up with those old user interfaces?
• SIEM – still the same issues: parsing, context, prioritization
• Security Analytics – can actually mostly be done with your SIEM
Is this the answer to all of our
security problems? Is ML and AI
what we have been waiting for?
Machine Learning / Data Mining
• Anomaly detection (outlier detection)
o What’s “normal”?
• Association rule learning (e.g., items purchased together)
• Regression (model the data)
Data Mining in Security
The graph shows an abstract
space with colors being machine
Machine Learning in Security
•Needs a corpus of data to learn from
•Network traffic analysis
still not working
oNo labeled data
o Not sure what the right
features should be
•Works okay for SPAM
Artificial Intelligence in Security
•Just calling something AI doesn’t make it AI.
”A program that doesn't simply classify or compute model
parameters, but comes up with novel knowledge that a
security analyst finds insightful.”
Artificial Narrow Intelligence (ANI)
• Computer programs we have today that perform a specific, narrow task: Deep Blue, Amazon recommendations
Artificial General Intelligence (AGI)
• A program that could learn to complete any task
• What many of us imagine when we think of AI, but no one has managed to accomplish it yet
Artificial Superintelligence (ASI)
• Any computer program that is all-around smarter than a human (also see the singularity by Ray Kurzweil)
The Law of Accelerating Returns – Ray Kurzweil
• We have tried many thing:
o Social Network Analysis
o Seasonality detection
o Entropy over time
o Frequent pattern mining
• All kinds of challenges
o Characterize normal
o Extract what has been learned
o Statistical vs. domain anomalies
• Simple works!
Areas To Explore
• Environment specific rather than environment agnostic approaches
o Same IDS signatures for everyone? Same SIEM signatures?
o Real-time threat intel sharing
o Users don’t think in IP addresses, they think about users
o Topology mapping anyone?
o User-based policies, not machine based
o Adaptive security
• Capture expert knowledge
o Collaborative efforts
• Forget about 3D visualization 😊
Promising Approaches That Will “Change” Security
• Continuous authentication
• Dynamic policy decisions – automation – really closing the loop
o But what products do this well? Open APIs, low f/p, etc.
• Micro segmentation (including SDN?)
• Real-time threat intelligence sharing
• Human assisted machine learning systems
• Crowd sourcing
• End-user involved / assisted decision making
• Eradicate phishing, please!
How Will ML / AI Help?
• Machine learning consists of algorithms that need data
o Garbage in - garbage out
o Data formats and semantics
• Deep learning is just another ML algorithm
o Malware classification (it isn’t necessarily better than other ML algorithms)
o Basically eliminates the feature engineering step
• Many inherent challenges (see https://www.youtube.com/watch?v=CEAMF0TaUUU)
o Distance functions
o Context – need input from HR systems and others
o Choice of algorithm
• Where to use ML
o Classification problems (traffic, binaries, activities, etc.)
o There is good work being done on automating the level 1 analyst
o Look for systems that leverage humans in the loop (see topic of knowledge capture)
Security Visualization Community
• List: secviz.org/mailinglist
• Twitter: @secviz
Share, discuss, challenge, and learn about security visualization.
Visual Analytics -
Delivering Actionable Security
July 22-25 2017, Las Vegas
big data | analytics | visualization
Sophos – Security Made Simple
• Products usable by non experts
delightful for the security analyst
• Consolidating security capabilities
• Data science to SOLVE problems
not to highlight issues