There's no question that modern advances in AI and deep learning have allowed organizations to greatly scale their defensive capabilities. Between detecting evolving threats, automating discovery, fighting dynamic attacks, and even freeing up time for IT professionals, AI-fueled automation has been a boon for system defenders. But before we get too comfortable, we need to remember that there is another side to this fight.
In this talk, we'll take a look at how AI technologies are enhancing adversarial capabilities and how challenges in defensive machine learning are opening up new attack surfaces.
2. $ whoami
Name: GTKlondike
(Independent security researcher)
(Consulting is my day job)
Passionate about network security
(Attack and Defense)
NetSec Explained: A passion project and YouTube
channel covering intermediate and advanced
network security topics in an easy-to-understand way.
Hello Again!
4. What Is Machine Learning?
Machine Learning is a set of statistical techniques
that enables information mining, pattern
discovery, and drawing inferences from data.
Machine Learning uses algorithms to “learn” from
past data to predict future outcomes.
And its place in AI
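The "learn from past data to predict future outcomes" idea can be shown with a minimal, stdlib-only sketch. The request-rate data and the 3-sigma rule here are illustrative assumptions, not something from the slides:

```python
import statistics

# Hypothetical training data: request rates (req/s) observed from benign hosts.
benign_rates = [12, 9, 15, 11, 14, 10, 13, 12]

# "Learn" a simple statistical model of normal behavior: mean and stdev.
mu = statistics.mean(benign_rates)
sigma = statistics.stdev(benign_rates)

def predict(rate, k=3):
    """Predict whether a future observation is anomalous (3-sigma rule)."""
    return abs(rate - mu) > k * sigma

print(predict(11))    # in line with past data  -> False
print(predict(450))   # far outside the learned distribution -> True
```

Real tools use far richer models, but the shape is the same: fit parameters on past data, then apply them to new observations.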
9. Overview
Offensive AI Tools
– Why would we even want these?
– Capabilities and trends
Adversarial Machine Learning
– Threat modeling
– Types of attacks
– Defenses
In two parts
11. Offensive AI
Dynamically and intelligently explore the target attack
surface
Operate at machine speed and scale
Assist in automating manual analysis
Uncover hidden blind spots in defensive tools and
software
Why would we even want these?
12. How Realistic is Offensive AI?
Vulnerability Discovery
Exploitation
Post Exploitation
(patching)
Data Theft
DARPA Cyber Grand Challenge 2016
13. Things to Keep in Mind
AI and ML do not automate the decision-making
process
Train a model to decide something, then wrap it in
automation
There is too much to learn
Start small and automate modular tasks
Wait, AI doesn’t solve everything?
14. Applications of Offensive AI
Social engineering
Defense detection and evasion
Evaluating data leaks
Network exploitation
Software exploitation
Where PoC tools already exist
15. Social Engineering
SNAP_R
Generates front-end content based on a target user's social media
history (e.g., posts)
GPT-2
Generates realistic looking long-form content
Lyrebird and Tacotron
Realistic text to speech based on human voice audio samples
StyleGAN
Generative Adversarial Network (GAN) to generate people (and
cats)
Phishing with fake personas
18. Lyrebird and Tacotron
[Audio demos: original voice vs. Tacotron synthetic voice]
Synthetic human voices
Text:
"Only the photographs on the mantelpiece really showed how much time had passed. Ten years ago, there had been lots of pictures of what looked like a large pink beach ball wearing different-colored bonnets - but Dudley Dursley was no longer a baby, and now the photographs showed a large blond boy riding his first bicycle, on a carousel at the fair, playing a computer game with his father, being hugged and kissed by his mother."
19. Detection and Evasion
MarkovObfuscate
Uses Markov chains to obfuscate data (steganography)
Lightbulb Framework
Burp plugin to bypass popular open source WAFs
Sandbox Detection*
Treats process lists as data to quickly identify various
sandboxes
Detect, avoid, hide, escape
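MarkovObfuscate's core trick can be sketched in a few stdlib lines. The corpus, bigram model, and bit-per-branch encoding below are illustrative assumptions, not the tool's actual implementation:

```python
import collections

# Toy cover text (illustrative); the real tool trains on much larger corpora.
corpus = ("the quick brown fox jumps over the lazy dog "
          "the quick red fox runs over the sleepy cat").split()

# Build a bigram model: word -> sorted list of distinct successor words.
model = collections.defaultdict(set)
for a, b in zip(corpus, corpus[1:]):
    model[a].add(b)
model = {w: sorted(s) for w, s in model.items()}

def encode(bits, start="the"):
    """Hide a bit string by choosing between plausible successor words."""
    out, word, bits = [start], start, list(bits)
    while bits:
        succ = model.get(word, [])
        if len(succ) >= 2:
            word = succ[bits.pop(0)]   # the choice itself carries one bit
        elif succ:
            word = succ[0]             # forced move, carries no data
        else:
            word = start               # dead end: restart the chain
        out.append(word)
    return " ".join(out)

def decode(text):
    """Recover the bits from the choices made at branching words."""
    words, bits = text.split(), []
    for a, b in zip(words, words[1:]):
        succ = model.get(a, [])
        if len(succ) >= 2 and b in succ:
            bits.append(succ.index(b))
    return bits
```

The payload rides on which branch the chain takes, so the output reads like (bad) natural language rather than ciphertext; both sides need the same model to decode.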
20. Sandbox Detection
PID ARCH SESS NAME OWNER PATH
1    x64  0  smss.exe      NT AUTHORITY\SYSTEM  \SystemRoot\System32\smss.exe
4    x64  0  csrss.exe     NT AUTHORITY\SYSTEM  C:\Windows\system32\csrss.exe
236  x64  0  wininit.exe   NT AUTHORITY\SYSTEM  C:\Windows\system32\wininit.exe
312  x64  0  csrss.exe     NT AUTHORITY\SYSTEM  C:\Windows\system32\csrss.exe
348  x64  1  winlogon.exe  NT AUTHORITY\SYSTEM  C:\Windows\system32\winlogon.exe
360  x64  1  services.exe  NT AUTHORITY\SYSTEM  C:\Windows\system32\services.exe
400  x64  0  lsass.exe     NT AUTHORITY\SYSTEM  C:\Windows\system32\lsass.exe
A process list as data
Reference: silentbreaksecurity.com
21. Sandbox Detection
Host                A      B    C     D    E    F
Process Count       33     157  30    84   195  34
Process Count/User  8.25   157  7.5   84   195  8.5
User Count          4      1    4     1    1    4
Host Total          59.25  315  54.5  226  480  65.5   (Host Score Average: 168.04)
Sandbox Score       1      0    1     0    0    1
Identifying the sandbox
Reference: silentbreaksecurity.com
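One reading of the table above, sketched in Python (the thresholding rule is assumed from the table, not taken from the SilentBreak code): hosts whose feature totals fall well below the group average look like sparse sandbox VMs rather than busy real workstations.

```python
# Host totals from the table; the average is the slide's stated value.
host_totals = {"A": 59.25, "B": 315, "C": 54.5, "D": 226, "E": 480, "F": 65.5}
threshold = 168.04   # host score average from the slide

# Flag hosts below the average as likely sandboxes (score 1).
sandbox_score = {h: int(t < threshold) for h, t in host_totals.items()}
print(sandbox_score)   # A, C, and F flagged, matching the table
```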
22. Evaluating Data Leaks
PassGAN
Generative Adversarial Network (GAN) to learn the
distribution of real passwords from password leaks
Proof-Pudding
Specifically attacks Proofpoint's e-mail scoring system by
stealing scored datasets and creating a copy-cat model for
abuse
Attackers can data mine too
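The copy-cat idea behind Proof-Pudding can be shown with a toy sketch. The oracle, its scoring behavior, and the single-feature "model" here are invented for illustration; the real attack harvested Proofpoint's scores from e-mail headers and trained a much richer substitute:

```python
def target_score(num_links):
    """Stand-in for a vendor's opaque spam scorer (assumed behavior)."""
    return 90 if num_links > 3 else 10

# Step 1: harvest scored samples by querying the oracle.
dataset = [(n, target_score(n)) for n in range(10)]

# Step 2: fit the simplest possible copy-cat: locate the decision boundary.
boundary = min(n for n, score in dataset if score > 50)

def copycat(num_links):
    """Local substitute model: craft evasive inputs against this, offline."""
    return 90 if num_links >= boundary else 10

# The copy-cat agrees with the target on every harvested point.
assert all(copycat(n) == s for n, s in dataset)
```

Once the substitute agrees with the oracle, the attacker can probe it freely without generating traffic against the real system.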
23. Network Exploitation
Deep Exploit and GyoiThon
Automate recon, fingerprinting, and exploitation via Nmap
and Metasploit
DeathStar
Automates gaining Domain Admin rights using a variety of
techniques using PowerShell Empire
Eyeballer
Identifies “interesting” features in website screenshots
When db_autopwn isn’t enough
26. Software Exploitation
American Fuzzy Lop (AFL)
Powers Google’s “ClusterFuzz” binary fuzzer
Built on genetic algorithms to intelligently fuzz and debug binaries
Joern
Static code analysis for C/C++
Pulsar
Network protocol fuzzer with automatic protocol learning and
simulation capabilities
NexFuzz (Commercial Tool)
Automated web application testing by recording user interactions
Discover new test cases and 0-days
27. Creating AI Tools
Pick a job that’s hard to signature or script
Focus on scaling and automation
Execution should be easy, cheap, and repeatable
Caution: The training is not easy or cheap!
The best AI tool is the one that’s useful to your team
What if I want to make my own?
28. Creating AI Tools
Start with a pipeline
E.g., an Nmap scan finds web ports open
Based on the results of pipeline A, pivot to another
pipeline
E.g., run Gobuster/Dirbuster to enumerate web pages
Eventually these workflows will be able to scale to
provide useful information to the analyst
Populate a dashboard with this information
Start simple and work up
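The pivot from pipeline A to pipeline B can be sketched with stubs. `nmap_scan` and `enumerate_dirs` are hypothetical stand-ins for real Nmap and Gobuster integrations, and the hosts and ports are made up:

```python
def nmap_scan(host):
    """Pipeline A: stubbed port-scan results (assumed data)."""
    return {"10.0.0.5": [22, 80, 443], "10.0.0.9": [25]}.get(host, [])

def enumerate_dirs(host, port):
    """Pipeline B: stubbed web-path enumeration (assumed data)."""
    return [f"http://{host}:{port}/admin", f"http://{host}:{port}/login"]

WEB_PORTS = {80, 443, 8080, 8443}

def run_pipeline(host):
    findings = []
    for port in nmap_scan(host):
        if port in WEB_PORTS:          # result of pipeline A triggers the pivot
            findings.extend(enumerate_dirs(host, port))
    return findings                    # feed this to a dashboard / the analyst

print(run_pipeline("10.0.0.5"))
```

Each stage stays modular, so new pivots (SMB, SSH, cloud metadata) can be bolted on without rewriting the workflow.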
29. Recap
AI allows adversaries to operate at speed and scale
We are seeing the very beginning of what AI can
bring to the offensive security space
While AI is limited, it can perform actions based on
the decisions that have been trained into it
The best ML is simply ML that is useful to your team
How has AI empowered attackers?
31. Adversarial Machine Learning
Model Evasion
Attacking the inference phase
Model Poisoning
Attacking the training phase
Data Leakage
Privacy or decision-making data
Attacking the model
32. Model Testing
White Box
Attacker has full knowledge about the model
Focus on the feature space
Black Box
Attacker treats model like an oracle
Just like in cryptography
Attack Transferability
Adversarial samples can affect multiple models
The adversarial approach
33. Attack Surface
[Diagram: a generic machine learning pipeline mapped onto a network IDS.
Generic: Physical/Digital Object → Observed Event → Input Format (bytes) →
Input Features → Machine Learning Model Decision → Outputs → Actions.
Network IDS: Attack Traffic → TCP Dump → Packet Metadata → Attack
Probability → Block Access. Each stage is a place an attacker can target.]
Where am I vulnerable?
34. Model Evasion
[Diagram: the theoretical input space contains the training and testing
spaces; the adversarial space lies in the region the model never saw.]
Hiding in the blind spots
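Evasion can be made concrete with a tiny linear example (the weights, threshold, and sample are invented): for a linear scorer, the gradient of the score with respect to the input is just the weight vector, so an FGSM-style step is a nudge against the sign of each weight.

```python
weights = [0.8, -0.5, 1.2]            # assumed linear model: score = w . x
threshold = 1.0                        # score >= 1.0 -> flagged "malicious"

def score(x):
    return sum(w * xi for w, xi in zip(weights, x))

def evade(x, eps=0.6):
    """Nudge each feature against the gradient to push the score down."""
    sign = lambda w: (w > 0) - (w < 0)
    return [xi - eps * sign(w) for w, xi in zip(weights, x)]

x = [1.0, 0.2, 0.5]                    # score = 0.8 - 0.1 + 0.6 = 1.3
x_adv = evade(x)
print(score(x) >= threshold)           # True  (caught)
print(score(x_adv) >= threshold)       # False (slips into the blind spot)
```

Against deep models the gradient must be computed (white box) or estimated via queries (black box), but the geometry is the same: small, targeted moves into regions the training data never covered.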
41. Poisoning Defenses
Use longer periods between retraining,
analyzing longer windows of data
Minimize the impact of adversarial training samples
Learning from untrusted data
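One way to "minimize the impact of adversarial training samples" is to trim the extremes of each retraining batch so a handful of poisoned points cannot drag the model. The trimming approach and the numbers below are illustrative assumptions, not from the talk:

```python
import statistics

def trimmed_retrain(samples, trim=0.2):
    """Drop the top/bottom fraction of a batch before fitting the baseline."""
    k = int(len(samples) * trim)
    core = sorted(samples)[k:len(samples) - k] if k else sorted(samples)
    return statistics.mean(core)

clean = [10, 11, 9, 10, 12, 11, 10, 9]
poisoned = clean + [500, 510]          # attacker-injected outliers

print(round(trimmed_retrain(poisoned), 2))   # ~10.67, near the clean mean
```

A plain mean over the poisoned batch would jump past 100; trimming keeps the retrained baseline close to the clean one at the cost of ignoring genuine tail behavior.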
42. Data Leakage
Usually when models are too good to be true
Could leak private or proprietary data
Model theft by competitors
Holes in the data pipes
44. Recap
Model poisoning, evasion, and data leakage
We have already seen adversarial attacks against real-world models
Adversarial examples for one model can fool another
If your model is not robust, it’s not a good model
Adversarial Machine Learning