The Data Dilemma: Bridging Offensive Operations and Machine Learning
Agenda
1. Introduction
2. Red Team Challenges
3. The Offensive Data Landscape
4. Offensive ML Challenges
5. Nemesis
6. Seeing the Forest for the Trees
#whoami
Will Schroeder (@harmj0y), @SpecterOps
Background

1. Introduction
Red Team Operations
https://redteam.guide/docs/Concepts/red-vs-pen-vs-vuln
Offensive Machine Learning
“Application of machine learning to offensive problems.”
- Will Pearce (@moo_hax / @dreadnode)
Examples: automating phishing, sandbox detection, RAG on stolen docs, file share mining, better password guessing, EDR evasion, tradecraft suggestions, evading WAFs, password detection
Adversarial Machine Learning
“Subdiscipline that specifically attacks ML algorithms.”
- Will Pearce (@moo_hax / @dreadnode)
Attack classes: extraction, evasion, inversion, inference, poisoning
2. Red Team Challenges
Red Team Challenges
• Tradecraft is difficult to scale
• Offensive data and tooling is not unified
• File and tool output triage is tedious and inconsistent
Red Team Challenges 1: Manual Triage
• File/data triage is one of the most common tasks in offensive operations, but it’s (usually) been heavily manual
• Automated workflows for this type of task just haven’t existed (until recently); a minimal sketch of such a workflow follows
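To make “automated triage” concrete, a minimal sketch that buckets downloaded loot by magic bytes and flags review candidates. The extension list, magic values, and the ./loot path are illustrative assumptions, not any particular tool’s logic:

```python
# Minimal sketch of automating "level 1" file triage over a directory of
# downloaded loot: identify file types by magic bytes and flag
# likely-interesting files for a human. All constants are illustrative.
from pathlib import Path

INTERESTING_EXT = {".config", ".ps1", ".kdbx", ".pem", ".rdp", ".vhd"}
MAGIC = {b"MZ": "pe", b"PK": "zip/office", b"%PDF": "pdf", b"SQLite": "sqlite"}

def sniff(path: Path) -> str:
    """Identify a file type from its first bytes."""
    with path.open("rb") as f:
        head = f.read(8)
    for magic, kind in MAGIC.items():
        if head.startswith(magic):
            return kind
    return "unknown"

def triage(root: str):
    """Yield one triage record per file under root."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        yield {
            "path": str(path),
            "type": sniff(path),
            "size": path.stat().st_size,
            "flag_for_review": path.suffix.lower() in INTERESTING_EXT,
        }

for record in triage("./loot"):
    if record["flag_for_review"]:
        print(record)
```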
Red Team Story Time
Repeat x 100,000
Red Team Challenges 2: Tooling Issues
• Offensive tools weren’t built to interop
• Tooling to get the data we want might not even exist on the offensive side!
• We (now) often have to fight defensive products to get the data we want…
Red Team Challenges 3: Data Issues
• Offensive data (like tools) is mostly unstructured
• Offensive data (like tools) often also doesn’t interop well
• We also often have heavy data sensitivity/retention issues
Red Team Challenges 4: Scaling Tradecraft
• All operators are not equivalently skilled!
• “Scaling tradecraft” today == writing articles in Confluence/Notion/etc.
• We don’t even have a way to effectively scale tradecraft across teams, much less the industry as a whole
3. The Offensive Data Landscape
A Defensive View on Data
• It is significantly easier to gather large data sets on the defensive side than the offensive side
• Outside of sharing malware, most organizations keep this data to themselves - this produces a lot of asymmetry
• Sidenote: metadata vs full data…
Differences in Scale
• Defense deals with data scales several orders of magnitude larger than offense
• Because of the scale difference + the base rate issue, defense has to be right nearly 100% of the time!
• Offense only has to be mostly right most/some of the time, and is significantly more tolerant of false positives and negatives
The Base Rate Fallacy
https://en.wikipedia.org/wiki/Base_rate_fallacy
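A quick worked example makes the asymmetry concrete; the numbers (1 malicious event per 100,000, 99% detection rate, 1% false-positive rate) are assumptions for illustration:

```python
# Worked base-rate example with assumed, illustrative numbers:
# P(malicious) = 1e-5, detector TPR = 0.99, FPR = 0.01.
p_malicious = 1e-5
tpr, fpr = 0.99, 0.01

# Bayes' rule: P(malicious | alert) = TPR*P(m) / (TPR*P(m) + FPR*P(not m))
p_alert = tpr * p_malicious + fpr * (1 - p_malicious)
p_malicious_given_alert = tpr * p_malicious / p_alert

print(f"{p_malicious_given_alert:.4%}")  # ~0.1%: almost every alert is false
```

Even a detector this good produces roughly a thousand false alerts for every true one at defensive scale; offense rarely faces a denominator like that.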
Human-Focused Tooling

Offensive Data Collection Challenges

Batch vs Incremental Ingestion
• We want to collect data from multiple sources like C2 + raw data + etc.
• This presents a significant modeling challenge, as you can’t know when data is “complete”
• Abstractions built from multiple sources

Information Compression
• Historically, offensive tools have done a lot of processing on the host itself and returned relevant “interesting” information
• Since all data is not returned, the information is essentially “compressed” and data is lost
• Collecting/analyzing the raw data instead supports automation and researching new attack paths
• Example: Windows services are derived from registry keys (see the sketch below)
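To ground that example: Windows builds its service list from values under HKLM\SYSTEM\CurrentControlSet\Services, so collecting the raw keys (rather than an on-host tool’s filtered summary) preserves everything for offline analysis. A minimal, Windows-only sketch using the standard-library winreg module; the “keep everything” policy is the point, not the specific code:

```python
# Minimal sketch: enumerate raw Windows service definitions from the registry
# (HKLM\SYSTEM\CurrentControlSet\Services) instead of returning a pre-filtered
# "interesting services" summary on-host. Windows-only; stdlib winreg.
import winreg

SERVICES_KEY = r"SYSTEM\CurrentControlSet\Services"

def dump_raw_services() -> dict:
    services = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, SERVICES_KEY) as root:
        index = 0
        while True:
            try:
                name = winreg.EnumKey(root, index)
            except OSError:  # no more subkeys
                break
            index += 1
            values = {}
            with winreg.OpenKey(root, name) as svc:
                i = 0
                while True:
                    try:
                        value_name, data, _type = winreg.EnumValue(svc, i)
                    except OSError:  # no more values
                        break
                    values[value_name] = data
                    i += 1
            services[name] = values  # keep everything; triage happens offline
    return services

if __name__ == "__main__":
    print(f"collected {len(dump_raw_services())} raw service definitions")
```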
Offensive Data Modeling Challenges
• Offensive-focused data models exist solely in isolation (BloodHound, etc.)
• A unified offensive data model is very, very hard, partially due to tool diversity and lack of interop
• We’re (slowly) trying to work towards a unified model with Nemesis
4. Offensive Machine Learning Challenges
Lack of Dual-Domain Experts
• There are very few true experts versed in both information security and machine learning
• Except these two (and maybe a few others): @dreadnode
Lack of Relevant Data Sets
Existing public security data sets have traditionally been… largely terrible.
The class imbalance problem, the privacy problem, and the defensive close-hold problem all help explain why good data hasn’t been released.
No one releases high-quality, timely, realistic security data!
…but why would they?
The Privacy Problem
Would you trust OpenAI or another provider with a client’s domain admin password?
Based on client contracts/compliance, are you even allowed to if you wanted to?
Revelation: Synthetic Data
Good-quality, labeled data has almost always been the most common factor holding us back offensively.
Large state-of-the-art models can be used to generate high-quality synthetic data that we can use to fine-tune smaller models (one reason local models have become so good!); a sketch of this loop follows below.
However, the distribution of generated synthetic data can, at least in some cases, differ from the distribution of the real-world data it’s mimicking, so this isn’t a silver bullet.
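A minimal sketch of that loop, assuming the OpenAI Python client (>=1.0); the model name, prompt, and label scheme are illustrative placeholders, and (per the privacy problem above) nothing sensitive should go into such prompts:

```python
# Minimal sketch: use a large model to generate labeled synthetic examples,
# written out as JSONL for fine-tuning a smaller local model. Model name,
# prompt, and label set are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Generate one realistic-looking (but entirely fake) Windows service "
    "command line, then label it 'suspicious' or 'benign'. Respond as JSON: "
    '{"text": "...", "label": "..."}'
)

def generate_examples(n: int, out_path: str = "synthetic.jsonl") -> None:
    with open(out_path, "w") as f:
        for _ in range(n):
            resp = client.chat.completions.create(
                model="gpt-4o",  # assumption: any capable chat model works
                messages=[{"role": "user", "content": PROMPT}],
                response_format={"type": "json_object"},
                temperature=1.0,  # diversity matters for synthetic data
            )
            record = json.loads(resp.choices[0].message.content)
            f.write(json.dumps(record) + "\n")

generate_examples(100)
```

Deduplicating and spot-checking the generated set helps, but it doesn’t eliminate the distribution-shift caveat above.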
Why We Can’t Release Models
This all comes down to the inversion class of adversarial ML attacks: a model fine-tuned on sensitive operational data can be coaxed into regurgitating that data, as the sketch below illustrates.
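A hedged sketch of the concern, using the Hugging Face transformers API; the model name and prompt are hypothetical placeholders, and the point is only that greedy decoding can replay memorized fine-tuning data:

```python
# Minimal sketch: probing a (hypothetical) fine-tuned causal LM for memorized
# training data, the core worry behind inversion/extraction attacks.
# "our-internal-finetune" is a placeholder; no such public model is implied.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "our-internal-finetune"  # hypothetical fine-tuned model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A prefix that plausibly appeared in the fine-tuning data.
prompt = "Domain admin credentials for CORP:"
inputs = tok(prompt, return_tensors="pt")

# Greedy decoding maximizes the chance of replaying memorized sequences.
out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```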
Lack of Security-Focused Models
• There are only a handful of cybersecurity-focused models (e.g., cyBERT)
• This is starting to change with local model fine-tunes on Hugging Face…
5. Nemesis

Vision: a centralized data processing platform that ingests, enriches, and performs analytics on offensive security assessment data.
Example Enrichment Flow
Example Enriched File
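The deck shows screenshots at this point; as a stand-in, here is a minimal sketch of what one enrichment stage might look like. The stage names, tags, and data shapes are hypothetical; this is not Nemesis’s actual API:

```python
# Minimal sketch of a file-enrichment pipeline, in the spirit of the flow
# pictured on the slide. Stage names, tags, and data shapes are hypothetical;
# this is not Nemesis's actual API.
import hashlib
import re
from dataclasses import dataclass, field

@dataclass
class FileEvent:
    path: str
    data: bytes
    enrichments: dict = field(default_factory=dict)

CRED_PATTERN = re.compile(rb"(?i)(password|passwd|pwd)\s*[=:]\s*\S+")

def enrich_hashes(event: FileEvent) -> FileEvent:
    event.enrichments["sha256"] = hashlib.sha256(event.data).hexdigest()
    return event

def enrich_credentials(event: FileEvent) -> FileEvent:
    hits = CRED_PATTERN.findall(event.data)
    event.enrichments["possible_credentials"] = len(hits)
    return event

PIPELINE = [enrich_hashes, enrich_credentials]

def process(event: FileEvent) -> FileEvent:
    for stage in PIPELINE:  # each stage adds enrichments to the event
        event = stage(event)
    return event

evt = process(FileEvent("web.config", b"connectionString password=hunter2"))
print(evt.enrichments)
```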
Offensive Analysis
We want to automate away level 1 analysis (the boring/tedious tasks) and perform as much “offline” analysis on raw data as possible.
This approach permits analyzing relationships between (previously) disparate data sources to accomplish things that used to require manual analysis.
A goal is to provide operator feedback and suggestions based on collected/analyzed data (this is where LLM integration can come in; see the sketch below!).
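One way that integration could look, pointed at a locally hosted, OpenAI-compatible endpoint; the base URL, model name, and finding shapes are assumptions, not how Nemesis actually implements it, and a local model sidesteps the privacy problem discussed earlier:

```python
# Minimal sketch: ask a locally hosted LLM for next-step suggestions based on
# already-enriched findings. Endpoint, model name, and finding shape are
# hypothetical; a local model avoids shipping client data to a third party.
import json
from openai import OpenAI

# Assumption: a local vLLM/llama.cpp-style server exposing the OpenAI API.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

def suggest_next_steps(findings: list[dict]) -> str:
    summary = json.dumps(findings[:20], indent=2)  # cap the context we send
    resp = client.chat.completions.create(
        model="local-model",  # placeholder model identifier
        messages=[
            {"role": "system", "content": "You advise red team operators."},
            {
                "role": "user",
                "content": "Given these enriched findings, suggest "
                           f"prioritized next steps:\n{summary}",
            },
        ],
    )
    return resp.choices[0].message.content

print(suggest_next_steps([{"path": "web.config", "possible_credentials": 1}]))
```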
Demo
6. Seeing the Forest for the Trees
But why does this matter…?

It’s not just what Nemesis does, it’s what it will allow us to do!

This is a (possible) paradigm shift for red teams towards offensive data unification and off-host data processing that offers numerous advantages.
Advantages
• Centrally update operator analysis workflows
• Enrichments/analytics added exist in perpetuity for ALL operators on ALL operations
• Offline processing allows for retroactive analysis of data
• Minimizes footprint of offensive tooling on endpoints
• Collected structured/unstructured data can guide future research and automation
Thank you!
Questions?
https://specterops.io/
http://www.github.com/SpecterOps/Nemesis
