The document discusses how FalconForce automates detection engineering through infrastructure as code principles. It advocates for representing detections as code that can be version controlled, peer reviewed, automatically tested and deployed. This enables detections to be treated as software where quality is assured through automation and changes are tracked. The document outlines their process for developing detections from hypothesis to reporting and revising through analysis. It also discusses how they represent detections as YAML for reusability across environments and how they perform end to end unit testing of detections against realistic attack simulations.
2. Olaf Hartong
Defensive Specialist @ FalconForce
Detection Engineer and Security Researcher
Built and/or led Security Operations Centers
Threat hunting, IR and Compromise assessments
Former documentary photographer
Father of 2 boys
“I like ” & ATT&CKCON furniture
@olafhartong
github.com/olafhartong
olaf@falconforce.nl
olafhartong.nl / falconforce.nl
3. Why we started automating
What does detection as code mean (to us)
How we document and store our detections
The benefits of automatic deployment
Automatic detection validation
4. 1. Hypothesize
• Develop general theories.
• Use Threat Intelligence, ATT&CK, industry
reports and internal knowledge.
• Develop initial queries.
• Determine timespan.
2. Investigate & research
• Find ways a technique can be executed:
scripts, samples, procedures.
• Determine what data you will need.
• Investigate what it looks like when the
technique has been executed.
• Develop initial validation script options.
3. Develop analytics
• Build a set of analytics.
• Cast a wide net, then narrow it.
• Be efficient.
4. Analyze & implement
• Review results.
• Enrich where possible.
• Tune the query if needed, keep it resilient.
• Implement analytics in production.
• Implement validation script.
5. Report & revise
• Report to IR/TI/Management.
• Measure efficiency.
• Measure scope.
6. Who changed rule x and what changed
Will we break the detection with this modification?
When was rule x implemented and when was it last changed?
Can we assure the quality of a detection and its documentation?
Is my detection logic still working as expected ?
8. Have a backlog and prioritize this
Write dedicated documentation per detection
Test and review all changes
Track progress and have standups
Plan and organize your maintenance
Follow an Agile process and workflow
11. Choose an easy-to-maintain, machine-readable format, like YAML
Make sure to be as expressive as possible and plan for code reuse
Reuse lists, query components and lookup tables.
Create a schema for validation and quality of life
Design for the ability to deploy to multiple environments
Use a simple language and plan for reusability
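The idea of a machine-readable detection format with schema validation can be sketched as follows. This is a minimal illustration, not FalconForce's actual schema: the field names (`name`, `query`, `mitre_techniques`, etc.) are assumptions, and the detection is shown as a Python dict as it would look after parsing the YAML file.

```python
# Minimal sketch of schema validation for a detection definition.
# The field names below are illustrative, not FalconForce's actual schema.

REQUIRED_FIELDS = {"name", "description", "query", "mitre_techniques"}

def validate_detection(detection: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the definition is valid."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - detection.keys())]
    if not isinstance(detection.get("mitre_techniques", []), list):
        problems.append("mitre_techniques must be a list")
    return problems

# A detection as it might look after parsing the YAML file into a dict.
detection = {
    "name": "PowerShell Remoting",
    "description": "Detects PowerShell remoting via wsmprovhost.exe",
    "query": "DeviceProcessEvents | where InitiatingProcessFileName =~ 'wsmprovhost.exe'",
    "mitre_techniques": ["T1021.006"],
}

print(validate_detection(detection))  # [] -> valid
```

Running such a check in the pipeline rejects malformed definitions before they ever reach a deployment step.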
14. Example query:
let timeframe = {{ timeframe|default('1h') }};
let initiators = {{ yaml2kusto(initiators | default(['wsmprovhost.exe'])) }};
let PowershellRemotingEvents = DeviceProcessEvents
| where Timestamp >= ago(timeframe)
| where InitiatingProcessFileName in~ (initiators)
// Environment-specific filter.
{{ post_filter_1 }};
// End environment-specific filter.
{% if (exclude_servers | default(True)) %}
// Filter out servers since Powershell remoting is common on servers.
DeviceInfo
| where DeviceName in (( PowershellRemotingEvents | project DeviceName ))
| where not(isempty(OSPlatform))
| summarize arg_max(Timestamp, OSPlatform) by DeviceName
| join kind=rightouter PowershellRemotingEvents on DeviceName
| where not(OSPlatform contains "server")
{% else %}
PowershellRemotingEvents
{% endif %}
Environment 1:
exclude_servers: true
post_filter_1: |
| where not(DeviceName startswith 'br')
| where not(FileName contains "test")
Environment 2:
timeframe: 4h
exclude_servers: false
initiators:
- wsmprovhost.exe
- powershell.exe
Query environment 1:
let timeframe = 1h;
let initiators = "wsmprovhost.exe";
let PowershellRemotingEvents = DeviceProcessEvents
| where Timestamp >= ago(timeframe)
| where InitiatingProcessFileName in~ (initiators)
// Environment-specific filter.
| where not(DeviceName startswith 'br')
| where not(FileName contains "test");
// End environment-specific filter.
// Filter out servers since Powershell remoting is common on servers.
DeviceInfo
| where DeviceName in (( PowershellRemotingEvents | project DeviceName ))
| where not(isempty(OSPlatform))
| summarize arg_max(Timestamp, OSPlatform) by DeviceName
| join kind=rightouter PowershellRemotingEvents on DeviceName
| where not(OSPlatform contains "server")
Query environment 2:
let timeframe = 4h;
let initiators = dynamic(['wsmprovhost.exe','powershell.exe']);
let PowershellRemotingEvents = DeviceProcessEvents
| where Timestamp >= ago(timeframe)
| where InitiatingProcessFileName in~ (initiators)
// Environment-specific filter.
;
// End environment-specific filter.
PowershellRemotingEvents
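The per-environment rendering shown above can be mimicked with a small Python sketch: merge environment overrides with defaults, then assemble the final KQL. The real deployments use Jinja2 templates; this simplified stand-in (including the `yaml2kusto` helper, reimplemented here as an assumption about its behavior) only mirrors the mechanics of defaults and list expansion.

```python
# Simplified stand-in for the Jinja2 rendering step: merge per-environment
# overrides with defaults and assemble the final KQL query.

DEFAULTS = {
    "timeframe": "1h",
    "initiators": ["wsmprovhost.exe"],
    "post_filter_1": "",
    "exclude_servers": True,
}

def yaml2kusto(values: list[str]) -> str:
    # Render a Python/YAML list as a KQL dynamic literal.
    return "dynamic([" + ",".join(f"'{v}'" for v in values) + "])"

def render_query(env: dict) -> str:
    cfg = {**DEFAULTS, **env}
    lines = [
        f"let timeframe = {cfg['timeframe']};",
        f"let initiators = {yaml2kusto(cfg['initiators'])};",
        "let PowershellRemotingEvents = DeviceProcessEvents",
        "| where Timestamp >= ago(timeframe)",
        "| where InitiatingProcessFileName in~ (initiators)",
        cfg["post_filter_1"].rstrip() + ";",
    ]
    if cfg["exclude_servers"]:
        lines.append("// ... server-exclusion join as in the template ...")
    else:
        lines.append("PowershellRemotingEvents")
    return "\n".join(line for line in lines if line)

env2 = {"timeframe": "4h", "exclude_servers": False,
        "initiators": ["wsmprovhost.exe", "powershell.exe"]}
print(render_query(env2))
```

One template plus one small config file per environment yields N deployable queries, which is what makes the single source of truth practical.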
15. Track all changes via commits
Allows for (enforced) peer review on merge
Simple to roll back to a previous version
Have the single source of truth
Allows pipeline actions for automatic checks and deployments
Version controlled with peer review options
22. This enables us to do offline schema and language-based query validation.
Additionally, it matches fields, tables and entities in the query output to what is
mapped in the documentation, for instance to ensure proper entity mapping.
The language server:
Parses all Microsoft documentation and updates the schema
Allows for custom parsers and watchlists
Emulates the use of functions in Defender, like FileProfile
Available for free >> https://kql.falconforce.blue/api/kqlanalyzer
https://github.com/FalconForceTeam/KQLAnalyzer
24. Now that we have all these attributes, we can match list contents. When one of the rule's components is
not present in the ATT&CK technique's component list, its alignment is flagged as false.
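The flagging step amounts to a set comparison between the data components a rule uses and the components listed on the mapped ATT&CK technique. A minimal sketch, with illustrative component lists (the real ones would come from the rule metadata and the ATT&CK STIX data):

```python
# Sketch of the alignment check: flag a rule's data components that are not
# listed on the mapped ATT&CK technique. Component lists here are illustrative.

def check_alignment(rule_components: set[str], technique_components: set[str]) -> dict[str, bool]:
    """Map each rule component to True (aligned) or False (missing on the technique)."""
    return {c: c in technique_components for c in sorted(rule_components)}

technique = {"Process Creation", "Command Execution"}   # e.g. from ATT&CK STIX data
rule = {"Process Creation", "Logon Session Creation"}   # components the rule relies on

print(check_alignment(rule, technique))
# 'Logon Session Creation' is flagged False: a candidate addition to the technique.
```

Every `False` entry is either a mapping error in the rule or, as in the next slides, a data component worth proposing to the ATT&CK team.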
25. User Account: User Account Authentication
An attempt by a user to gain access to a network or computing resource, often
by providing credentials (ex: Windows EID 4625 or /var/log/auth.log)
Logon Session: Logon Session Creation
Initial construction of a new user logon session (ex: Windows EID 4624, /var/log/utmp, or /var/log/wtmp)
26. Based on this we identified 164 data components which can be added to existing techniques. This is based
on the components that we use in detection rules for those techniques.
There were even some techniques that did not have any data source
added to them yet.
These results will need to be shared with the ATT&CK team with
some additional context and reasoning to understand our mapping.
Of the 30 added components, the top 15 that were added to multiple techniques are:
28. End-to-end testing of use-cases: from executing a real(istic) attack
until the attack results in an alert.
Ideally each use-case has corresponding test case(s) that can trigger the use-case in an
automated manner.
By testing end-to-end we validate all steps involved, for example:
EDR logs the expected events when the attack is performed
The format of the logs is consistent
Logs are properly ingested into Sentinel / MDE
There is no out-of-the-box rule that makes ours redundant
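The end-to-end loop can be sketched as: execute the (simulated) attack, then poll the SIEM until the expected alert appears or a timeout expires. In this sketch `run_attack` and `query_alerts` are stand-ins for the breach-and-attack tool and the Sentinel/MDE alert API; both are assumptions, injected as callables so the loop itself can be tested with fakes.

```python
# Sketch of the end-to-end test loop: run a (simulated) attack, then poll for
# the expected alert. run_attack/query_alerts are hypothetical stand-ins for
# the breach-and-attack tool and the Sentinel/MDE API.
import time

def run_e2e_test(test_case: dict, run_attack, query_alerts,
                 timeout_s: float = 300, poll_s: float = 1) -> bool:
    """Return True if the expected alert fires within the timeout."""
    run_attack(test_case["attack_script"])
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if test_case["expected_alert"] in query_alerts():
            return True
        time.sleep(poll_s)
    return False

# Demo with fakes: the "attack" immediately produces the expected alert.
alerts = []
ok = run_e2e_test(
    {"attack_script": "powershell_remoting.ps1", "expected_alert": "PowerShell Remoting"},
    run_attack=lambda script: alerts.append("PowerShell Remoting"),
    query_alerts=lambda: alerts,
    timeout_s=2,
)
print(ok)  # True
```

A timeout is a failed test, which catches every breakage listed above (missing EDR events, changed log format, broken ingestion) without needing to know which layer failed first.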
29. [Flow diagram: a pipeline runs regularly, executing attack scripts via a breach & attack tool against the target(s); Sentinel / MDE alerts to a dashboard, which reports status to Slack and leads to improvement of the detection rule.]
30. Where possible, perform an actual attack
Test whether the attack was executed successfully
Use variables for data that can differ per environment
Focus on testing the EDR component, not the AV part
Custom YAML format that is an extension of the format used by
Atomic Red Team
Unit testing based on realistic attacks
35. Provide quality control by automation and review
Ensures a single source of truth
Allows for automated deployment, even across multiple environments
Self-documenting
Measure operating quality through automated validation testing.