Measuring the speed of the
Red Queen’s Race
Richard Harang and Felipe Ducau
Sophos Data Science Team
Who we are
• Rich Harang
@rharang; richard.harang@sophos.com
o Research Director at Sophos – PhD UCSB; formerly scientist at U.S. Army Research
Laboratory; 8 years working at the intersection of machine learning, security, and
privacy
• Felipe Ducau
@fel_d; felipe.ducau@sophos.com
o Principal Data Scientist at Sophos – MS NYU Center for Data Science; specializes in
design and evaluation of deep learning models
Data Science @ Sophos
You?
The talk in three bullets
• The threat landscape is constantly changing; detection strategies decay
• Knowing something about how fast and in what way the threat landscape
is changing lets us plan for the future
• Machine learning detection strategies decay in interesting ways that tell us
useful things about these changes
Important caveats
• A lot of details are omitted for time
• We’re data scientists first and foremost, so…
o Advance apologies for any mistakes
o Our conclusions are machine-learning centric
"Now here, you see, it takes all the running you can do to
keep in the same place. If you want to get somewhere else,
you must run at least twice as fast as that!"
(Lewis Carroll, 1871)
Faster and faster…
Vandalism
1986 – Brain virus
1988 – Morris worm
1990 – 1260 polymorphic
virus
1991 – Norton Antivirus,
EICAR founded, antivirus
industry starts in earnest
1995 – Concept virus
RATs, loggers, bots
2002 – Beast RAT
2003 – Blaster worm/DDoS
2004 – MyDoom worm/DDoS
2004 – Cabir: first mobile phone
worm
2004 – Nuclear RAT
2005 – Bifrost RAT
2008-2009 – Conficker variants
Crimeware, weapons
2010 – Koobface
2011 – Duqu
2012 – Flame, Shamoon
2013 – Cryptolocker, ZeuS
2014 – Regin
2016 – Locky, Tinba, Mirai
2017 – WannaCry, Petya
Two (static) detection paradigms
Signatures
• Highly specific, often to a single
family or variant
• Often straightforward to evade
• Low false positive rate
• Often fail on new malware
Machine learning
• Looks for statistical patterns that
suggest “this is a malicious
program”
• Evasive techniques not yet well
developed
• Higher false positive rate
• Often does quite well on new
malware
Complementary, not mutually exclusive, approaches
A crash primer on deep learning
A toy problem
Each dot is a PE file
Benign files
Malicious files
What we want
Benign regions
Malware region
Training the model
Entropy
String length
Malware score
Compare with
ground truth
Error-correct all weights
…and repeat until it works
Randomly
sample the
training data
Ground Truth
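The loop on this slide can be sketched in a few lines. This is a minimal illustration, not the model from the talk: it assumes a single logistic unit instead of a deep network, and the six toy samples, their scaling to [0, 1], and the names `malware_score` and `train` are all invented for the example.

```python
import math
import random

def malware_score(weights, bias, features):
    """Logistic score in [0, 1]; higher means 'more likely malicious'."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=200, lr=0.5, seed=0):
    """Stochastic gradient descent on one logistic unit (an assumed stand-in)."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        # Randomly sample the training data: visit it in shuffled order.
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            score = malware_score(weights, bias, samples[i])
            error = score - labels[i]  # compare with ground truth
            # Error-correct all weights in proportion to their input.
            weights = [w - lr * error * x for w, x in zip(weights, samples[i])]
            bias -= lr * error
    return weights, bias

# Hypothetical toy data: (entropy, string length), both scaled to [0, 1].
samples = [(0.9, 0.2), (0.8, 0.1), (0.85, 0.3),
           (0.2, 0.7), (0.3, 0.9), (0.1, 0.8)]
labels = [1, 1, 1, 0, 0, 0]  # 1 = malicious, 0 = benign

weights, bias = train(samples, labels)
predictions = [round(malware_score(weights, bias, s)) for s in samples]
```

On this separable toy set the loop ends with every training sample on the correct side of the boundary; real PE features are neither two-dimensional nor this clean.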
Recipe for an amazing ML classifier
A lot of
training
time
But.
…and six weeks later, we have this.
Our model performance begins to decay
Some clusters have significant errors
Some clusters still mostly correct
Machine learning models decay in informative ways
• Decay in performance happens
because the data changes
• More decay means larger changes
in data
Model confidence
Alice replied: 'what's the answer?'
'I haven't the slightest idea,' said the Hatter.
(Lewis Carroll, 1865)
Intuition: “borderline” files are likely misclassified
Intuition: “distant” files are likely misclassified
Do it automatically
• “Wiggle the lines” a bit
• Do the resulting classifications
agree or disagree on a region?
• Amount of agreement =
“Confidence”
https://arxiv.org/pdf/1609.02226.pdf
Fitted Learning: Models with Awareness of their Limits
Navid Kardan, Kenneth O. Stanley
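One way to picture "wiggle the lines" is to perturb a classifier many times and measure how often the perturbed copies agree. The sketch below does only that. It does not reproduce the method of the cited Fitted Learning paper; the linear classifier, the Gaussian noise, and the `n_wiggles` and `scale` values are assumptions chosen for illustration.

```python
import random

def classify(weights, bias, features):
    """Linear decision rule: 1 = malicious, 0 = benign."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if z > 0 else 0

def confidence(weights, bias, features, n_wiggles=200, scale=0.3, seed=0):
    """Fraction of perturbed classifiers that side with the majority vote."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_wiggles):
        # "Wiggle the line": add small random noise to every parameter.
        wiggled_w = [w + rng.gauss(0.0, scale) for w in weights]
        wiggled_b = bias + rng.gauss(0.0, scale)
        votes += classify(wiggled_w, wiggled_b, features)
    agreeing = max(votes, n_wiggles - votes)
    return agreeing / n_wiggles

# Hypothetical classifier whose boundary is the line x0 == x1.
weights, bias = [4.0, -4.0], 0.0
far_from_boundary = confidence(weights, bias, (0.9, 0.1))
near_boundary = confidence(weights, bias, (0.5, 0.5))
```

A point well inside one region keeps its label under every wiggle, so agreement is close to 1. A point sitting on the boundary flips roughly half the time, so agreement falls toward 0.5.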
Do it automatically
Key takeaway:
• High confidence ≈ Model has
seen data like this before!
• Low confidence ≈ This data
“looks new”!
Looking at historical data with confidence
"It's a poor sort of memory that only works backwards," the Queen
remarked.
(Lewis Carroll, 1871)
Our model
Go from this… To this…
1024 Inputs
512 Nodes
512 Nodes
512 Nodes
512 Nodes
Output
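For a sense of scale, the layer sizes listed above can be turned into a parameter count. The slide gives only the sizes, so this assumes plain fully connected layers with bias terms and a single output unit; any other layer type would change the total.

```python
# Layer widths from the slide; the single output unit is an assumption.
layer_sizes = [1024, 512, 512, 512, 512, 1]

def dense_parameter_count(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

total_parameters = dense_parameter_count(layer_sizes)
print(total_parameters)  # 1313281
```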
Using confidence to examine changes in malware distribution
• Collect data for each month of 2017 (3M samples, unique sha256 values)
• Train a model on one month (e.g. January)
• Evaluate it on data from all future months and record the number of
high/low confidence samples
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Jan
model
(etc.)
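The procedure on this slide amounts to one loop per training month. The sketch below assumes the caller supplies a training function and a confidence function; the `low` and `high` thresholds, the one-dimensional stand-in data, and the helper names are hypothetical and not taken from the talk.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def confidence_counts(train_month, data_by_month, train_fn, confidence_fn,
                      low=0.6, high=0.9):
    """Train on one month, then count low/high-confidence samples in later months."""
    model = train_fn(data_by_month[train_month])
    first_future = MONTHS.index(train_month) + 1
    counts = {}
    for month in MONTHS[first_future:]:
        scores = [confidence_fn(model, s) for s in data_by_month[month]]
        counts[month] = {
            "low": sum(s < low for s in scores),
            "high": sum(s >= high for s in scores),
        }
    return counts

# Stand-ins so the sketch runs: the "model" is just the training mean, and
# confidence shrinks as a sample moves away from that mean.
def train_fn(samples):
    return sum(samples) / len(samples)

def confidence_fn(model, sample):
    return 1.0 / (1.0 + abs(sample - model))

# Hypothetical data that drifts by 0.1 per month.
data_by_month = {m: [0.1 * i + d for d in (-0.05, 0.0, 0.05)]
                 for i, m in enumerate(MONTHS)}

jan_counts = confidence_counts("Jan", data_by_month, train_fn, confidence_fn)
```

Repeating the call with "Feb", "Mar", and so on gives one series of counts per training month, which is the raw material for the trend plots that follow.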
Look at change in high/low confidence samples
• Train January model; count
low-confidence samples for
following months
• And for February
• And so on
• Remember:
o Low-confidence = “Looks new”
Same thing for high confidence samples
• Remember:
o High confidence = “Looks like original
data”
Both forms of decay show noisy but clear trends
Estimate the rates with a best-fit line
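A best-fit line here means ordinary least squares, with the slope read as the rate of change per month. The sketch below is generic; the five data points are made up to show the mechanics and are not measurements from the talk.

```python
def best_fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up illustration: fraction of low-confidence samples vs. months since training.
months_since_training = [1, 2, 3, 4, 5]
low_confidence_fraction = [0.010, 0.014, 0.016, 0.021, 0.024]

slope, intercept = best_fit_line(months_since_training, low_confidence_fraction)
```

With noisy monthly counts, a single slope per series is a coarse summary, which is consistent with the large error bars the talk mentions later.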
Examining changes within a single family
“I wonder if I've been changed in the night? Let me think. Was I the
same when I got up this morning?”
(Lewis Carroll, 1865)
Confidence over time for individual families
Collection of
WannaCry/HWorld
samples
January 2017
Training Data
Deep learning
model
Number of low
confidence samples
February 2017
Training Data
March 2017
Training Data
Number of high
confidence samples
Confidence over time
for individual families
• Proportion of samples
for family scoring as
high/low confidence vs
model month
Samples first appear in training data
WannaCry/high confidence:
dips as low as 70% after
appearing in training data
Hworld/high confidence:
Never less than 84% after
appearing in training data
WannaCry/low confidence:
9% down to 0.2% after
appearing in training data
Hworld/low confidence:
1.3% down to 0.0008% after
appearing in training data
56% of WannaCry samples
high-confidence before first
appearance in training data;
99.98% detection rate in
this subset
Distance measures from training data
• Large distances = larger change in
statistical properties of the sample
o New family? Significant variant of
existing one?
• Look at distances from one month
to a later one for samples from the
same family
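The per-family comparison can be sketched as a nearest-neighbour search restricted to one family. The slides do not say which distance or feature space was used, so Euclidean distance on raw vectors is an assumption, and the family names and coordinates below are invented.

```python
import math

def nearest_distance(sample, reference):
    """Euclidean distance from `sample` to its closest point in `reference`."""
    return min(math.dist(sample, other) for other in reference)

def family_drift(earlier, later, family):
    """For each later sample of `family`, distance to its closest earlier member."""
    reference = [vec for fam, vec in earlier if fam == family]
    return [nearest_distance(vec, reference)
            for fam, vec in later if fam == family]

# Hypothetical (family, feature vector) pairs for two months.
january = [("FamilyA", (0.0, 0.0)), ("FamilyA", (1.0, 0.0)),
           ("FamilyB", (5.0, 5.0))]
may = [("FamilyA", (0.0, 0.5)), ("FamilyA", (4.0, 4.0))]

drift = family_drift(january, may, "FamilyA")
print(drift)  # [0.5, 5.0]
```

The first May sample sits next to a January member of the same family; the second is far from every January member, which is the kind of gap that might indicate a significant variant.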
January 2017 to May 2017
• Changes in the feature
representation of samples lead to
changes in distance
Distances to closest family member in training data
“This thing, what is it in itself, in its own constitution?”
(Marcus Aurelius, Meditations)
Distance and new family detection
Distance measures from training data
• Distance to the nearest point of
any type in the training data
• Examine against model confidence
• Don’t need labels!
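Combining the two signals needs no labels at all: compute each new sample's distance to the nearest training point of any type, then keep the samples that are both far away and low in confidence. The sketch below assumes Euclidean distance; the thresholds `min_distance` and `max_confidence` and the example vectors are hypothetical.

```python
import math

def flag_candidates(new_samples, training_vectors, confidences,
                    min_distance=2.0, max_confidence=0.6):
    """Return (index, distance) for samples that are far from all training
    points and also scored with low confidence by the model."""
    flagged = []
    for index, vector in enumerate(new_samples):
        distance = min(math.dist(vector, t) for t in training_vectors)
        if distance >= min_distance and confidences[index] <= max_confidence:
            flagged.append((index, distance))
    return flagged

# Invented feature vectors; no family or benign/malicious labels are used.
january = [(0.0, 0.0), (1.0, 1.0)]
july = [(0.5, 0.5), (4.0, 5.0), (6.0, 1.0)]
july_confidence = [0.95, 0.40, 0.90]

flagged = flag_candidates(july, january, july_confidence)
print(flagged)  # [(1, 5.0)]
```

Only the second July sample is flagged: the third is just as far from the January data but the model is still confident about it.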
Distances – July data
to nearest point in
January data
Drill into clusters potentially worth examining
further.
• Mal/Behav-238 – 1468 samples
• Mal/VB-ZS – 7236 samples
• Troj/Inject-CMP – 6426 samples
• Mal/Generic-S – 318 samples
• ICLoader PUA – 124 samples
… And several clusters of apparently benign
samples
“Begin at the beginning," the King said, very gravely, "and go
on till you come to the end: then stop.”
(Lewis Carroll, 1865)
Conclusion
• ML models decay in interesting ways; this makes them
useful as analytic tools, not just as simple classifiers
o Confidence measures – population and family drift
o Distance metrics – family stability, novel family detection
Practical takeaways
• ML and “old school” malware detection are complementary
o ML can sometimes detect novel malware; compute and use confidence metrics
• The rate of change of existing malware – from the ML perspective – is slow
o Retiring seems to be more common than innovation
• There are large error bars on these estimates, and they will vary by model
and data set, but…
o Expect to see a turnover of about 1% per quarter of established samples being
replaced by novel (from the ML perspective) samples
o About 4% per quarter of your most identifiable samples will be retired
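To see what these quarterly rates would imply over a year, they can be compounded. This is an extrapolation, not a figure from the talk: it assumes each rate stays constant and applies to whatever remains after the previous quarter.

```python
def cumulative_turnover(rate_per_quarter, quarters):
    """Fraction replaced after `quarters`, assuming a constant compounding rate."""
    return 1.0 - (1.0 - rate_per_quarter) ** quarters

novel_after_year = cumulative_turnover(0.01, 4)    # roughly 3.9%
retired_after_year = cumulative_turnover(0.04, 4)  # roughly 15.1%
```

Given the error bars noted above, the yearly values are best read as rough orders of magnitude.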
Additional thanks to…
• Richard Cohen and Sophos Labs
• Josh Saxe and the rest of the DS team
• BlackHat staff and support
• … and John Tenniel for the illustrations
• Code + tools coming soon: https://github.com/inv-ds-research/red_queens_race
Measuring the Speed of the Red Queen's Race; Adaption and Evasion in Malware
