SO YOU GOT A MODEL…
DR. SVEN KRASSER, CHIEF SCIENTIST
@SVENKRASSER
A 5-MINUTE RUNDOWN OF THE COMMON AND NOT-SO-COMMON PITFALLS OF APPLYING MACHINE LEARNING IN INFORMATION SECURITY
2017 CROWDSTRIKE, INC. ALL RIGHTS RESERVED.
MACHINE LEARNING AT CROWDSTRIKE
§ ~40 billion events per day
§ ~800 thousand events per second peak
§ ~700 trillion bytes of sample data
§ Local decisions on endpoint and large-scale analysis in cloud
§ Static and dynamic analysis techniques, various rich data sources
§ Analysts generating new ground truth 24/7
CHALLENGES FOR APPLIED ML
FALSE POSITIVE RATE
§ Most events are associated with clean executions
§ Most files on a given system are clean
§ Therefore, even low FPRs cause large numbers of FPs
§ Industry expectations driven by performance of narrow signatures
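The base-rate arithmetic behind these bullets can be sketched as follows. This is a rough illustration only: the event volume is the ~40 billion per day quoted earlier in the deck, while the FPR values and the assumption that every event yields one model decision on clean activity are hypothetical.

```python
# Worked arithmetic only: hypothetical FPR values, and the simplifying
# assumption that every event yields one model decision on clean activity.
EVENTS_PER_DAY = 40_000_000_000  # ~40 billion events per day (from the deck)


def expected_false_positives(clean_decisions: int, fpr: float) -> float:
    """Expected number of false positives for a given false positive rate."""
    return clean_decisions * fpr


if __name__ == "__main__":
    for fpr in (1e-3, 1e-6, 1e-9):
        fps = expected_false_positives(EVENTS_PER_DAY, fpr)
        print(f"FPR {fpr:.0e}: about {fps:,.0f} false positives per day")
```

Under these assumptions, even a one-in-a-million FPR would still produce tens of thousands of false positives per day.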
TRUE POSITIVE RATE
Repeated independent trials guarantee adversary success
§ Security cannot be solved with a single ML model
§ Need to consider various data sources (pre and post-execution)
§ Augment with non-ML techniques
[Figure: chance of at least one success for the adversary vs. number of attempts at a 99% detection rate; the curve rises from 1% to >99.3%]
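The curve on this slide follows from treating each attempt as an independent trial: with detection rate d, the chance that at least one of n attempts succeeds is 1 - d^n. A minimal sketch, assuming independence as the slide does; the function names are illustrative, and the 500-attempt figure is an inference from the plotted end point, not something the slide states.

```python
import math


def p_at_least_one_success(detection_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries evades detection."""
    return 1.0 - detection_rate ** attempts


def attempts_needed(detection_rate: float, target: float) -> int:
    """Smallest number of attempts whose success chance reaches `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(detection_rate))


if __name__ == "__main__":
    for n in (1, 10, 100, 500):
        print(f"{n:>4} attempts: {p_at_least_one_success(0.99, n):.2%}")
    print("attempts to reach 99.3%:", attempts_needed(0.99, 0.993))
```

A single attempt succeeds 1% of the time, but the success chance passes 99.3% after a little under 500 attempts, which is why the slide argues for layering several data sources and non-ML techniques rather than relying on one model.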
UNWIELDY DATA
§ Many outliers
§ Multimodal distributions
§ Sometimes narrow modes far apart
§ Adversary-controlled features
§ Mix of sparse/dense and discrete/continuous features
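One way to see why outliers and narrow, far-apart modes are awkward is to compare two common feature scalings on synthetic data. The values below are invented for illustration and do not come from the deck; the rank transform is just one possible alternative to z-scoring, not a method the slide endorses.

```python
import bisect
import statistics

# Hypothetical feature: two narrow modes far apart plus one extreme outlier.
low_mode = [9.9, 10.0, 10.1, 10.0] * 25
high_mode = [999.8, 1000.0, 1000.2, 1000.0] * 25
outlier = [1e7]
values = low_mode + high_mode + outlier

mean = statistics.fmean(values)
stdev = statistics.pstdev(values)
ordered = sorted(values)


def zscore(x: float) -> float:
    """Standard scaling; the single outlier dominates mean and spread."""
    return (x - mean) / stdev


def rank(x: float) -> float:
    """Empirical CDF value: fraction of observations less than or equal to x."""
    return bisect.bisect_right(ordered, x) / len(ordered)


# Distance between the two modes after each transform.
z_gap = zscore(1000.0) - zscore(10.0)
rank_gap = rank(1000.0) - rank(10.0)

if __name__ == "__main__":
    print(f"z-score gap between modes: {z_gap:.4f}")
    print(f"rank gap between modes:    {rank_gap:.4f}")
```

With z-scoring, the outlier inflates the standard deviation so much that the two modes become nearly indistinguishable, while the rank transform keeps them well separated.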
DIFFERENCE IN DISTRIBUTIONS
Training set distribution generally differs from…
§ Real-world distribution (customer networks)
§ Evaluations (what customers test)
§ Testing houses (various 3rd party testers with varying methodologies)
§ Community resources (e.g. user submissions to CrowdStrike scanner on VirusTotal)
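A small calculation shows why a mismatch between training and deployment distributions matters. The sketch below holds a model's TPR and FPR fixed and changes only the share of malicious samples; all numbers are hypothetical and chosen only to make the effect visible.

```python
def precision(tpr: float, fpr: float, prevalence: float) -> float:
    """Fraction of detections that are truly malicious, via Bayes' rule."""
    true_pos = tpr * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical operating point: 99% TPR at 1% FPR.
    tpr, fpr = 0.99, 0.01
    # Hypothetical prevalence of malicious samples in each population.
    populations = {
        "balanced training set": 0.5,
        "evaluation corpus": 0.1,
        "customer network": 1e-4,
    }
    for name, prevalence in populations.items():
        print(f"{name:>22}: precision {precision(tpr, fpr, prevalence):.2%}")
```

The same model looks near-perfect on the balanced set yet yields mostly false alarms once malicious samples are rare, so results measured on one distribution do not transfer to another.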
REPEATABLE SUCCESS
Or: the second model needs to be cheaper
§ Retraining cadence
§ Concept drift
§ Changes in data content (e.g. event field definitions)
§ Changes in data distribution (e.g. event disposition)
§ Data cleansing is expensive (conventional wisdom)
§ Needs automation
§ Labeling can be expensive
§ Ephemeral instances (data content or distribution changed)
§ Lack of sufficient observations
§ Embeddings and intermediate models
§ Keep track of input data
§ Keep track of ground truth budget
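Automating a check for changes in data distribution could look roughly like the sketch below. It compares the mix of event dispositions in a baseline window against a current window using total variation distance. The disposition labels, the sample counts, and the 0.05 alert threshold are all hypothetical, and the deck does not say which drift measure is actually used.

```python
from collections import Counter
from typing import Iterable


def total_variation_distance(baseline: Iterable[str], current: Iterable[str]) -> float:
    """Half the L1 distance between two empirical categorical distributions."""
    base_counts, cur_counts = Counter(baseline), Counter(current)
    n_base, n_cur = sum(base_counts.values()), sum(cur_counts.values())
    if n_base == 0 or n_cur == 0:
        raise ValueError("both windows need at least one observation")
    categories = set(base_counts) | set(cur_counts)
    # Counter returns 0 for categories missing from one window.
    return 0.5 * sum(
        abs(base_counts[c] / n_base - cur_counts[c] / n_cur) for c in categories
    )


DRIFT_THRESHOLD = 0.05  # hypothetical alerting threshold

if __name__ == "__main__":
    # Hypothetical disposition labels for two time windows.
    last_month = ["clean"] * 98 + ["malicious"] * 2
    this_week = ["clean"] * 90 + ["malicious"] * 5 + ["unknown"] * 5
    tvd = total_variation_distance(last_month, this_week)
    print(f"TVD = {tvd:.3f}, retrain candidate: {tvd > DRIFT_THRESHOLD}")
```

A check like this is cheap to run on every data refresh, which fits the slide's point that the second model needs to be cheaper than the first.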
IJCNN 2017
