1
Understand How Machine
Learning Defends Against
Zero-Day Threats
Vinoo Thomas
Senior Product Manager
Intel Security
Rahul Mohandas
Research Manager
Intel Security
Track Sponsored by:
2
Speakers
Vinoo Thomas
Senior Product Manager
Intel Security
Rahul Mohandas
Research Manager
Intel Security
3
Agenda
• Detection Challenges
• Machine Learning Approaches
• Modeling Machine Learning classifiers
• Attacks on Machine Learning Defenses
• Real Protect
• Deep Learning in Sandbox
4
Detection Challenges
5
The Age of “Signatures” Is Fading
• This technique is reactive by nature. Although signatures are very
precise, the sheer number and growth of malware variants are making
this approach unsustainable
• Malware authors are continuously monitoring antivirus vendor
detection and releasing new variants
• Use of commercial, open source or underground packers and
protectors makes repacking new variants trivial
Signatures identify with near certainty that an object is either malicious or clean
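A minimal sketch of why exact-match signatures are precise but brittle. It assumes the simplest possible signature, a whole-file SHA-256 hash; real products also use byte patterns and other signature types. The sample bytes and the detection name `Example.Malware.A` are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Hypothetical sample and a variant that differs by one appended byte.
original = b"MZ\x90\x00" + b"payload-bytes" * 4
variant = original + b"\x00"

# Hypothetical signature database keyed by exact file hash.
signature_db = {sha256_hex(original): "Example.Malware.A"}

def lookup(data: bytes) -> str:
    """Return the detection name for an exact hash match, else 'unknown'."""
    return signature_db.get(sha256_hex(data), "unknown")

print(lookup(original))  # exact match is detected
print(lookup(variant))   # trivially modified variant is missed
```

The known sample is identified with certainty, while the one-byte variant falls through, which is the gap new variants exploit.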
6
Detection Challenges
Image: https://www2.picturepush.com
What did this
snake eat for
lunch? ;)
7
Unpacking Challenges
Think of it as a file, inside another executable file,
which can be inside another executable file
Think Russian dolls (Matryoshka)
When executed, the “outer” executable will unpack
the contents of the “inner” executable into memory
and execute it.
Image: https://www.pinterest.com
The innermost executable is the “real” executable!
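The Russian-doll idea can be sketched with nested compression layers. This is only an analogy under stated assumptions: `zlib` stands in for a real packer, the `PACK` marker is a hypothetical layer tag, and nothing is executed; the loop simply peels layers until the innermost payload is reached.

```python
import zlib

MAGIC = b"PACK"  # hypothetical marker identifying a packed layer

def pack(payload: bytes, layers: int) -> bytes:
    """Wrap the payload in the given number of compressed layers."""
    for _ in range(layers):
        payload = MAGIC + zlib.compress(payload)
    return payload

def unpack_all(blob: bytes):
    """Peel layers until no marker remains; return (payload, depth)."""
    depth = 0
    while blob.startswith(MAGIC):
        blob = zlib.decompress(blob[len(MAGIC):])
        depth += 1
    return blob, depth

inner = b"MZ the real executable body"
packed = pack(inner, layers=3)
payload, depth = unpack_all(packed)
print(depth, payload == inner)
```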
8
Field Example—Mimikatz
Source: http://blog.gentilkiwi.com/mimikatz
9
Mimikatz—Compiled Binary
10
Mimikatz—Compiled Binary
11
Mimikatz Detection
Resources, strings, packer and compiler details,
compile time, API, and function calls are readily
available for authoring signatures.
Native binary has thousands of
interesting features!
Image: http://www.abcya.com/word_clouds.htm
12
Modifying A Compiled Binary
Source: http://www.gironsec.com
13
Mimikatz—Packed with MPRESS
14
Mimikatz—Post MPRESS
Previously available static features are destroyed
and made unavailable by the packer!
Limited choices available for authoring a generic
signature.
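A rough illustration of how packing hides static string features. Assumptions: `zlib` compression stands in for MPRESS (which works differently in detail), and the strings below are illustrative examples of the kind of API and command names a native binary exposes, not an extract from a real file.

```python
import re
import zlib

PRINTABLE = re.compile(rb"[\x20-\x7e]{6,}")  # runs of 6+ printable ASCII bytes

def printable_strings(data: bytes):
    """Return printable ASCII runs, similar to the Unix `strings` tool."""
    return PRINTABLE.findall(data)

# Illustrative strings of the sort visible in an unpacked binary.
NAMES = [
    b"sekurlsa::logonpasswords",
    b"LsaEnumerateLogonSessions",
    b"CryptAcquireContextW",
    b"kernel32.dll",
]
native = b"".join(b"\x00\x00" + name + b"\x00" for name in NAMES) * 20
packed = zlib.compress(native, 9)  # stand-in for a packer

print(len(printable_strings(native)), "strings before packing")
print(len(printable_strings(packed)), "strings after packing")
print(any(name in packed for name in NAMES))
```

The original strings are still recoverable by decompressing, which mirrors why unpacking or dynamic analysis is needed to get the features back.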
15
VBS/Houdini—Initial Variant
16
VBS/Houdini—Subsequent Variants
17
Machine Learning Approaches
18
Sources of Features
Static Analysis (file type, resources, meta-data)
Fuzzy Hashing (identical byte or checksum sequences)
Import Address Hash (function calls, order of function calls)
Dynamic Analysis (file system, registry, network behaviors)
Memory Analysis (process or system memory analysis)
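One way to read the import address hash item is as a hash over the ordered list of imported functions. The sketch below is in the spirit of the widely used "imphash" idea but is not claimed to be the method used in the talk; the import lists are hypothetical and would normally come from parsing the PE import table.

```python
import hashlib

def import_hash(imports):
    """Hash an ordered list of (dll_name, function_name) pairs.

    Names are lowercased and the DLL extension is dropped, so only the
    set of functions and their order influence the result.
    """
    parts = []
    for dll, func in imports:
        stem = dll.lower().rsplit(".", 1)[0]
        parts.append(f"{stem}.{func.lower()}")
    return hashlib.md5(",".join(parts).encode()).hexdigest()

# Hypothetical import tables.
sample_a = [("KERNEL32.dll", "CreateProcessW"), ("ADVAPI32.dll", "RegSetValueExW")]
sample_b = [("kernel32.DLL", "createprocessw"), ("advapi32.dll", "RegSetValueExW")]
sample_c = list(reversed(sample_a))  # same functions, different order

print(import_hash(sample_a) == import_hash(sample_b))  # case does not matter
print(import_hash(sample_a) == import_hash(sample_c))  # order does matter
```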
19
Leveraging Multiple
Sources of Knowledge
• Identify a suspicious characteristic or activity
• The object is given a reputation and confidence level if
existing signature-based methods don’t detect it
• Pre-execution: Static file feature extraction
(file type, import hash, entry point, resources, strings,
packer and compiler details, compile time, APIs, section
names)
• Post-execution: Behavioral features and memory analysis
(behavioral sequence, process tree, file system, registry
events, network communication events, mutex, strings from
memory)
A hybrid approach provides
the best classification rates!
20
Extracting Static Features
• File type, resources, and strings
• Packer and compiler details
• Compile time, entry point
• Import address hash
• Function calls and APIs
Ransomware: CTB-Locker (pre-execution)
Image: http://www.abcya.com/word_clouds.htm
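As a sketch of static feature extraction, the compile time and entry point can be read straight from the PE headers with the standard library. The field offsets follow the published PE/COFF layout; the input here is a synthetic header built by the hypothetical helper `synthetic_pe`, not a real sample, and a production tool would validate far more.

```python
import struct
from datetime import datetime, timezone

def parse_pe_header(data: bytes) -> dict:
    """Extract a few static features from PE headers."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF header: Machine, NumberOfSections, TimeDateStamp
    machine, n_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    # Optional header follows the 20-byte COFF header;
    # AddressOfEntryPoint sits at offset 16 within it.
    (entry_point,) = struct.unpack_from("<I", data, pe_offset + 24 + 16)
    return {
        "machine": machine,
        "sections": n_sections,
        "compile_time": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "entry_point": entry_point,
    }

def synthetic_pe(timestamp: int, entry_point: int, n_sections: int = 3) -> bytes:
    """Build a minimal fake header for demonstration only."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x14C, n_sections, timestamp, 0, 0, 0xE0, 0x102)
    optional = bytearray(0xE0)
    struct.pack_into("<H", optional, 0, 0x10B)       # PE32 magic
    struct.pack_into("<I", optional, 16, entry_point)
    return bytes(dos) + b"PE\x00\x00" + coff + bytes(optional)

info = parse_pe_header(synthetic_pe(timestamp=1456790400, entry_point=0x1000))
print(info)
```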
21
Extracting Behavioral Features
File system, registry, and network changes; actions taken as it begins encrypting files
Ransomware: CTB-Locker (post-execution)
22
Building Feature Vectors
CreateProcess("c:\user\roaming\malware.exe")
CreateRegistryKey("HKLM","Software\CTB-Locker")
SetRegistryValue("InstallDate","213355533")
GetEntryPoint("Return Address", 55 EB)
Features Hash
AF12ACE76D
F2A212AC6E
22F1CAFFA8
BBAF11284E
Feature Vector
AF12ACE76D F2A212AC6E 22F1CAFFA8 BBAF11284E
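The step from observed events to a feature vector can be sketched as follows. This is an assumption about the general technique, not the product's actual encoding: each event string is hashed to a short identifier, and the "hashing trick" maps events into a fixed-length count vector. The choice of SHA-1, the 10-character identifiers, and the vector size of 16 are arbitrary.

```python
import hashlib

def feature_hash(feature: str) -> str:
    """Short, stable identifier for one observed feature string."""
    return hashlib.sha1(feature.encode()).hexdigest()[:10].upper()

def feature_vector(features, dim: int = 16):
    """Map features into a fixed-length count vector (hashing trick)."""
    vec = [0] * dim
    for feature in features:
        index = int(hashlib.sha1(feature.encode()).hexdigest(), 16) % dim
        vec[index] += 1
    return vec

events = [
    r'CreateProcess("c:\user\roaming\malware.exe")',
    r'CreateRegistryKey("HKLM","Software\CTB-Locker")',
    r'SetRegistryValue("InstallDate","213355533")',
    r'GetEntryPoint("Return Address", 55 EB)',
]

for event in events:
    print(feature_hash(event), event)
print(feature_vector(events))
```

A fixed-length vector lets samples with different numbers of events be compared by the same distance function.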
23
Unsupervised Machine Learning
Height
Weight
We are given a large set of dogs of different breeds (Chihuahuas, Beagles, Dachshunds). We can use two features to distinguish them: their height and weight. How can we determine which dog falls into which breed?
24
Similarity: Prototype-Based Clustering
Dogs
Chihuahuas
Beagles
Dachshunds
Euclidean distance
between two objects
Height
Weight
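Prototype-based clustering on the dog example can be sketched with a small k-means loop that uses Euclidean distance. The heights (cm) and weights (kg) are invented numbers, and the starting centroids are picked by hand; a real system would scale features and choose seeds more carefully.

```python
import math

def kmeans(points, centroids, iterations: int = 10):
    """Plain k-means: assign to nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Invented (height_cm, weight_kg) measurements, unlabeled.
dogs = [
    (18, 2.0), (20, 2.5), (22, 3.0),     # small and light
    (22, 8.0), (23, 9.0), (25, 10.0),    # short but heavier
    (36, 10.0), (38, 11.0), (40, 12.0),  # taller
]
seeds = [(18, 2.0), (22, 8.0), (36, 10.0)]

centroids, clusters = kmeans(dogs, seeds)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```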
25
Similarity: Classification Based on Clustering
Dogs
Beagle
Chihuahuas
Beagles
Dachshunds
Height
Weight
Euclidean distance
between two objects
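Once clusters exist, a new object can be classified by the nearest cluster prototype. A minimal sketch, assuming hypothetical centroid values for the three breeds in (height cm, weight kg):

```python
import math

# Hypothetical cluster centroids learned earlier.
CENTROIDS = {
    "Chihuahua": (20.0, 2.5),
    "Dachshund": (23.3, 9.0),
    "Beagle": (38.0, 11.0),
}

def classify(sample):
    """Return the label of the centroid closest in Euclidean distance."""
    return min(CENTROIDS, key=lambda name: math.dist(sample, CENTROIDS[name]))

print(classify((37, 10.5)))
print(classify((19, 2.0)))
```

The same idea carries over to malware: an unknown sample inherits the label of the cluster it lands closest to.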
26
Classification with Real Protect
Graphic representation of clusters with samples which are similar
27
Modeling Machine Learning Classifier
28
Modeling a Machine Learning Classifier
Input Data
• Executables, compiled code, documents
Feature Engineering
• N-grams, entropy of sections
Labels
• Is malicious or clean?
• Belongs to a certain family of malware
• Capabilities (keyloggers, backdoors)
Model
• Assigns a sample to an output class
• Support vector machines, Naïve Bayes,
random forests, neural networks
Figure: neural network diagram (hidden layers, output layer)
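The two feature-engineering items named on this slide, n-grams and section entropy, can be sketched with the standard library. The section contents below are synthetic placeholders; in practice they would come from the parsed executable.

```python
import math
from collections import Counter

def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0 for constant data, 8 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(data).values())

# Synthetic stand-ins for section contents.
sections = {
    ".rsrc": b"\x00" * 256,       # padding-like, very low entropy
    ".text": bytes(range(256)),   # every byte value once, maximal entropy
}

for name, content in sections.items():
    print(name, shannon_entropy(content))
print(byte_ngrams(b"ABABAB").most_common(2))
```

High section entropy is a common hint that content is compressed or encrypted, which ties back to the packer discussion.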
29
Attacking Machine Learning Defenses
30
Exploratory: Obfuscate to Evade Detection
31
Causative: Poisoning Sample Collections
1. Insert signature
fragments into
clean files
2. Submit samples to VirusTotal
or any other public malware
collection site
3. Trusted vendor
will start detecting
those files
4. Many vendors reshare the
samples and trust the
malicious classification
5. Vendor using malicious
sample for training models
6. Potential FP
on clean files
by the model
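The effect of the poisoning steps can be illustrated with a toy nearest-centroid model. All numbers are invented, and the two abstract features are placeholders; the point is only that clean-looking files mislabeled as malicious drag the malicious centroid toward the clean region and produce a false positive.

```python
import math

def centroid(points):
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def train(samples):
    """samples: list of (features, label). Returns label -> centroid."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}

def predict(model, features):
    return min(model, key=lambda label: math.dist(features, model[label]))

training = [
    ((0.1, 0.1), "clean"), ((0.2, 0.1), "clean"),
    ((0.1, 0.2), "clean"), ((0.2, 0.2), "clean"),
    ((0.9, 0.9), "malicious"), ((0.8, 0.9), "malicious"),
    ((0.9, 0.8), "malicious"), ((0.8, 0.8), "malicious"),
]
# Clean files carrying signature fragments, shared with a malicious label.
poison = [((0.3, 0.3), "malicious")] * 8

legit_file = (0.4, 0.4)
print(predict(train(training), legit_file))           # before poisoning
print(predict(train(training + poison), legit_file))  # after poisoning
```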
32
Causative: Poisoning Sample Collections
Source: Virus Bulletin
33
Causative: Poisoning Sample Collections
Source: Reuters
34
Defenses Against Machine
Learning Attacks
Exploratory attack
• Training data: Prevent the attacker from knowing training
data
• Feature selection: Harden classifiers against attack by
using multiple features
Causative attack: Attacker has some degree of control
over the training data. Learning should be resilient to
poisoning attacks
• Analyze training instances empirically to make the model more
resilient
• Human-in-the-loop approach
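One possible form of the empirical analysis and human-in-the-loop ideas is to screen incoming training samples against a small, vetted seed set and route disagreements to an analyst. This is a sketch of that assumption only; the feature values, the `triage` helper, and the nearest-centroid check are all hypothetical.

```python
import math

def centroid(points):
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def triage(candidates, trusted):
    """Accept candidates whose label matches the nearest trusted centroid;
    send the rest to human review instead of training on them."""
    centroids = {label: centroid(points) for label, points in trusted.items()}
    accepted, needs_review = [], []
    for features, label in candidates:
        nearest = min(centroids, key=lambda l: math.dist(features, centroids[l]))
        (accepted if nearest == label else needs_review).append((features, label))
    return accepted, needs_review

# Vetted seed set (hypothetical feature values).
trusted = {
    "clean": [(0.1, 0.1), (0.2, 0.2)],
    "malicious": [(0.8, 0.8), (0.9, 0.9)],
}
# Externally sourced samples with third-party labels.
candidates = [
    ((0.85, 0.80), "malicious"),  # consistent with trusted data
    ((0.30, 0.30), "malicious"),  # looks clean, labeled malicious
]

accepted, needs_review = triage(candidates, trusted)
print("accepted:", accepted)
print("needs review:", needs_review)
```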
35
Introducing Real Protect
36
Real Protect
• Detects zero-day malware in near real time
• Classification of malware based on behavior and static analysis
• Uses machine learning to automate classification
• Signature-less, small client footprint
• Supports both offline mode and online mode (cloud) of classification
• Improves detection up to 30% on top of .DAT and McAfee® Global Threat Intelligence detections
• Augments McAfee endpoint security products for Windows
• Produces actionable threat intelligence
• Useful for patient zero discovery, threat actor attribution and forensic investigations
• Available now!
• Standalone: www.mcafee.com/us/downloads/free-tools/raptor.aspx
• Consumer Cloud AV product
• Enterprise availability in McAfee Endpoint Security 10.5 this year
37
McAfee® Endpoint Security 10 Threat Prevention
Layered Approach
Whitelisting (Hash + Cert)
.DAT
McAfee Global Threat Intelligence
McAfee Threat Intelligence Exchange (Hash + Cert)
Real Protect - Static
Dynamic App Containment
Real Protect - Behavioral
Threat
Prevention
Web Control
Firewall
TIE
Future Modules
Pre-Execution
Post-Execution
Post-Execution
38
Deep Learning in the Sandbox
39
ATDml technology in a Nutshell
ATDml = Signatureless deep learning classifier that leverages sandboxing technology to
achieve a high-precision malware conviction rate
40
Deep Learning in the Sandbox
Figure: training and prediction framework. Labels: Malware samples; Sandbox; Original Binary; Unpacked File; Behavior; Feature Vector; Feature Normalization; Dimensionality Reduction; Deep Learning (Input Layer, Hidden Layers, Output Layer); Trained Parameters; Prediction.
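The feature normalization and dimensionality reduction stages can be sketched in plain Python. Assumptions: z-score scaling is used for normalization, and dropping constant columns stands in for dimensionality reduction (a real pipeline would more likely use PCA or a learned embedding). The feature matrix is invented.

```python
import statistics

def drop_low_variance(rows, threshold: float = 1e-9):
    """Remove columns whose variance is at or below the threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns)
            if statistics.pvariance(col) > threshold]
    return [[row[i] for i in keep] for row in rows], keep

def zscore_columns(rows):
    """Scale each column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(value - mean) / stdev
             for value, mean, stdev in zip(row, means, stdevs)]
            for row in rows]

# Invented feature matrix: one row per sample; first column never varies.
raw = [
    [1.0, 5.0, 10.0],
    [1.0, 7.0, 30.0],
    [1.0, 9.0, 20.0],
]

reduced, kept = drop_low_variance(raw)
normalized = zscore_columns(reduced)
print(kept)
print(normalized)
```

The normalized rows are what would be fed to the input layer of the network.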
41
What Are We Going to Demo Here?
1. Advanced ways of evading detection: using a
crypter to add static and
behavioral evasion
2. How deep learning in the sandbox detects
the most evasive and previously
unseen malware
Unmask the
Attack
42
43
ATDml Detection
44
ATDml Value Proposition
1. Zero-day detection by deep analysis: Efficient
classification of new and previously unseen
malware by leveraging deep learning
2. Resilience to evasion: The model is designed to be
highly resilient to evasive techniques used to bypass
detection
3. Identify intention of attack: Ability to bring in
malware attribution to identify the intention of
the attack
Intel and the Intel and McAfee logos are trademarks of Intel Corporation in the US and/or other countries. Other marks and brands may be claimed as the property of others. The product
plans, specifications and descriptions herein are provided for information only and subject to change without notice, and are provided without warranty of any kind, express or implied.
Copyright © 2016 Intel Corporation.


Understand How Machine Learning Defends Against Zero-Day Threats

  • 1. 1 Understand How Machine Learning Defends Against Zero-Day Threats Vinoo Thomas Senior Product Manager Intel Security Rahul Mohandas Research Manager Intel Security Track Sponsored by:
  • 2. 2 Speakers Vinoo Thomas Senior Product Manager Intel Security Rahul Mohandas Research Manager Intel Security
  • 3. 3 Agenda • Detection Challenges • Machine Learning Approaches • Modeling Machine Learning classifiers • Attacks on Machine Learning Defenses • Real Protect • Deep Learning in Sandbox To participate in the polling question, download the mobile app.
  • 5. 5 The Age of “Signatures” Is Fading • This technique is reactive by nature. Although very precise, the sheer number and growth in malware variants is making this unsustainable • Malware authors are continuously monitoring antivirus vendor detection and releasing new variants • Use of commercial, open source or underground packers and protectors makes repacking new variants trivial Signatures identify with near certainty that an object is either malicious or clean 1001010 1101010 1011101 010
  • 7. 7 Unpacking Challenges Think of it as a file, inside another executable file, which can be inside another executable file Think Russian dolls (Matryoshka) When executed, the “outer” executable will unpack the contents of the “inner” executable into memory and execute it. Image: https://www.pinterest.com The innermost executable is the “real” executable!
  • 11. 11 Mimikatz Detection Resources, strings, packer and compiler details, compile time, API, and function calls are readily available for authoring signatures. Native binary has thousands of interesting features! Image: http://www.abcya.com/word_clouds.htm
  • 12. 12 Modifying A Compiled Binary Source: http://www.gironsec.com
  • 14. 14 Mimikatz—Post MPRESS Previously available static features are destroyed and made unavailable by the packer! Limited choices available for authoring a generic signature.
  • 18. 18 Sources of Features 10010101 10101010 11101010 Static Analysis (file type, resources, meta-data) Fuzzy Hashing (identical byte or checksum sequences) Import Address Hash (function calls, order of function calls) Dynamic Analysis (file system, registry, network behaviors) Memory Analysis (process or system memory analysis)
  • 19. 19 Leveraging Multiple Sources of Knowledge • Identify a suspicious characteristic or activity • The object is given a reputation and confidence level if existing signature-based methods don’t detect it • Pre-execution: Static file feature extraction (file type, import hash, entry point, resources, strings, packer and compiler details, compile time, APIs, section names) • Post-execution: Behavioral features and memory analysis (behavioral sequence, process tree, file system, registry events, network communication events, mutex, strings from memory) A hybrid approach provides the best classification rates!
  • 20. 20 Extracting Static Features • File type, resources, and strings • Packer and compiler details • Compile time, entry point • Import address hash • Function calls and APIs Ransomware: CTB-Locker (pre-execution) Image: http://www.abcya.com/word_clouds.htm
  • 21. 21 Extracting Behavioral Features File system, registry, and network changes, and the actions it takes as it begins encrypting files Ransomware: CTB-Locker (post-execution)
  • 22. 22 Building Feature Vectors Features: CreateProcess("c:userroamingmalware.exe") CreateRegistryKey("HKLM","SoftwareCTB-Locker") SetRegistryValue("InstallDate","213355533") GetEntryPoint("Return Address", 55 EB) Features Hash: AF12ACE76D F2A212AC6E 22F1CAFFA8 BBAF11284E Feature Vector: AF12ACE76D F2A212AC6E 22F1CAFFA8 BBAF11284E
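One common way to turn hashed features into a fixed-length numeric vector is the hashing trick. The sketch below assumes that approach; the slide does not say how the vector is actually built. The function `feature_vector`, the bucket count of 16, and the simplified event strings are hypothetical.

```python
import hashlib

def feature_vector(events, size=16):
    """Map a list of event strings to a fixed-length count vector."""
    vec = [0] * size
    for event in events:
        digest = hashlib.sha256(event.encode("utf-8")).digest()
        # Use the first 4 bytes of the digest to choose a bucket.
        bucket = int.from_bytes(digest[:4], "big") % size
        vec[bucket] += 1
    return vec

# Simplified, hypothetical event strings modelled on the slide's examples.
events = [
    'CreateProcess("malware.exe")',
    'CreateRegistryKey("HKLM","CTB-Locker")',
    'SetRegistryValue("InstallDate","213355533")',
]
vec = feature_vector(events)
print(vec, sum(vec))
```

The vector length stays constant no matter how many distinct events a sample produces, at the cost of occasional collisions between unrelated events.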
  • 23. 23 Unsupervised Machine Learning We are given a large set of dogs of different breeds (Chihuahuas, Beagles, Dachshunds). We can use two features to distinguish them: their height and weight. How can we determine which dog falls into which breed? (Plot axes: Height, Weight)
  • 25. 25 Similarity: Classification Based on Clustering (Plot of Chihuahuas, Beagles, and Dachshunds; axes: Height, Weight) Euclidean distance between two objects
  • 26. 26 Classification with Real Protect Graphic representation of clusters of similar samples
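The dog example can be made concrete with a small k-means sketch that groups points by Euclidean distance. The measurements (height in cm, weight in kg) and the hand-picked starting centroids are invented for illustration, and k-means is an assumed choice of clustering algorithm; the slides do not name one.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centroids, rounds=10):
    """Alternate assignment and centroid update for a fixed number of rounds."""
    for _ in range(rounds):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, assign(points, centroids)

# Illustrative (height_cm, weight_kg) values, not real breed statistics.
dogs = [(18, 2), (20, 2.5), (22, 3),      # small and light
        (21, 9), (23, 10), (22, 11),      # short but heavier
        (36, 10), (38, 11), (40, 12)]     # taller
centroids, clusters = kmeans(dogs, [dogs[0], dogs[3], dogs[6]])

# Classify a new dog by its nearest cluster centre.
new_dog = (37, 10.5)
label = min(range(len(centroids)), key=lambda i: euclidean(new_dog, centroids[i]))
print(centroids, label)
```

No breed labels are supplied during clustering; the groups emerge from distance alone, and a new sample is then classified by whichever cluster centre it lands closest to.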
  • 28. 28 Modeling a Machine Learning Classifier Input Data • Executables, compiled code, documents Feature Engineering • N-grams, entropy of sections Labels • Is malicious or clean? • Belongs to a certain family of malware • Capabilities (keyloggers, backdoors) Model • Assigns a sample to an output class • Support vector machines, Naïve Bayes, random forests, neural networks Input Layer Hidden Layers Output Layer
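The two feature-engineering examples named on this slide, n-grams and entropy, can be sketched with the standard library. The function names are hypothetical, and the code works on raw byte strings rather than parsed executable sections, which is a simplifying assumption.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty or constant input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every overlapping run of n bytes."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))

padding = b"\x00" * 64        # constant bytes: lowest possible entropy
uniform = bytes(range(256))   # every byte value once: highest possible entropy
print(shannon_entropy(padding), shannon_entropy(uniform))
print(byte_ngrams(b"ABAB"))
```

Compressed or encrypted data tends toward the high end of the entropy scale, which is why section entropy is often used as a hint that a file has been packed.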
  • 30. 30 Exploratory: Obfuscate to Evade Detection
  • 31. 31 Causative: Poisoning Sample Collections 1. Insert signature fragments into clean files 2. Submit samples to VirusTotal or any other public malware collection site 3. Trusted vendor will start detecting those files 4. Many vendors reshare the samples and trust the malicious classification 5. Vendor using malicious sample for training models 6. Potential FP on clean files by the model
  • 32. 32 Causative: Poisoning Sample Collections Source: Virus Bulletin
  • 33. 33 Causative: Poisoning Sample Collections Source: Reuters
  • 34. 34 Defenses Against Machine Learning Attacks Exploratory attack • Training data: Prevent the attacker from knowing the training data • Feature selection: Harden classifiers against attack by using multiple features Causative attack: The attacker has some degree of control over the training data, so learning should be resilient to poisoning attacks • Do empirical analysis of training instances to make learning more resilient • Human-in-the-loop approach
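One possible form of "empirical analysis of training instances" is to flag samples whose label disagrees with their nearest neighbours and send them for human review. This is an assumed technique chosen for illustration; the slide does not specify one. The function `suspicious_labels`, the two-dimensional feature values, and `k=3` are all hypothetical.

```python
import math
from collections import Counter

def suspicious_labels(samples, k=3):
    """Return indices whose label differs from the majority of their k nearest neighbours."""
    flagged = []
    for i, (vec, label) in enumerate(samples):
        # Distance from sample i to every other sample, with that sample's label.
        others = [(math.dist(vec, v), l) for j, (v, l) in enumerate(samples) if j != i]
        others.sort(key=lambda t: t[0])
        votes = Counter(l for _, l in others[:k])
        majority, _ = votes.most_common(1)[0]
        if majority != label:
            flagged.append(i)
    return flagged

# Illustrative training set: two tight groups plus one mislabeled (poisoned) point.
training = [
    ((0.1, 0.2), "clean"), ((0.2, 0.1), "clean"),
    ((0.0, 0.3), "clean"), ((0.3, 0.0), "clean"),
    ((5.0, 5.1), "malicious"), ((5.2, 4.9), "malicious"),
    ((4.9, 5.0), "malicious"), ((5.1, 5.2), "malicious"),
    ((0.15, 0.15), "malicious"),  # looks clean but is labeled malicious
]
print(suspicious_labels(training))
```

Flagged samples are not dropped automatically in this sketch. Routing them to an analyst matches the human-in-the-loop point on the slide and avoids discarding genuine but unusual malware.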
  • 36. 36 Real Protect • Detects zero-day malware in near real time • Classification of malware based on behavior and static analysis • Uses machine learning to automate classification • Signature-less, small client footprint • Supports both offline mode and online mode (cloud) of classification • Improves detection up to 30% on top of .DAT and McAfee® Global Threat Intelligence detections • Augments McAfee endpoint security products for Windows • Produces actionable threat intelligence • Useful for patient zero discovery, threat actor attribution and forensic investigations • Available now! • Standalone: www.mcafee.com/us/downloads/free-tools/raptor.aspx • Consumer Cloud AV product • Enterprise availability in McAfee Endpoint Security 10.5 this year
  • 37. 37 McAfee® Endpoint Security 10 Threat Prevention Layered Approach Whitelisting (Hash + Cert) .DAT McAfee Global Threat Intelligence McAfee Threat Intelligence Exchange (Hash + Cert) Real Protect - Static Dynamic App Containment Real Protect - Behavioral Threat Prevention Web Control Firewall TIE Future Modules Pre-Execution Post-Execution Post-Execution
  • 38. 38 Deep Learning in the Sandbox
  • 39. 39 ATDml Technology in a Nutshell ATDml = Signatureless deep learning classifier that leverages sandboxing technology to achieve a high-precision malware conviction rate
  • 40. 40 Deep Learning in the Sandbox Malware samples Sandbox Original Binary Feature Vector Behavior Trained Parameters Prediction Training Prediction Framework Feature Vector Feature Normalization Dimensionality reduction Unpacked File Deep Learning Output Layer Hidden Layers Input Layer
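The "Feature Normalization" and "Dimensionality reduction" stages in this pipeline can be illustrated with a small stand-in. The slide does not say which methods are used, so the sketch assumes min-max scaling followed by dropping near-constant columns, which is much simpler than techniques such as PCA. The column meanings and values are invented.

```python
def min_max_normalize(rows):
    """Scale every column to the range [0, 1]; constant columns become 0.0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def drop_low_variance(rows, threshold=0.01):
    """Keep only columns whose population variance exceeds the threshold."""
    keep = []
    for idx, col in enumerate(zip(*rows)):
        mean = sum(col) / len(col)
        variance = sum((x - mean) ** 2 for x in col) / len(col)
        if variance > threshold:
            keep.append(idx)
    return [[row[i] for i in keep] for row in rows], keep

# Hypothetical raw features per sample: (API call count, constant flag, bytes written).
raw = [[10, 1, 200], [20, 1, 400], [30, 1, 300]]
normalized = min_max_normalize(raw)
reduced, kept = drop_low_variance(normalized)
print(reduced, kept)
```

Normalizing first keeps features with large raw ranges from dominating, and removing uninformative columns shrinks the input the network has to learn from.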
  • 41. 41 What Are We Going to Demo Here? 1. Advanced ways of evading detection by using a crypter that adds static and behavioral evasion 2. How deep learning in the sandbox is able to detect the most evasive and previously unseen malware Unmask the Attack
  • 44. 44 ATDml Value Proposition 1. Zero-day detection by deep analysis: Efficient classification of new and previously unseen malware by leveraging deep learning 2. Resilience to evasion: The model is designed to be highly resilient to evasive techniques used to bypass detection 3. Identify intention of attack: Ability to bring in malware attribution to identify the intention of the attack
  • 45. Intel and the Intel and McAfee logos are trademarks of Intel Corporation in the US and/or other countries. Other marks and brands may be claimed as the property of others. The product plans, specifications and descriptions herein are provided for information only and subject to change without notice, and are provided without warranty of any kind, express or implied. Copyright © 2016 Intel Corporation.

Editor's Notes

  1. Polymorphic and metamorphic malware; rootkits and bootkits; sandbox-aware malware; attacks on disassembly and packing; behavioral polymorphism
  2. Created using: http://www.abcya.com/word_clouds.htm
  3. Inspired by the inner workings of the human brain. A loose model of the human brain that can be programmed in a computer. A neural network learns from observational data, figuring out its own solution to the problem. Used in areas such as pattern recognition and data classification.
  4. Nop insertion, register renaming, junk insertion, instruction reordering, encryption, compression, branch condition modification, instruction substitution, OS fingerprinting, interaction based, system tampering, latent execution, hypervisor detection, basic block reordering