How to NEUTRALIZE Machine Learning based Anti-Malware Software
JunSeok Seo (boanproject) + JaeHwan Kim (Korea Univ)
2017. 7. 12
Who we are
• Jun-Seok, Seo (nababora)
• Vice President of Boanproject ( start-up )
• Study for Teaching – Vuln Analysis, IoT, ML, Malware
• Interested in AI, ML, especially ‘adversarial ML’
• nababora@naver.com
• Jae-Hwan, Kim
• Researcher, Data Scientist
• Interested in Machine Learning for data analysis
• edenkim519@korea.ac.kr
Background
• We live in a data-driven world; everything is data
• We have no choice but to use ‘data’, ‘machine learning’, ‘AI’
• AI uses machine learning as a core engine
• Machine learning is the de facto ultimate solution in information security...?!
• Can we fully trust decisions made by machines ?
What if ?!
ML in Information Security
• Spam Filtering
• Based on the probability of each word in the e-mail contents
• Network Traffic Analysis
• Find malicious traffic with anomaly detection
• Incident Prevention & Response
• Find abnormal ‘PATTERN’ in data ( system log, traffic, application log, etc )
• Malware Detection
• What I am going to show you today
What is ML
• Machine Learning is
• the field that gives computers the ability to learn without being explicitly programmed
• the study and construction of algorithms that can learn from and make predictions on data
• It is just a way of drawing a line ( what ? how ? where ? )
ML Process
[ Gutierrez-Osuna (2011), PRISM ]
‘ FEATURE ’ is the key !
• Probability distribution
• Correlation Analysis
• Euclidean Distance
• Entropy
• Bayes Theorem
• BackPropagation
• ...
So, How to learn?
http://www.hongyusu.com/programming/2015/10/10/novelty-detection/
• Adversarial Machine Learning is
• a research field that lies at the intersection of ML and computer security
• it aims not only to violate security, but also to compromise learnability
• Arms Race Problem
• arms race between the adversary and the learner
• ‘reactive’ ( security by obscurity ) / ‘proactive’ ( security by design )
Adversarial Machine Learning
• EvadeML: forces a malicious PDF detector (ML) to make wrong predictions
• https://github.com/uvasrg/EvadeML
• AdversariaLib: algorithms focused on sklearn and neural networks
• http://pralab.diee.unica.it/en/AdversariaLib
• Explaining and Harnessing Adversarial Examples
• https://pdfs.semanticscholar.org/bee0/44c8e8903fb67523c1f8c105ab4718600cdb.pdf
• Pwning Deep Learning Systems
• https://www.slideshare.net/ClarenceChio/machine-duping-101-pwning-deep-learning-systems
Adversarial Examples
• Spam Filtering with ‘WORD’ probability
• It’s black boxed, but
Examples: Spam Filtering
https://alexn.org/blog/2012/02/09/howto-build-naive-bayes-classifier.html
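To make the ‘WORD’ probability idea concrete, here is a minimal naive Bayes sketch. It is not the filter from the linked post; the four training messages, the function names, and the Laplace smoothing are assumptions chosen for illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count words per label. messages: list of (text, label), label in {'spam', 'ham'}."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts

def spam_probability(text, word_counts, doc_counts):
    """Return P(spam | text) under naive Bayes with Laplace (add-one) smoothing."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in text.lower().split():
            # Per-word likelihood; +1 keeps unseen words from zeroing the product.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        log_score[label] = score
    # Normalize the two log scores into a probability.
    top = max(log_score.values())
    weights = {label: math.exp(score - top) for label, score in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

# Hypothetical toy corpus, for illustration only.
corpus = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]
word_counts, doc_counts = train(corpus)
print(round(spam_probability("free money", word_counts, doc_counts), 3))
print(round(spam_probability("meeting tomorrow", word_counts, doc_counts), 3))
```

The classifier only ever sees word frequencies, which is why the choice of words in a message matters so much to the score.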
Attack Taxonomy
               | Causative (online learning), on training phase           | Exploratory, on testing phase
Targeted       | Classifier is mis-trained on particular positive samples | Misclassifying a specific subset of positive samples
Indiscriminate | Classifier is mis-trained generally on positive samples  | Misclassifying positive samples generally
• Targeted Exploratory Integrity Attack (TEIA)
• It is based on game theory: maximize the false negatives
• condition: ‘the number of permitted queries is sufficiently large’
• but, can you understand this formula?
* False negative - a test result indicates that a condition is absent while it is actually present ( here: malware classified as benign )
Attack Taxonomy
Intuition, rather than formula
Attack Taxonomy
Attack Model
Causative
Exploratory
Adversary Knowledge
Black box
Zero Knowledge = only input and output
1 or 0
Adversary Knowledge
Testing / training Samples
Adversary Knowledge
Features
Adversary Knowledge
Architecture
Scores
Adversary Knowledge
Hyper-Parameters
Training Tools
Adversary Knowledge
Hyper-Parameters
Training Tools
Architecture
Scores
Testing / training Samples
Features
In the real world, none of them are available !
Can you find a sniper ?!
Adversarial Environment
• build as many of your own features, parameters, and models as possible
• as if the adversary had knowledge of the ‘4 key factors’ (white-box)
• only the validation process is done in a black-box environment
Learning Testing
real world
application
validation
repeat until complete mission !
√ virusshare
√ malwr
√ random
√ malwr
√ win default
√ portable app
√ pe header
√ section info
√ packer(yara)
√ entropy
√ n-gram
√ image
√ API
√ Behavior
virustotal
check
√ benign.csv
√ malware.csv
√ benign_images
√ mal_images
√ neural network
√ svm
√ random forest
√ adaboost
√ shuffle data
√ cross-validation
√ unseen sample
Malware
Detection
System
Malware Detection System
It will be uploaded to Github soon !
‘ Python + Scikit-Learn + Tensorflow ’
Metadata Code Pattern
Static
API API sequence
Dynamic
Function Type
Image
Feature Extraction
• Only focused on 32-bit PE malware
Feature Extraction
• Metadata
• PE header + Section information
• Total 68 features → Thanks to ClaMP(https://github.com/urwithajit9/ClaMP)
• originally 69 features, 69th is ‘packertype’ (one-hot encoding → 173 features)
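The jump from 69 to 173 features comes from one-hot encoding the categorical ‘packertype’ column: 68 numeric features plus one binary column per distinct packer (173 − 68 = 105 columns, if the counts on the slide are read that way). A minimal sketch of the encoding, where the packer names and the `one_hot` helper are hypothetical:

```python
def one_hot(values):
    """Map each categorical value to a binary vector with a single 1."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows

# Hypothetical packer labels for four samples.
packers = ["UPX", "none", "ASPack", "UPX"]
categories, encoded = one_hot(packers)
print(categories)
for label, row in zip(packers, encoded):
    print(label, row)
```

Each sample's 68 numeric features would then be concatenated with its one-hot row to form the final vector.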
Feature Extraction
• Code Pattern
• extract code pattern from disassembled code ← ‘code’ section
• using n-gram analysis used in text-mining area: 4-gram
mov cx ,count
mov dx,13
mov ah,2
int 21h
mov dl,10
mov ah,2
int 21h
loop first
mov ax,4c00h
int 21h
1-gram
1: {mov, 6}
2: {int, 3}
3: {loop, 1}
4-gram
1: {mov mov mov int, 1}
2: {mov mov int mov, 1}
3: {mov int mov mov, 1}
4: {int mov mov int, 1}
5: {mov mov int loop, 1}
6: {mov int loop mov, 1}
7: {int loop mov int, 1}
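The counts above can be reproduced with a few lines of Python. This is a sketch, not the extraction code used in the talk: it assumes the disassembly is already available as text and that the first token of each line is the opcode.

```python
from collections import Counter

def opcode_ngrams(opcodes, n):
    """Count every run of n consecutive opcodes (sliding window, step 1)."""
    return Counter(
        " ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )

# The listing from the slide.
listing = """
mov cx ,count
mov dx,13
mov ah,2
int 21h
mov dl,10
mov ah,2
int 21h
loop first
mov ax,4c00h
int 21h
"""

# Keep only the mnemonic (first token) of each instruction.
opcodes = [line.split()[0] for line in listing.strip().splitlines()]

one_grams = opcode_ngrams(opcodes, 1)
four_grams = opcode_ngrams(opcodes, 4)
print(one_grams.most_common())
print(len(four_grams), "distinct 4-grams")
```

Over a whole dataset, the most frequent n-grams become the columns of the feature vector and each sample's counts become its row.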
Feature Extraction
• Image
• PE file into image ( gray scale )
• file size is different – different image size → make thumbnail : 256 x 256
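A minimal sketch of the file-to-image step, in plain Python. The slides do not say how the row width was chosen or which resize method was used, so the fixed width of 256 bytes per row and the nearest-neighbour resize are assumptions.

```python
import math

def bytes_to_grayscale(data, width=256):
    """Lay the bytes out row by row; each byte is one pixel (0-255).

    The last row is padded with zeros (assumption) so every row has `width` pixels.
    """
    height = max(1, math.ceil(len(data) / width))
    padded = data + bytes(width * height - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]

def resize_nearest(image, size=256):
    """Nearest-neighbour resize to size x size, so all files give equal-sized images."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // size][c * src_w // size] for c in range(size)]
        for r in range(size)
    ]

# Stand-in for file contents; a real run would read a file opened in 'rb' mode.
data = bytes(range(256)) * 300
image = bytes_to_grayscale(data)
thumbnail = resize_nearest(image)
print(len(image), "x", len(image[0]), "->", len(thumbnail), "x", len(thumbnail[0]))
```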
(80 / 20)     n.feat    SVM      R.F      Ada      DNN      CNN
PE            68        91.3 %   97.5 %   95.7 %   92.8 %   -
PE + Packer   173       91.8 %   99.8 %   99.8 %   93.8 %   -
N-gram        10000     87.3 %   99.9 %   100 %    100 %    -
Image         28 x 28   -        -        -        -        99.8 %
Modeling
• Result
• Using 10-Fold cross validation
• 30000 malware samples / 4000 benign samples
• Accuracy
1024 deep x 4 layer
MY TARGET !
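A minimal sketch of 10-fold cross-validation with scikit-learn. The data is synthetic (`make_classification`) and only mimics the shape of the problem: 68 features and roughly the 30000 : 4000 class ratio, scaled down. The hyper-parameters are assumptions, not the ones used for the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in: class 0 = benign (~12 %), class 1 = malware (~88 %).
X, y = make_classification(
    n_samples=680,
    n_features=68,
    n_informative=10,
    weights=[4 / 34, 30 / 34],
    random_state=0,
)

clf = RandomForestClassifier(n_estimators=50, random_state=0)

# Stratified folds keep the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")

print("per-fold accuracy:", [round(float(s), 3) for s in scores])
print("mean accuracy:", round(float(scores.mean()), 3))
```

With classes this imbalanced, accuracy alone can look high even for a weak model, so per-class metrics are worth checking as well.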
Attack Scenario
ML model
ML model
mal & benign
samples
load
adversarial
module
extract features get_proba
+ virustotal
modeling
• Target : random forest and CNN (deep learning) models
1. Get the probability of a sample from the random forest
2. Get feature importance from the random forest
3. Feature analysis ( divided into 4 classes )
4. Overwrite output features and find the critical point
5. Disguise a malware sample as a benign sample
6. Validation
Attack Process
• scikit-learn provides predict_proba ← predicts class probabilities
• adversary can estimate the impact of modification using this function
Predict_proba
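A minimal sketch of what `predict_proba` returns, on synthetic data. The dataset, the 80 / 20 split, and the hyper-parameters are assumptions; only the `predict_proba` call itself is the point.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for 68-dimensional feature vectors (assumption).
X, y = make_classification(
    n_samples=500, n_features=68, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# One row per sample, one column per class; columns follow clf.classes_.
proba = clf.predict_proba(X_test[:3])
for row in proba:
    print("  ".join(f"class {c}: {p:.2f}" for c, p in zip(clf.classes_, row)))
```

For a random forest the value is the average of the per-tree class probabilities, so it moves gradually as the input features change rather than flipping between 0 and 1.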
• using a random forest, you can get the feature importance of all features
• there is no dominant feature → the top-1 feature has only 12% importance
• so, just the top 20 features are used for disguise
Feature Importance
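A minimal sketch of ranking features by importance with scikit-learn's `feature_importances_` attribute. The data is synthetic and the hyper-parameters are assumptions, so the printed values will not match the 12 % quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for 68-dimensional feature vectors (assumption).
X, y = make_classification(
    n_samples=500, n_features=68, n_informative=10, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized so that they sum to 1.
importances = clf.feature_importances_

# Pair each feature index with its importance and sort in descending order.
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
top20 = ranked[:20]

for index, score in top20[:5]:
    print(f"feature {index:2d}: {score:.3f}")
print("share of total importance in top 20:", round(sum(s for _, s in top20), 3))
```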
• draw histogram, boxplot from all feature vectors
• categorize features into four classes and compare them with importance data
• distribution almost same / different number of outlier → 9 / 18
• different distribution → 4 / 7
• similar distribution → 7 / 19
• almost same → 0 / 24
Feature Analysis
• Just overwrite the feature array ( benign → malware ), one class from the feature analysis at a time
• for a sample with 100 percent malware probability ( 0 : 100 )
• just one class – probability changed to 90 % ( 10 : 90 )
• two classes – probability did not change ( still 10 : 90 )
• three classes – probability dropped to 35 % ( 65 : 35 ) ← bypass classifier !
Overwrite Headers
Extract Features Predict proba [ 0.25 0.75 ]
overwrite
benign: [ 1 0 1 0 1 ]
malware: [ 0 1 0 1 0 ]
malware: [ 1 0 1 0 1 ]
benign malware
• overwriting only the extracted features ← meaningless!
• need to change the binary itself
• ok to overwrite (39) – timestamp, checksum, characteristics, linker info, etc
• need to respect the specification (5) - entropy of whole file, sections, entrypoint, filesize
• After overwriting features from a benign sample into the malware sample ( 39 features )
• Probability dropped by 15 % ( 0 : 100 → 15 : 85 )
• VirusTotal result : 38 → 32 ( what the ?! )
File modification
• I just wrote adversarial attack code for my own ML model, but ?!
• decided to keep checking the virustotal report ☺
File modification
• Entropy is a measure of unpredictability of the state, or equivalently, of its
average information content
• Entropy of file or specific section can be used as a feature for ML
• It’s not a simple job to change entropy of a binary
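For reference, the byte-level Shannon entropy used as a feature is H = −Σ p(b) · log2 p(b) over the 256 possible byte values, so it ranges from 0 (a single repeated byte) to 8 bits per byte (uniformly distributed bytes). A minimal sketch; the function name is hypothetical:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum(
        (count / total) * math.log2(count / total) for count in counts.values()
    )

print(shannon_entropy(b"\x00" * 1024))          # constant data
print(shannon_entropy(bytes(range(256)) * 4))   # every byte value equally often
print(round(shannon_entropy(b"hello world"), 3))
```

The same function can be applied to a whole file or to the raw bytes of a single section. Compressed or packed data typically scores close to 8.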
File modification
• fit malware’s entropy to benign sample
File modification
[ Figure: three PE layouts ( DOS Header, NT Headers, Section Headers, .text, .data ): the malware, the malware with injected bytes, and the benign sample. Bytes are injected to fit the code section, the data section, and the whole file. ]
• After changing both the 39 features and the entropy
• Virustotal detection dropped to 26 !
File modification
• Actually, I didn’t account for the impact of the APIs the malware uses
• I was curious, so I packed the malware and ran the same test again
File modification
← detection rate dropped after simply packed original file
← adversarial attack on packed file
• then, what about ‘wannacry’ malware sample ?!
• pick a random sample from my dataset and query to virustotal
• ok, let’s start ☺
Model Validation
• first step > pass the binary to the adversarial model ( benign: procexp.exe )
• second step > pass the binary ( from the first step ) to my ML model
Model Validation
couldn’t bypass my ML model
• third step > upx packing and adversarial model
• fourth step > query to the virustotal ( upx + adv )
???? BYPASSED ML BASED AV ~ yeah~ ☺
Model Validation
still... malware
• What if the AV uses deep learning to classify malware?
• Candidate models
• DNN – no different from other machine learning algorithms ( just a deep neural network )
• CNN – uses the binary image as features
• RNN (API sequence) – uses behavior analysis; extracts API sequence info during execution
• main idea – add garbage bytes to the end of the binary. That’s it !
Adversarial Deep Learning
random byte
Summary
• developed an adversarial model using only static features ( PE metadata )
• even building your own model → doesn’t tell you the exact answer
• UPX can be used as a camouflage tool
• extract as many features as you can → leads to a more robust adversarial model
• the adversarial model can also affect traditional AV software ( signature based )
• Expand Feature Vector – API, behavior information
• Reinforcement Machine Learning Model – automatic adversarial attack
• Virustotal Detection Rate ‘Zero’
• Develop adversarial testing framework for anti-virus software
Future work
• Can Machine Learning Be Secure? - Marco Barreno et al
• Adversarial Machine Learning – J.D. Tygar
• Adversarial and Secure Machine Learning – Huang Xiao
• Adversarial Reinforcement Learning – William Uther et al
• Adversarial Machine Learning – Ling Huang
• Adversarial Examples in the physical world – Alexey Kurakin et al
• Adversarial Examples in Machine Learning – Nicolas Papernot
• Explaining and harnessing adversarial examples – Ian J. Goodfellow et al
• Machine Learning in adversarial environments – Pavel Laskov et al
References
Thank you
any question? nababora@naver.com




Adversarial machine learning for av software

  • 1. How to NEUTRALIZE Machine Learning based Anti-Malware Software JunSeok Seo (boanproject) + JaeHwan Kim (Korea Univ) 2017. 7. 12
  • 2. Who we are • Jun-Seok, Seo (nababora) • Vice President of Boanproject ( start-up ) • Study for Teaching – Vuln Analysis, IoT, ML, Malware • Interested in AI, ML, especially ‘adversarial ML’ • nababora@naver.com • Jae-Hwan, Kim • Researcher, Data Scientist • Interested in Machine Learning for data analysis • edenkim519@korea.ac.kr
  • 3. Background • We live in a data-driven world; everything is data • We have no choice but to use ‘data’, ‘machine learning’, ‘AI’ • AI uses machine learning as a core engine • Machine learning is the de facto ultimate solution in information security...?! • Can we fully trust decisions made by machines ?
  • 5. ML in Information Security • Spam Filtering • Based on the probability of each word in e-mail contents • Network Traffic Analysis • Find malicious traffic with anomaly detection • Incident Prevention & Response • Find abnormal ‘PATTERN’ in data ( system log, traffic, application log, etc ) • Malware Detection • What I am going to show you today
  • 6. What is ML • Machine Learning • gives computers the ability to learn without being explicitly programmed • explores the study and construction of algorithms that can learn from and make predictions on data • It is just a way of drawing a line ( what ? how ? where ? )
  • 7. ML Process [ Gutierrez-Osuna (2011), PRISM ] ‘ FEATURE ’ is the key !
  • 8. So, How to learn? • Probability distribution • Correlation Analysis • Euclidean Distance • Entropy • Bayes Theorem • BackPropagation • ... http://www.hongyusu.com/programming/2015/10/10/novelty-detection/
  • 9. Adversarial Machine Learning • Adversarial Machine Learning is • a research field that lies at the intersection of ML and computer security • it aims not only to violate security, but also to compromise the learnability • Arms Race Problem • arms race between the adversary and the learner • ‘reactive’ ( security by obscurity ) / ‘proactive’ ( security by design )
  • 10. Adversarial Examples • EvadeML: forces a malicious PDF detector (ML) to make wrong predictions • https://github.com/uvasrg/EvadeML • AdversariaLib: algorithms focused on sklearn and neural networks • http://pralab.diee.unica.it/en/AdversariaLIb • Explaining and Harnessing Adversarial Examples • https://pdfs.semanticscholar.org/bee0/44c8e8903fb67523c1f8c105ab4718600cdb.pdf • Pwning Deep Learning Systems • https://www.slideshare.net/ClarenceChio/machine-duping-101-pwning-deep-learning-systems
  • 11. Examples: Spam Filtering • Spam Filtering with ‘WORD’ probability • It’s black boxed, but https://alexn.org/blog/2012/02/09/howto-build-naive-bayes-classifier.html
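As a rough illustration of the word-probability idea on slide 11, here is a minimal naive Bayes sketch using only the standard library. The training sentences, the `train` and `spam_probability` names, and the use of Laplace smoothing are all assumptions made for this example; they are not taken from the talk or from the linked blog post.

```python
import math
from collections import Counter

def train(docs):
    """Count words per class. docs is a list of (text, label), label in {"spam", "ham"}."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in docs:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts

def spam_probability(text, word_counts, doc_counts):
    """Return P(spam | text) under a naive Bayes model with Laplace smoothing."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the product.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        log_score[label] = score
    # Normalise in log space to avoid underflow on long messages.
    top = max(log_score.values())
    weights = {label: math.exp(value - top) for label, value in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

# Hypothetical toy corpus, for illustration only.
docs = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project report attached", "ham"),
]
word_counts, doc_counts = train(docs)
p_spam = spam_probability("free money", word_counts, doc_counts)
p_ham_like = spam_probability("project meeting", word_counts, doc_counts)
```

Because the score is a product of per-word probabilities, every word in a message shifts the result, which is why the per-word statistics matter so much for this kind of filter.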
  • 12. Attack Taxonomy

    |  | Causative (online learning), on training phase | Exploratory, on testing phase |
    | --- | --- | --- |
    | Targeted | Classifier is mis-trained on particular positive samples | Misclassifying a specific subset of positive samples |
    | Indiscriminate | Classifier is mis-trained generally on positive samples | Misclassifying positive samples generally |
  • 13. Attack Taxonomy • Targeted Exploratory Integrity Attack (TEIA) • It’s based on ‘Game Theory’: maximize the false negatives • condition: ‘the number of permitted queries is sufficiently large’ • but, can you understand this formula? * False Negative: a test result indicates that a condition failed, while it was actually successful (here, malware classified as benign)
  • 14. Attack Taxonomy • Intuition, rather than formula
  • 16. Adversary Knowledge Black box Zero Knowledge = only input and output 1 or 0
  • 17. Adversary Knowledge Testing / training Samples
  • 21. Adversary Knowledge Hyper-Parameters Training Tools Architecture Scores Testing / training Samples Features In the real world, none of them are available !
  • 22. Can you find a sniper ?!
  • 23. Adversarial Environment • build own features, parameters, models as many as possible • As if adversary has knowledge of ‘4 key factors’ (white-box) • Only validation process is done in black-box environment Learning Testing real world application validation repeat until complete mission !
  • 24. Malware Detection System √ virusshare √ malwr √ random √ malwr √ win default √ portable app √ pe header √ section info √ packer(yara) √ entropy √ n-gram √ image √ API √ Behavior virustotal check √ benign.csv √ malware.csv √ benign_images √ mal_images √ neural network √ svm √ random forest √ adaboost √ shuffle data √ cross-validation √ unseen sample • It will be uploaded to GitHub soon ! • ‘ Python + Scikit-Learn + Tensorflow ’
  • 25. Feature Extraction Metadata Code Pattern Static API API sequence Dynamic Function Type Image • Only focused on 32-bit PE malware
  • 26. Feature Extraction • Metadata • PE header + Section information • Total 68 features → Thanks to ClaMP (https://github.com/urwithajit9/ClaMP) • originally 69 features; the 69th is ‘packertype’ (one-hot encoding → 173 features)
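The jump from 69 to 173 features on slide 26 follows from one-hot encoding: the single ‘packertype’ column is replaced by one 0/1 column per packer label, so 68 numeric features plus 105 packer columns gives 173. A minimal sketch of that encoding follows; the three packer names and the `one_hot` helper are hypothetical, chosen only to keep the example short.

```python
def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of value in categories."""
    return [1 if value == category else 0 for category in categories]

# Hypothetical packer labels; the real dataset would need 105 of them
# for the feature count to reach 173.
packers = ["none", "UPX", "ASPack"]

# Placeholder for the 68 numeric PE header / section features.
base_features = [0.0] * 68

encoded = one_hot("UPX", packers)
vector = base_features + encoded

# Arithmetic behind the slide: 68 numeric features + 105 packer columns.
total_with_full_packer_set = 68 + 105
```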
  • 27. Feature Extraction • Code Pattern • extract code patterns from disassembled code ← ‘code’ section • using n-gram analysis from the text-mining area: 4-gram
    Disassembly: mov cx, count / mov dx, 13 / mov ah, 2 / int 21h / mov dl, 10 / mov ah, 2 / int 21h / loop first / mov ax, 4c00h / int 21h
    1-gram: 1: {mov, 6} 2: {int, 3} 3: {loop, 1}
    4-gram: 1: {mov mov mov int, 1} 2: {mov mov int mov, 1} 3: {mov int mov mov, 1} 4: {int mov mov int, 1} 5: {mov mov int loop, 1} 6: {mov int loop mov, 1} 7: {int loop mov int, 1}
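The n-gram counts on slide 27 can be reproduced with a short sliding-window sketch. The `opcode_ngrams` helper and the way opcodes are pulled from the listing (first token of each line) are assumptions for illustration; the talk does not show its own extraction code.

```python
from collections import Counter

def opcode_ngrams(opcodes, n):
    """Count every run of n consecutive opcodes using a sliding window."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))

# The disassembly listing from the slide.
listing = """
mov cx, count
mov dx, 13
mov ah, 2
int 21h
mov dl, 10
mov ah, 2
int 21h
loop first
mov ax, 4c00h
int 21h
"""

# Keep only the mnemonic (first token) of each instruction.
opcodes = [line.split()[0] for line in listing.strip().splitlines()]

unigrams = opcode_ngrams(opcodes, 1)
fourgrams = opcode_ngrams(opcodes, 4)

for gram, count in fourgrams.items():
    print(" ".join(gram), count)
```

Ten instructions give 10 - 4 + 1 = 7 windows of length four, which matches the seven 4-gram entries listed on the slide.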
  • 28. Feature Extraction • Image • convert the PE file into an image ( gray scale ) • file sizes differ, so image sizes differ → make thumbnail : 256 x 256
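One way to picture the image feature on slide 28: treat each byte of the file as one gray pixel (0 to 255), wrap the bytes into fixed-width rows, then shrink the result to a fixed size. The sketch below uses only the standard library; the row width, the zero padding, and the nearest-neighbour resize are assumptions, since the talk does not say how its thumbnails were produced.

```python
def bytes_to_grayscale(data, width=256):
    """Wrap raw bytes into rows of `width` pixels, padding the last row with zeros."""
    if not data:
        raise ValueError("cannot build an image from empty data")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows

def thumbnail(image, size=256):
    """Resize a 2D pixel list to size x size using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[(y * height) // size][(x * width) // size] for x in range(size)]
        for y in range(size)
    ]

# Stand-in for file contents; a real run would read the bytes of a PE file.
data = bytes(range(256)) * 4
image = bytes_to_grayscale(data, width=32)
thumb = thumbnail(image, size=8)
```

Fixing the output size is what lets files of very different lengths feed the same CNN input layer.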
  • 29. Modeling • Result • Using 10-Fold cross validation • 30000 malware samples / 4000 benign samples • Accuracy (80 / 20)

    | Features | n.feat | SVM | R.F | Ada | DNN | CNN |
    | --- | --- | --- | --- | --- | --- | --- |
    | PE | 68 | 91.3 % | 97.5 % | 95.7 % | 92.8 % | - |
    | PE + Packer | 173 | 91.8 % | 99.8 % | 99.8 % | 93.8 % | - |
    | N-gram | 10000 | 87.3 % | 99.9 % | 100 % | 100 % | - |
    | Image | 28 x 28 | - | - | - | - | 99.8 % |

    1024 deep x 4 layer • MY TARGET !
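For readers unfamiliar with the 10-fold cross validation mentioned on slide 29, the following sketch shows how the sample indices could be split so that every sample is used for testing exactly once. The `k_fold_indices` helper and the fixed seed are assumptions for illustration; in practice scikit-learn's own utilities would normally do this.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so folds mix both classes
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 30000 malware + 4000 benign samples, as stated on the slide.
n_samples = 30000 + 4000
splits = list(k_fold_indices(n_samples, k=10))
```

Each of the ten rounds trains on 30600 samples and tests on the remaining 3400, and the reported accuracy is typically the average over the rounds.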
  • 30. Attack Scenario ML model ML model mal & benign samples load adversarial module extract features get_proba + virustotal modeling
  • 31. Attack Process • Target : random forest and CNN (deep learning) model 1. Get probability of sample ( RandomForest ) 2. Get feature importance from random forest 3. Feature analysis ( divided into 4 classes ) 4. Overwrite output features and find critical point 5. Disguise a malware as a benign sample 6. Validation
  • 32. Predict_proba • scikit-learn provides predict_proba ← predicts class probabilities • adversary can estimate the impact of a modification using this function
  • 33. Feature Importance • using random forest, you can get the importance of every feature • there is no principal feature → the top-1 feature has only 12% importance • so, just the top 20 features are used for disguise
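The two scikit-learn facilities named on slides 32 and 33 look roughly like this in use. This sketch assumes scikit-learn is installed and trains on synthetic data from `make_classification`; the dataset, the hyper-parameters, and the choice of showing the top 5 features are placeholders, not the authors' setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a feature matrix; not the PE dataset from the talk.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# predict_proba returns one row per sample and one column per class;
# each row sums to 1.
proba = model.predict_proba(X_test)

# feature_importances_ holds one non-negative weight per feature, summing to 1.
importances = model.feature_importances_
top = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)[:5]

for index in top:
    print(f"feature {index}: importance {importances[index]:.3f}")
```

When no single weight dominates, as the slide reports for its PE features, the classifier's decision is spread across many inputs rather than hinging on one.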
  • 34. Feature Analysis • draw histograms and boxplots from all feature vectors • categorize features into four classes and compare them with the importance data • distribution almost same / different number of outliers → 9 / 18 • different distribution → 4 / 7 • similar distribution → 7 / 19 • almost same → 0 / 24
  • 35. Overwrite Headers • Just overwrite the feature array (benign → malware) by each class from the feature analysis • for a sample with 100 percent malware probability ( 0 : 100 ) • just one class – probability changed to 90 % ( 10 : 90 ) • two classes – probability does not change ( still 10 : 90 ) • three classes – probability dropped to 35% ( 65 : 35 ) ← bypass classifier ! Extract Features Predict proba [ 0.25 0.75 ] overwrite benign: [ 1 0 1 0 1 ] malware: [ 0 1 0 1 0 ] malware: [ 1 0 1 0 1 ] benign malware
  • 36. File modification • overwriting extracted features ← meaningless! • need to change the binary itself • ok to overwrite (39) – timestamp, checksum, characteristics, linker info, etc • need to respect specifications (5) – entropy of whole file, sections, entrypoint, filesize • After overwriting features from a benign sample into a malware sample ( 39 features ) • Probability dropped 15 % ( 0 : 100 → 15 : 85 ) • VirusTotal result : 38 → 32 ( what the ?! )
  • 37. File modification • I just wrote adversarial attack code for my own ML model, but ?! • decided to keep checking the virustotal report ☺
  • 38. File modification • Entropy is a measure of the unpredictability of a state, or equivalently, of its average information content • Entropy of a file or a specific section can be used as a feature for ML • It’s not a simple job to change the entropy of a binary
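The entropy feature described on slide 38 is usually computed as Shannon entropy over byte values, giving a number between 0 (all bytes identical) and 8 bits per byte (all 256 values equally likely). A minimal standard-library sketch follows; the `shannon_entropy` name and the sample inputs are assumptions for illustration.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

constant = shannon_entropy(b"\x00" * 100)       # a single repeated value
two_symbols = shannon_entropy(b"ab" * 50)       # two equally likely values
uniform = shannon_entropy(bytes(range(256)))    # every byte value once
```

Packed or encrypted sections tend toward the high end of this range, which is why entropy is a common static feature for malware classifiers.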
  • 39. File modification • fit malware’s entropy to benign sample • Diagram: PE layout ( DOS Header, NT Headers, Section Headers, .text, .data ) of the malware and the benign sample, with injected bytes used to fit code section, fit data section, fit whole file
  • 40. File modification • After changing both the 39 features and the entropy • VirusTotal detection dropped to 26 !
  • 41. File modification • Actually, I didn’t count the impact of the APIs the malware used • I was curious, so I packed the malware and ran the same test again ← detection rate dropped after simply packing the original file ← adversarial attack on packed file
  • 42. Model Validation • then, what about the ‘wannacry’ malware sample ?! • pick a random sample from my dataset and query virustotal • ok, let’s start ☺
  • 43. Model Validation • first step > pass the binary to the adversarial model ( benign: procexp.exe ) • second step > pass the binary (from first step) to my ML model • couldn’t bypass my ML model
  • 44. Model Validation • third step > upx packing and adversarial model • fourth step > query virustotal ( upx + adv ) ???? BYPASSED ML BASED AV ~ yeah~ ☺ still... malware
  • 45. Adversarial Deep Learning • If AV uses deep learning to classify malware • Candidate models • DNN – no different from other machine learning algorithms ( just a deep neural network ) • CNN – using binary image as features • RNN (api sequence) – using behavior analysis, extract api sequence info from execution • main idea – add garbage bytes to the end of the binary. That’s it ! random byte
  • 46. Summary • develop an adversarial model using just static features ( PE metadata ) • even building your own model → doesn’t tell you the exact answer • UPX can be used as a camouflage tool • extract as many features as you can → leads to a robust adversarial model • adversarial model can affect traditional AV software ( signature based )
  • 47. Future work • Expand Feature Vector – API, behavior information • Reinforcement Machine Learning Model – automatic adversarial attack • Virustotal Detection Rate ‘Zero’ • Develop adversarial testing framework for anti-virus software
  • 48. References • Can Machine Learning Be Secure? – Marco Barreno et al • Adversarial Machine Learning – J.D. Tygar • Adversarial and Secure Machine Learning – Huang Xiao • Adversarial Reinforcement Learning – William Uther et al • Adversarial Machine Learning – Ling Huang • Adversarial Examples in the physical world – Alexey Kurakin et al • Adversarial Examples in Machine Learning – Nicolas Papernot • Explaining and harnessing adversarial examples – Ian J. Goodfellow et al • Machine Learning in adversarial environments – Pavel Laskov et al
  • 49. Thank you any question? nababora@naver.com