Pattern Recognition and Applications Lab
University of Cagliari, Italy
Department of Electrical and Electronic Engineering
Is Feature Selection Secure
against Training Data Poisoning?
Huang Xiao (2), Battista Biggio (1), Gavin Brown (3), Giorgio Fumera (1), Claudia Eckert (2), Fabio Roli (1)
(1) Dept. of Electrical and Electronic Engineering, University of Cagliari, Italy
(2) Department of Computer Science, Technische Universität München, Germany
(3) School of Computer Science, University of Manchester, UK
Jul 6 - 11, 2015, ICML 2015
 
http://pralab.diee.unica.it
Motivation
•  Increasing number of services and apps available on the Internet
–  Improved user experience
•  Proliferation and sophistication of attacks and cyberthreats
–  Skilled / economically-motivated attackers
•  Several security systems use machine learning to detect attacks
–  but … is machine learning secure enough?
 
Is Feature Selection Secure?
•  Adversarial ML: security of learning and clustering algorithms
–  Barreno et al., 2006; Huang et al., 2011; Biggio et al., 2014; 2012; 2013a;
Brueckner et al., 2012; Globerson & Roweis, 2006
•  Feature Selection
–  High-dimensional feature spaces (e.g., spam and malware detection)
–  Dimensionality reduction to improve interpretability and generalization
•  How about the security of feature selection?
[Figure: feature selection reduces the original features x1, x2, ..., xd to the selected subset x(1), x(2), ..., x(k)]
 
Feature Selection under Attack
Attacker Model
•  Goal of the attack
•  Knowledge of the attacked system
•  Capability of manipulating data
•  Attack strategy
[Figure labels: PD(X,Y)?, f(x)]
 
Attacker’s Goal
•  Integrity Violation: to perform malicious activities without
compromising normal system operation
–  enforcing selection of features to facilitate evasion at test time
•  Availability Violation: to compromise normal system operation
–  enforcing selection of features to maximize generalization error
•  Privacy Violation: gaining confidential information on system users
–  reverse-engineering feature selection to get confidential information
[Figure: security violation categories: Integrity, Availability, Privacy]
 
Attacker’s Knowledge
•  Perfect knowledge
–  upper bound on performance degradation under attack
•  Limited knowledge
–  attack on surrogate data sampled from same distribution
[Figure: system components the attacker may know: training data, feature representation, feature selection algorithm (from x1, x2, ..., xd to x(1), x(2), ..., x(k))]
 
Attacker’s Capability
•  Inject points into the training data
•  Constraints on data manipulation
–  Fraction of the training data under the attacker’s control
–  Application-specific constraints
•  Example on PDF data
–  PDF file: hierarchy of interconnected objects
–  Objects can be added but not easily removed without compromising the file structure
13 0 obj
<< /Kids [ 1 0 R 11 0 R ]
/Type /Page
... >> endobj
17 0 obj
<< /Type /Encoding
/Differences [ 0 /C0032 ] >>
endobj
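As a rough illustration of the "objects can be added but not removed" constraint, the sketch below projects a candidate attack point back onto a feasible set in which each keyword count may only grow from its starting value. The function name `project_increment_only`, the upper bound `max_count`, and the toy vectors are assumptions made for illustration; they do not come from the slides.

```python
def project_increment_only(candidate, original, max_count=100):
    """Clip a candidate feature vector so no count drops below its original value.

    Hypothetical helper: `max_count` is an assumed per-feature upper bound,
    and every original count is assumed to be at most `max_count`.
    """
    if len(candidate) != len(original):
        raise ValueError("vectors must have the same length")
    projected = []
    for cand, orig in zip(candidate, original):
        # Removal is not allowed: never go below the original count.
        value = max(cand, orig)
        # Keep counts within the assumed upper bound.
        value = min(value, max_count)
        projected.append(value)
    return projected


original = [2, 1, 1]       # e.g. counts of /Type, /Page, /Encoding
candidate = [5, 0, 250]    # an unconstrained update proposed by the attacker
feasible = project_increment_only(candidate, original)
```

Here the second feature is pushed back up to its original count and the third is capped at the assumed bound, while the first (a pure increment) is left unchanged.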
  
 
Attack Scenarios
•  Different potential attack scenarios depending on assumptions
on the attacker’s goal, knowledge, capability
–  Details and examples in the paper
•  Poisoning Availability Attacks
Enforcing selection of features to maximize generalization error
–  Goal: availability violation
–  Knowledge: perfect / limited
–  Capability: injecting samples into the training data
 
Embedded Feature Selection Algorithms
•  Linear models
–  Select features according to |w|
–  LASSO (Tibshirani, 1996)
–  Ridge Regression (Hoerl & Kennard, 1970)
–  Elastic Net (Zou & Hastie, 2005)
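Selecting features according to |w| can be sketched as follows, assuming a weight vector has already been learned by one of the linear models above. The helper name `select_top_k` and the toy weights are illustrative assumptions, not values from the experiments.

```python
def select_top_k(weights, k):
    """Return the indices of the k weights with largest absolute value, sorted."""
    if not 0 <= k <= len(weights):
        raise ValueError("k must be between 0 and the number of features")
    # Rank feature indices by decreasing |w|; ties keep the lower index first.
    ranked = sorted(range(len(weights)), key=lambda j: (-abs(weights[j]), j))
    return sorted(ranked[:k])


w = [0.0, -1.5, 0.2, 0.0, 0.9]   # toy weight vector (assumed)
selected = select_top_k(w, k=2)
```

With a sparse model such as LASSO, many weights are exactly zero, so the ranking mostly separates the non-zero weights from the rest.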
 
Poisoning Embedded Feature Selection
•  Attacker’s objective
–  to maximize generalization error on untainted data w.r.t. choice of the attack point
–  Loss estimated on surrogate data (excluding the attack point)
–  Algorithm is trained on surrogate data (including the attack point)
•  Solution: subgradient-ascent technique
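The objective described above can be written as a bilevel problem. The following is a sketch in assumed notation (surrogate data with m points, attack point x_c with label y_c, loss ℓ, training objective L); the exact objective in the paper may contain additional terms.

```latex
% Outer problem: loss on the surrogate data, excluding the attack point
\max_{x_c} \; W(x_c) = \frac{1}{m} \sum_{j=1}^{m} \ell\big(\hat{y}_j, f_{x_c}(\hat{x}_j)\big)
% Inner problem: the algorithm is trained on the surrogate data plus the attack point
\quad \text{s.t.} \quad
f_{x_c} \in \arg\min_{f} \; \mathcal{L}\big(f;\, \hat{\mathcal{D}} \cup \{(x_c, y_c)\}\big),
\qquad \hat{\mathcal{D}} = \{(\hat{x}_j, \hat{y}_j)\}_{j=1}^{m}
```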
 
Gradient Computation
•  KKT conditions
•  How does the solution change w.r.t. xc?
•  Subgradient is unique at the optimal solution!
 
Gradient Computation
•  We require the KKT conditions to hold under perturbation of xc
Gradient is now uniquely determined
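One way to see why requiring the KKT conditions to hold under perturbation pins down the gradient is the following simplified derivation for LASSO without a bias term. The notation is assumed here rather than taken from the slides: X stacks the n training points (including x_c) as rows, y holds the labels (including y_c), v is a subgradient of the L1 norm, and all quantities are restricted to the active features (those with non-zero weight).

```latex
% Stationarity (KKT) condition, with v \in \partial \|w\|_1:
\frac{1}{n} X^\top (X w - y) + \lambda v = 0
% Assume v stays fixed under a small perturbation of x_c, and note that
% X^\top X = \sum_i x_i x_i^\top + x_c x_c^\top.
% Differentiating the condition with respect to x_c gives a linear system:
X^\top X \, \frac{\partial w}{\partial x_c}
  = -\Big[ (x_c^\top w - y_c)\, I + x_c w^\top \Big]
```

When X^T X is invertible on the active features, this system has a single solution for the Jacobian of w with respect to x_c, which is what a gradient-ascent step on the attack point needs.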
 
Poisoning Attack Algorithm
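As a rough sketch of what a poisoning loop of this kind can look like, the code below runs projected gradient ascent on a toy 1-D ridge regressor. It is not the authors' algorithm: a finite-difference gradient is swapped in for the KKT-based subgradient, and the learner, data, bounds, step size, and all function names are assumptions chosen to keep the example self-contained.

```python
def fit_ridge_1d(points, lam):
    """Closed-form 1-D ridge regression without bias: w = sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / (sxx + lam)


def validation_loss(w, points):
    """Mean squared error of the model y = w * x on untainted points."""
    return sum((w * x - y) ** 2 for x, y in points) / len(points)


def attacker_objective(x_c, y_c, train, valid, lam):
    """Loss on clean validation data after training on train plus the attack point."""
    w = fit_ridge_1d(train + [(x_c, y_c)], lam)
    return validation_loss(w, valid)


def poison_by_gradient_ascent(x_c, y_c, train, valid, lam,
                              bounds=(0.0, 5.0), step=0.5, iters=100, eps=1e-5):
    """Move x_c uphill on the attacker's objective, staying inside `bounds`."""
    lo, hi = bounds
    for _ in range(iters):
        # Central finite difference stands in for the analytic (sub)gradient.
        grad = (attacker_objective(x_c + eps, y_c, train, valid, lam)
                - attacker_objective(x_c - eps, y_c, train, valid, lam)) / (2 * eps)
        # Ascent step followed by projection onto the feasible interval.
        x_c = min(hi, max(lo, x_c + step * grad))
    return x_c


train = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]   # toy data with true slope 1
valid = [(1.5, 1.5), (2.5, 2.5)]               # untainted validation points
x_start, y_c, lam = 1.0, -1.0, 0.1
x_poison = poison_by_gradient_ascent(x_start, y_c, train, valid, lam)
loss_before = attacker_objective(x_start, y_c, train, valid, lam)
loss_after = attacker_objective(x_poison, y_c, train, valid, lam)
```

The projection step is where application-specific constraints on the attack point would enter; here it is a plain interval clip.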
 
Experiments on PDF Malware Detection
•  PDF: hierarchy of interconnected objects (keyword/value pairs)
•  Learner’s task: to classify benign vs malware PDF files
•  Attacker’s task: to maximize classification error by injecting
poisoning attack samples
–  Only feature increments are considered (object insertion)
•  Object removal may compromise the PDF file
Features: keyword counts
13 0 obj
<< /Kids [ 1 0 R 11 0 R ]
/Type /Page
... >> endobj
17 0 obj
<< /Type /Encoding
/Differences [ 0 /C0032 ] >>
endobj
/Type 2
/Page 1
/Encoding 1
…
Maiorca et al., 2012; 2013; Smutz & Stavrou, 2012; Srndic & Laskov, 2013
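A keyword-count feature vector of this kind could be extracted roughly as follows. This is a minimal sketch that assumes a keyword is simply a slash followed by letters or digits; real PDF name syntax is richer, and the feature extractors in the cited works may differ. The function name `keyword_counts` is hypothetical.

```python
import re
from collections import Counter


def keyword_counts(pdf_text):
    """Count name tokens such as /Type or /Page in raw PDF object text."""
    # Assumed simplification: a keyword is a slash followed by letters or digits.
    return Counter(re.findall(r"/[A-Za-z0-9]+", pdf_text))


sample = """13 0 obj
<< /Kids [ 1 0 R 11 0 R ]
/Type /Page
... >> endobj
17 0 obj
<< /Type /Encoding
/Differences [ 0 /C0032 ] >>
endobj"""

counts = keyword_counts(sample)
```

On the two objects shown on the slide this yields /Type twice and /Page and /Encoding once each, matching the counts listed there. Inserting an object can only increase these counts, which is why the attack is restricted to feature increments.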
 
Experimental Results
Perfect Knowledge
Data: 300 training (TR) and 5,000 test (TS) samples, 114 features
Similar results obtained for limited-knowledge attacks!
 
Experimental Results
Perfect Knowledge
A: selected features in the absence of attack
B: selected features under attack
k: number of features selected out of d
r: number of common features between the two sets
Kuncheva et al., 2007
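The symbols above are the ingredients of Kuncheva's consistency index, I(A, B) = (r·d − k²) / (k·(d − k)), which equals 1 for identical subsets and is close to 0 for subsets that overlap only by chance. Assuming this is the stability measure the citation refers to, it can be computed as in the sketch below; the function name and the toy subsets are illustrative.

```python
def kuncheva_index(a, b, d):
    """Kuncheva's consistency index between two equal-size feature subsets."""
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k:
        raise ValueError("both subsets must contain the same number of features")
    if not 0 < k < d:
        raise ValueError("the index is defined only for 0 < k < d")
    r = len(a & b)  # number of features common to both subsets
    return (r * d - k * k) / (k * (d - k))


same = kuncheva_index([0, 1, 2], [0, 1, 2], d=10)       # identical subsets
disjoint = kuncheva_index([0, 1, 2], [3, 4, 5], d=10)   # no common features
```

A drop in this index under attack indicates that poisoning has changed which features are selected, not only how well the classifier performs.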
 
Conclusions and Future Work
•  Framework for security evaluation of feature selection under attack
–  Poisoning attacks against embedded feature selection algorithms
•  Poisoning can significantly affect feature selection
–  LASSO significantly vulnerable to poisoning attacks
•  Future research directions
–  Error bounds on the impact of poisoning on learning algorithms
–  Secure / robust feature selection algorithms
L1 regularization: stability against random noise,
but not against adversarial (worst-case) noise?
 
Any questions?
Thanks for your attention!
 
Experimental Results
[Figure panels: Perfect Knowledge, Limited Knowledge]

Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 

Recently uploaded (20)

Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 

Is Feature Selection Secure against Training Data Poisoning?

  • 1. Pattern Recognition and Applications Lab, University of Cagliari, Italy, Department of Electrical and Electronic Engineering
      Is Feature Selection Secure against Training Data Poisoning?
      Huang Xiao (2), Battista Biggio (1), Gavin Brown (3), Giorgio Fumera (1), Claudia Eckert (2), Fabio Roli (1)
      (1) Dept. of Electrical and Electronic Engineering, University of Cagliari, Italy
      (2) Department of Computer Science, Technische Universität München, Germany
      (3) School of Computer Science, University of Manchester, UK
      ICML 2015, Jul 6-11, 2015
  • 2. Motivation (http://pralab.diee.unica.it)
      • Increasing number of services and apps available on the Internet
          – Improved user experience
      • Proliferation and sophistication of attacks and cyberthreats
          – Skilled / economically-motivated attackers
      • Several security systems use machine learning to detect attacks
          – but … is machine learning secure enough?
  • 3. Is Feature Selection Secure?
      • Adversarial ML: security of learning and clustering algorithms
          – Barreno et al., 2006; Huang et al., 2011; Biggio et al., 2014; 2012; 2013a; Brueckner et al., 2012; Globerson & Roweis, 2006
      • Feature Selection
          – High-dimensional feature spaces (e.g., spam and malware detection)
          – Dimensionality reduction to improve interpretability and generalization
      • How about the security of feature selection?
      [Figure: d original features x1, x2, ..., xd reduced to k selected features x(1), x(2), ..., x(k)]
  • 4. Feature Selection under Attack
      Attacker Model
      • Goal of the attack
      • Knowledge of the attacked system
      • Capability of manipulating data
      • Attack strategy
      [Figure labels: PD(X,Y)?, f(x)]
  • 5. Attacker’s Goal
      • Integrity Violation: to perform malicious activities without compromising normal system operation
          – enforcing selection of features to facilitate evasion at test time
      • Availability Violation: to compromise normal system operation
          – enforcing selection of features to maximize generalization error
      • Privacy Violation: gaining confidential information on system users
          – reverse-engineering feature selection to get confidential information
      [Figure: security violation types: integrity, availability, privacy]
  • 6. Attacker’s Knowledge
      • Perfect knowledge
          – upper bound on performance degradation under attack
      • Limited knowledge
          – attack on surrogate data sampled from same distribution
      [Figure: system components the attacker may know: training data, feature representation (x1, x2, ..., xd), feature selection algorithm (x(1), x(2), ..., x(k))]
  • 7. Attacker’s Capability
      • Inject points into the training data
      • Constraints on data manipulation
          – Fraction of the training data under the attacker’s control
          – Application-specific constraints
      • Example on PDF data
          – PDF file: hierarchy of interconnected objects
          – Objects can be added but not easily removed without compromising the file structure
      PDF object example:
          13 0 obj
          << /Kids [ 1 0 R 11 0 R ]
          /Type /Page
          ... >> endobj
          17 0 obj
          << /Type /Encoding
          /Differences [ 0 /C0032 ] >>
          endobj
  • 8. Attack Scenarios
      • Different potential attack scenarios depending on assumptions on the attacker’s goal, knowledge, capability
          – Details and examples in the paper
      • Poisoning Availability Attacks: enforcing selection of features to maximize generalization error
          – Goal: availability violation
          – Knowledge: perfect / limited
          – Capability: injecting samples into the training data
  • 9. Embedded Feature Selection Algorithms
      • Linear models
          – Select features according to |w|
      • Algorithms considered
          – LASSO (Tibshirani, 1996)
          – Ridge Regression (Hoerl & Kennard, 1970)
          – Elastic Net (Zou & Hastie, 2005)
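The rule "select features according to |w|" can be illustrated with a minimal sketch. The weight vector below is made up for illustration, and `select_top_k` is a hypothetical helper name, not something taken from the slides or the paper; in practice the weights would come from a trained LASSO, ridge or elastic net model.

```python
def select_top_k(weights, k):
    """Return the indices of the k features with the largest |w| (ascending order)."""
    # Rank feature indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return sorted(ranked[:k])


# Hypothetical weights of a linear model with d = 6 features.
w = [0.0, -1.7, 0.3, 0.0, 2.4, -0.05]
selected = select_top_k(w, 3)
print(selected)
```

With a sparse model such as LASSO many weights are exactly zero, so the ranking mostly separates used from unused features; with ridge regression all weights are typically non-zero and the cut-off k does the selecting.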
  • 10. Poisoning Embedded Feature Selection
      • Attacker’s objective
          – to maximize generalization error on untainted data
      • Solution: subgradient-ascent technique
      [Equation annotations: loss estimated on surrogate data (excluding the attack point); algorithm trained on surrogate data (including the attack point); maximization w.r.t. the choice of the attack point]
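The equation on this slide did not survive extraction. Based only on its annotations, the attacker's problem can be sketched as a bilevel optimization. The notation here is an assumption: surrogate data D̂ with n training and m evaluation samples, loss ℓ, regularizer Ω with trade-off λ, and attack point (x_c, y_c). The paper's exact objective may contain additional terms.

```latex
% Outer problem: the attacker chooses the attack point x_c to maximize
% the loss on untainted surrogate data (attack point excluded).
\max_{x_c} \; \mathcal{W}(x_c)
  = \frac{1}{m} \sum_{j=1}^{m} \ell\big(\hat{y}_j, f_{\hat{w}}(\hat{x}_j)\big)

% Inner problem: the learner is trained on the surrogate data
% including the attack point.
\text{s.t.} \quad \hat{w} \in \arg\min_{w} \;
  \frac{1}{n+1} \sum_{(x_i, y_i) \in \hat{\mathcal{D}} \cup \{(x_c, y_c)\}}
  \ell\big(y_i, f_w(x_i)\big) + \lambda \, \Omega(w)

% Common choices of the regularizer (rho is an assumed mixing parameter):
% LASSO:       \Omega(w) = \|w\|_1
% Ridge:       \Omega(w) = \tfrac{1}{2} \|w\|_2^2
% Elastic net: \Omega(w) = \rho \|w\|_1 + (1 - \rho) \tfrac{1}{2} \|w\|_2^2
```

Because the learned weights depend on x_c through the inner problem, ascending this objective requires knowing how the solution changes with x_c.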
  • 11. Gradient Computation
      • KKT conditions
      • How does the solution change w.r.t. xc?
      • Subgradient is unique at the optimal solution!
  • 12. Gradient Computation
      • We require the KKT conditions to hold under perturbation of xc
      • Gradient is now uniquely determined
  • 14. Experiments on PDF Malware Detection
      • PDF: hierarchy of interconnected objects (keyword/value pairs)
      • Learner’s task: to classify benign vs malware PDF files
      • Attacker’s task: to maximize classification error by injecting poisoning attack samples
          – Only feature increments are considered (object insertion)
          – Object removal may compromise the PDF file
      • Features: keyword counts (e.g., /Type 2, /Page 1, /Encoding 1, …)
      PDF object example:
          13 0 obj
          << /Kids [ 1 0 R 11 0 R ]
          /Type /Page
          ... >> endobj
          17 0 obj
          << /Type /Encoding
          /Differences [ 0 /C0032 ] >>
          endobj
      Maiorca et al., 2012; 2013; Smutz & Stavrou, 2012; Srndic & Laskov, 2013
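The keyword-count features can be reproduced on the slide's PDF snippet with a short sketch. The regular expression is a simplification chosen for illustration (real extractors parse the PDF object structure), the elided "..." in the first object is left out, and `keyword_counts` and `inject_keywords` are hypothetical helper names.

```python
import re
from collections import Counter

# The two PDF objects shown on the slide (the elided "..." is omitted).
PDF_SNIPPET = """
13 0 obj
<< /Kids [ 1 0 R 11 0 R ]
/Type /Page
>> endobj
17 0 obj
<< /Type /Encoding
/Differences [ 0 /C0032 ] >>
endobj
"""


def keyword_counts(pdf_text):
    """Count occurrences of each /Name keyword (simplified tokenization)."""
    return Counter(re.findall(r"/[A-Za-z0-9]+", pdf_text))


def inject_keywords(counts, additions):
    """Model object insertion: feature values may only be incremented."""
    if any(n < 0 for n in additions.values()):
        raise ValueError("keyword counts can only be increased")
    poisoned = Counter(counts)
    poisoned.update(additions)
    return poisoned


counts = keyword_counts(PDF_SNIPPET)
print(counts["/Type"], counts["/Page"], counts["/Encoding"])
poisoned = inject_keywords(counts, {"/Encoding": 3})
print(poisoned["/Encoding"])
```

The increment-only check mirrors the constraint stated on the slide: objects can be inserted, but removing them may break the file.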
  • 15. Experimental Results (perfect knowledge)
      • Data: 300 training (TR) and 5,000 test (TS) samples, 114 features
      • Similar results obtained for limited-knowledge attacks!
  • 16. Experimental Results (perfect knowledge)
      • A: selected features in the absence of attack
      • B: selected features under attack
      • k: number of features selected out of d
      • r: common features between the two sets
      Kuncheva et al., 2007
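The quantities A, B, k, d and r defined on this slide are the inputs of the consistency index usually attributed to Kuncheva (2007), I(A, B) = (r·d − k²) / (k·(d − k)). The sketch below assumes that this is the stability measure plotted on the slide; the function name and the example sets are illustrative only.

```python
def kuncheva_index(a, b, d):
    """Consistency index between two feature subsets of equal size k out of d features."""
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k:
        raise ValueError("both subsets must contain the same number of features")
    if not 0 < k < d:
        raise ValueError("requires 0 < k < d")
    r = len(a & b)  # number of features common to both subsets
    return (r * d - k * k) / (k * (d - k))


# Hypothetical example: d = 10 features, k = 4 selected with and without attack.
clean = {0, 1, 2, 3}
attacked = {2, 3, 4, 5}
print(kuncheva_index(clean, attacked, 10))
```

The index equals 1 when the two subsets coincide and is close to 0 when the overlap is what random selection would be expected to produce, so lower values under attack indicate that poisoning changed the selected features.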
  • 17. Conclusions and Future Work
      • Framework for security evaluation of feature selection under attack
          – Poisoning attacks against embedded feature selection algorithms
      • Poisoning can significantly affect feature selection
          – LASSO significantly vulnerable to poisoning attacks
      • Future research directions
          – Error bounds on the impact of poisoning on learning algorithms
          – Secure / robust feature selection algorithms
      • Open question: L1 regularization gives stability against random noise, but not against adversarial (worst-case) noise?