Detecting
Adversarial Examples
2021/07/27
References
• [1] 2017, cited by 856. Xu, Weilin, David Evans, and Yanjun Qi. "Feature squeezing: Detecting adversarial examples in deep neural networks." arXiv preprint arXiv:1704.01155 (2017).
• [2] 2017, cited by 1079. Carlini, Nicholas, and David Wagner. "Adversarial examples are not easily detected: Bypassing ten detection methods." Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. 2017.
• [3] 2017, cited by 269. He, Warren, et al. "Adversarial example defense: Ensembles of weak defenses are not strong." 11th USENIX Workshop on Offensive Technologies (WOOT 17). 2017.
• [4] 2017, cited by 440. Grosse, Kathrin, et al. "On the (statistical) detection of adversarial examples." arXiv preprint arXiv:1702.06280 (2017).
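Countermeasures against adversarial examples [1:Sec2:C, 2:Sec3]
1. Adversarial Training
Generate adversarial images and add them to the training data.
good: the most successful approach so far ([1:Sec4:B]);
the DNN model can be used as is (only the training procedure changes)
bad: generating adversarial images is expensive; the result depends on the generation method
2. Gradient Masking
Push the gradients of the DNN model close to 0.
good: neutralizes gradient-based attacks
bad: model accuracy degrades; attacks that get around it exist ([1:Sec2:C])
3. Input Transformation
Transform the input image.
1. PCA, dimensionality reduction, etc.
good: (expected to) reduce the model's sensitivity to noise
bad: PCA destroys the spatial structure of the image, so CNN models cannot be used
→ The following transformations, which preserve spatial structure, have been proposed:
1. Autoencoder
2. Squeezing (smoothing)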
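Item 1 (adversarial training) can be sketched as follows. This is only an illustration, not the procedure of [1] or [2]: it assumes a toy binary logistic-regression model with made-up weights `w`, `b`, and uses a single FGSM step as the generation method. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against a logistic-regression model (assumed toy model).

    x: (n, d) inputs in [0, 1], y: (n,) labels in {0, 1}.
    """
    p = sigmoid(x @ w + b)
    # Gradient of the cross-entropy loss with respect to the input.
    grad_x = (p - y)[:, None] * w[None, :]
    # Move each pixel by eps in the direction that increases the loss.
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def augment_with_adversarial(x, y, w, b, eps=0.1):
    """Return the training set extended with adversarial copies (same labels)."""
    x_adv = fgsm(x, y, w, b, eps)
    return np.concatenate([x, x_adv]), np.concatenate([y, y])

rng = np.random.default_rng(0)
x_train = rng.random((8, 4))                       # 8 hypothetical 4-pixel images
y_train = (x_train.sum(axis=1) > 2.0).astype(float)
w, b = rng.normal(size=4), 0.0                     # made-up model parameters
x_aug, y_aug = augment_with_adversarial(x_train, y_train, w, b)
```

The extra cost noted under "bad" shows up here as one additional gradient computation per training image, and the augmented set only covers the attack that generated it.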
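Countermeasures against adversarial examples [1:Sec2:D, 2:Sec3]
4. Detection
Decide whether the input image is adversarial in the first place.
1. Statistical testing
Test a group of pure images against a group that contains adv images.
bad: not effective; the test is difficult when the noise is small
2. Building a detector [2:Sec3.1]
Classify whether an image is adversarial. Variations include:
1. Build a detector that takes the outputs of the DNN model's intermediate layers as input
2. Add an adversarial class to the final layer [2:Sec3.1]
bad: training needs a large number of adversarial images; the result depends on the generation method;
detectors that take the raw input image as input have poor accuracy
3. Detecting prediction inconsistency ([1] focuses on this)
Decide from the outputs of multiple models.
Empirically, the outputs of multiple models reportedly agree less often on adv images than on pure images.
1. Use dropout to create pseudo-multiple models and compare their outputs
2. Apply several preprocessing steps to the input image and compare the outputs
bad: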
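Feature Squeezing [1:Sec3]
Image preprocessing: color depth reduction and smoothing
1. Color-based smoothing
Reduce the number of bits of color information per pixel
(Fig.2, Fig.3 middle row)
2. Position-based smoothing
Smooth each pixel using its neighboring pixels (Fig.3 bottom row)
(the image is not downscaled in this step)
Humans can still recognize the image even with fewer bits.
Smoothing experiment
→ Result of smoothing the attacked images before feeding them to the model:
on MNIST the Binary Filter neutralized 83% of the attacks, and
on CIFAR-10 Median Smoothing neutralized 87% of the attacks
(best results listed) [1:Table3]
→ This means smoothing removed the effect of the noise.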
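A minimal NumPy sketch of the two squeezers, assuming images scaled to [0, 1]. The function names are made up for illustration, and the median filter assumes an odd window size `k` and a single-channel image.

```python
import numpy as np

def reduce_bit_depth(img, bits):
    """Quantize an image in [0, 1] to 2**bits levels per channel."""
    levels = 2 ** bits - 1
    return np.round(img * levels) / levels

def median_smooth(img, k=3):
    """k x k median filter on a 2-D image (k odd); output size equals input size."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    # Collect every k x k neighbourhood as a sliding window, then take its median.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return np.median(windows, axis=(-2, -1))

img = np.array([[0.1, 0.9, 0.1],
                [0.1, 0.1, 0.1],
                [0.1, 0.1, 0.6]])
binary = reduce_bit_depth(img, bits=1)   # the 1-bit "Binary Filter" case
smoothed = median_smooth(img, k=3)       # isolated bright pixels are suppressed
```

Padding before filtering is what keeps the image from shrinking, matching the note that the image is not downscaled.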
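Feature Squeezing [1:Sec3]
Integration into the model
The following three setups are compared:
1. Binary Filter simply applied in front of the pre-trained model
2. Adversarial Training
3. Composed (Binary Filter + Adversarial Training)
The Binary Filter alone neutralizes attacks well enough (effective against $L_2$ and $L_\infty$ attacks, not effective against $L_0$ attacks).
Adversarial Training is costly, whereas the filter has the advantage of costing nothing.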
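Feature Squeezing [1:Sec3]
Aside: future directions
The authors mention the following as promising further squeezing methods:
1. Lossy compression (e.g. JPEG)
2. Dimensionality reduction (e.g. projection onto a feature space)
1. For example, in face recognition many pixels are irrelevant, and
projecting onto a feature space known as eigenfaces (→ then reconstructing)
can mitigate the noise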
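The projection (→ reconstruction) idea can be sketched with plain PCA. This is a toy illustration on synthetic data, not an experiment from [1]: the "faces" are random vectors that lie in a 2-dimensional subspace, and all names are hypothetical.

```python
import numpy as np

def fit_subspace(train, n_components):
    """Return the mean and the top principal directions of flattened images."""
    mean = train.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def project_and_restore(x, mean, components):
    """Project x onto the subspace, then map it back to pixel space."""
    coords = (x - mean) @ components.T
    return mean + coords @ components

rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 16))              # hypothetical 2-D image structure
train = rng.normal(size=(50, 2)) @ basis      # 50 flattened 4x4 "images"
mean, comps = fit_subspace(train, n_components=2)

clean = train[0]
noisy = clean + 0.05 * rng.normal(size=16)    # small perturbation in pixel space
restored = project_and_restore(noisy, mean, comps)
```

Only the part of the perturbation that lies inside the subspace survives the round trip, so `restored` ends up closer to `clean` than `noisy` is.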
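Detecting Adversarial Input [1:Sec5]
Use squeezing to detect adversarial images.
pure image → model
pure image → squeezing → model
⇒ the two outputs are similar
adv image → model
adv image → squeezing → model
⇒ the two outputs are not similar (the squeezed one should resemble the model output for the pure image)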
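Detecting Adversarial Input [1:Sec5]
$g(X)$: model output for image $X$
$X_{\mathrm{squeezed}}$: image $X$ after smoothing
Detection metric: $\mathrm{score} = \lVert g(X) - g(X_{\mathrm{squeezed}}) \rVert_1$
Prepare several smoothing methods and
take the max of their scores.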
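A minimal sketch of the score and its max over several squeezers, assuming the L1 distance between the two prediction vectors (the choice made in [1]). The classifier `toy_model`, the squeezers, and the threshold value are hypothetical stand-ins.

```python
import numpy as np

def toy_model(x):
    """Hypothetical 2-class softmax classifier g(X) on a flat image vector."""
    logits = np.array([x.sum(), x.size - x.sum()])
    e = np.exp(logits - logits.max())
    return e / e.sum()

def bit_depth(bits):
    """Return a squeezer that quantizes values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return lambda x: np.round(x * levels) / levels

def l1_score(model, x, squeezer):
    """L1 distance between the predictions on x and on its squeezed version."""
    return np.abs(model(x) - model(squeezer(x))).sum()

def joint_score(model, x, squeezers):
    """Maximum score over all squeezers."""
    return max(l1_score(model, x, s) for s in squeezers)

def is_adversarial(model, x, squeezers, t):
    return joint_score(model, x, squeezers) > t

squeezers = [bit_depth(1), bit_depth(2)]
pure = np.array([0.0, 0.0, 1.0, 1.0])
adv = np.array([0.4, 0.4, 1.0, 1.0])     # hand-made perturbed input
t = 0.5                                   # made-up threshold
```

Here `pure` is unchanged by both squeezers, so its score is 0, while the perturbation in `adv` is removed by the 1-bit squeezer and the prediction shifts noticeably.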
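• All that remains is to choose the score threshold $t$ (score > $t$: adversarial image; otherwise: pure)
→ ordinary learning
• Evaluated by the detection rate on attack images generated under the assumption that the attacker does not know about squeezing
→ Detection does not work well for ImageNet + FGSM (reason unknown [1:Sec5:C]), but is generally good otherwise [1:Table4]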
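Choosing $t$ can be sketched as a one-dimensional search over labeled scores. How [1] actually selects the threshold is not stated on this slide, so the accuracy criterion below is an assumption, and the score values are invented for illustration.

```python
import numpy as np

def choose_threshold(pure_scores, adv_scores):
    """Pick the threshold t that maximizes detection accuracy on labeled scores."""
    pure_scores = np.asarray(pure_scores, dtype=float)
    adv_scores = np.asarray(adv_scores, dtype=float)
    candidates = np.unique(np.concatenate([pure_scores, adv_scores]))
    best_t, best_acc = None, -1.0
    for t in candidates:
        # Rule from the slide: score > t means adversarial, otherwise pure.
        correct = (adv_scores > t).sum() + (pure_scores <= t).sum()
        acc = correct / (len(pure_scores) + len(adv_scores))
        if acc > best_acc:        # ties keep the smaller threshold
            best_t, best_acc = t, acc
    return best_t, best_acc

pure = [0.01, 0.05, 0.10, 0.30]   # made-up scores of pure images
adv = [0.25, 0.80, 1.20, 1.50]    # made-up scores of adv images
t, acc = choose_threshold(pure, adv)
```

In practice one might instead fix a false-positive budget on pure images and take the corresponding quantile; either way only pure and adv scores from a training split are needed.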
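Detecting Adversarial Input [1:Sec5, 3:Sec2.4]
Is there an effective attack when the attacker knows about the defense?
[3] runs experiments with a simple attack.
$x$: original image, $x'$: attack image
$J(F_\theta(x'), y)$: score of the undefended classifier for label $y$
Procedure [3:Sec3.1.1]
Search for adv images against the undefended model, hoping to hit one
that is also adversarial with the defense in place (non-deterministic).
1. Generate an initial point
2. Move in the gradient direction
3. Check whether the current point is an adv image with the defense in place
4. After a few moves, reset the initial point
→ On both MNIST and CIFAR-10, adv images that work with the defense in place were found (figure on the right)
← [1:Sec5:D] points out that the attack takes a long time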
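The four steps can be sketched as a random-restart gradient search. This is not the exact algorithm of [3]: the linear classifier, the binarization-based detector, and the parameters `eps`, `step`, `steps_per_restart`, `restarts` are all assumptions chosen to keep the example small.

```python
import numpy as np

def adaptive_search(x, y, grad_loss, is_adv_with_defense, eps=0.3, step=0.05,
                    steps_per_restart=10, restarts=20, seed=0):
    """Randomized search for an input that is adversarial even with the defense.

    grad_loss(x_adv, y): gradient of the undefended classifier's loss w.r.t. x_adv.
    is_adv_with_defense(x_adv, y): True if x_adv fools the classifier and evades the defense.
    Returns the first such point found, or None.
    """
    rng = np.random.default_rng(seed)
    for _ in range(restarts):
        # 1. Initial point: random perturbation inside the L-infinity ball of radius eps.
        x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
        for _ in range(steps_per_restart):
            # 2. Move in the gradient direction of the undefended loss.
            x_adv = x_adv + step * np.sign(grad_loss(x_adv, y))
            # Stay within eps of x and within the valid pixel range.
            x_adv = np.clip(np.clip(x_adv, x - eps, x + eps), 0.0, 1.0)
            # 3. Check the current point against the defended system.
            if is_adv_with_defense(x_adv, y):
                return x_adv
        # 4. After a few moves, fall through and restart from a new initial point.
    return None

w = np.array([1.0, -1.0, 1.0, -1.0])      # hypothetical linear classifier

def predict(x):
    return int(x @ w > 0)

def grad_loss(x, y):
    # For a linear score, the loss for label 1 grows as the score decreases.
    return -w if y == 1 else w

def detector(x):
    # Toy squeezing detector: flag inputs whose label changes after binarization.
    return predict(x) != predict(np.round(x))

def is_adv_with_defense(x, y):
    return predict(x) != y and not detector(x)

x0 = np.array([0.6, 0.4, 0.6, 0.4])       # classified as label 1
found = adaptive_search(x0, 1, grad_loss, is_adv_with_defense)
```

The defense is used only as a black-box check in step 3, which is why the search is a matter of luck and can take many restarts on real models.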
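Detecting Adversarial Input [1:Sec5, 3:Sec2.4]
Is there an effective attack when the attacker knows about the defense?
[3:Sec5] also proposes another method based on [4]
(retraining rather than an attack),
for the case where there are multiple adv detectors:
retrain the model with an adv-image class added to the final layer,
turning $N$-class classification into $(N+1)$-class classification.
A model $G$ with the following output is trained.
$F(x)_i$: score of class $i$ from the original model
$D_j(x)$: probability that adv detector $j$ judges image $x$ to be adv, minus 1/2
($D_j(x) > 0$ means the image is judged adv)
$D(x) := \max_j D_j(x)$: probability that image $x$ is adv, minus 1/2
For this $G(x)$, $G(x)_{N+1}$ is the largest component when $D(x) > 0$.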
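One construction with the stated property is sketched below. The form $G(x)_{N+1} = (D(x) + 1) \cdot \max_i F(x)_i$ is an assumption chosen because it makes class $N+1$ the largest exactly when $D(x) > 0$; whether [3] uses this exact form is not confirmed by the slide. It also assumes positive class scores such as softmax probabilities.

```python
import numpy as np

def combined_scores(f_scores, d_scores):
    """Build an (N+1)-class score vector from classifier and detector outputs.

    f_scores: (N,) positive class scores F(x)_i, e.g. softmax probabilities.
    d_scores: (M,) detector outputs D_j(x) = P(adv) - 1/2, each in [-0.5, 0.5].
    """
    f_scores = np.asarray(f_scores, dtype=float)
    d = np.max(d_scores)                      # D(x) := max_j D_j(x)
    adv_score = (d + 1.0) * f_scores.max()    # exceeds every F(x)_i iff D(x) > 0
    return np.append(f_scores, adv_score)

f = [0.7, 0.2, 0.1]                           # made-up classifier output, N = 3
flagged = combined_scores(f, [-0.3, 0.2])     # one detector fires
passed = combined_scores(f, [-0.3, -0.1])     # no detector fires
```

With zero-based indexing, `np.argmax(flagged)` is 3 (the added adv class) and `np.argmax(passed)` is 0 (the original top class).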
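[3] itself asks whether combining multiple defenses yields a stronger
defense, and the authors answer no:
even with multiple defenses, attacks that slip past all of them are possible.
Attacks are broadly divided into adaptive (the attacker knows the defense) and
static (the attacker does not), and the paper evaluates only
the adaptive adversary.
[1:Sec5:D] argues that evaluating robustness with an adaptive adversary
is unfair, because no adv images were found with noise small enough
to be imperceptible to humans.
→ Judging from the examples in [3],
that does not seem to be the case...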
