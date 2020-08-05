
DaST

  1. DaST: Data-free Substitute Training for Adversarial Attack [CVPR2020]. Presented by Kaede Shiohara (M1). Paper authors: Mingyi Zhou, Jing Wu, Yipeng Liu, Shuaicheng Liu, Ce Zhu (University of Electronic Science and Technology of China; Megvii Technology)
  2. Contents • Explanation part: Main contribution, Traditional Adversarial Attack methods, Idea, Attack Scenario, Adversarial Generator-Classifier Training, Experiments, Visualizations • Re-implementation part: Model Architecture, Experiment on MNIST
  3. Explanation part
  4. Main contribution (why this paper was accepted) • The first method to train a substitute model without real training data, in two attack scenarios.
  5. Traditional Adversarial Attack methods • Gradient-based (e.g. FGSM [1]) ✓ Needs a pretrained model that imitates the target model -> needs real training data (which is very difficult to obtain in real problems!) • Score-based, decision-based (e.g. ZOO [2]) ✓ Needs many queries at test time ✓ Does not need a substitute model [1] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. International Conference on Learning Representations (ICLR), 2015. [2] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15–26. ACM, 2017.
  6. Traditional Adversarial Attack methods • Gradient-based (e.g. FGSM [1]) ✓ Needs a pretrained model that imitates the target model -> needs real training data (which is very difficult to obtain in real problems!) • Score-based, decision-based (e.g. ZOO [2]) ✓ Needs many queries at test time ✓ Does not need a substitute model • DaST (proposed method) gives a solution; it is not an attack method itself • It trains a substitute model without real training data -> useful when a substitute model is needed, as in gradient-based attack methods
  7. [Figure: two diagrams, each showing an online Target model (T) and a local Substitute model (D) connected through a loss function, with D trained by backpropagation] ・Naïve gradient-based attack: it is difficult for the attacker to collect the same training dataset that the target model used. ・Gradient-based attack with DaST: there is no need to collect a training dataset because DaST generates the samples with G.
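
The gradient-based attack on a local substitute can be sketched as follows. This is only an illustration under stated assumptions: the substitute here is a hypothetical linear softmax classifier with weights `W` and bias `b` (not the convolutional networks used in the slides), inputs are assumed to lie in [0, 1], and `fgsm`, `softmax`, and `eps` are names chosen for this example. The update itself is the standard FGSM step: add `eps` times the sign of the loss gradient with respect to the input.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fgsm(x, y, W, b, eps):
    """One non-targeted FGSM step against a linear softmax substitute.

    x: (batch, features) inputs in [0, 1]
    y: (batch,) integer labels
    W, b: parameters of the hypothetical substitute D(x) = softmax(x @ W + b)
    """
    probs = softmax(x @ W + b)
    onehot = np.eye(W.shape[1])[y]
    # Gradient of the cross-entropy loss with respect to the input x.
    grad_x = (probs - onehot) @ W.T
    # Move each pixel by eps in the direction that increases the loss.
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)
```

In the setting of the slides, the resulting `x_adv` would then be sent to the online target model to check whether the attack transfers.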
  8. Idea: use an image generator (G) to train the substitute model (D) • Objective of D • Imitate the attacked model (T) • Objective of G • Generate new samples with the given label n ($L_C$, a cross-entropy (CE) loss) • Generate new samples that maximize the distance between D and T, to increase the diversity of generated samples ($e^{-d(T(X), D(X))}$, the first term of $L_G$) (… is more stable in training than …)
  9. Idea: use an image generator (G) to train the substitute model (D) • Objective of D • Imitate the attacked model (T) • Objective of G • Generate new samples with the given label n ($L_C$, a cross-entropy (CE) loss). T's gradient cannot be accessed, but as training progresses D ≈ T, so D can stand in for T. • Generate new samples that maximize the distance between D and T, to increase the diversity of generated samples ($e^{-d(T(X), D(X))}$, the first term of $L_G$)
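
A minimal numerical sketch of these objectives, under assumptions that the slides do not confirm in full: the substitute loss is taken to be $L_D = d(T(X), D(X))$, the generator loss is taken to be $L_G = e^{-d(T(X), D(X))} + \alpha L_C$ (the first term and α appear elsewhere in the slides), $L_C$ is taken to be the cross entropy between D's output and the requested label, and `d` is assumed to be a mean per-sample L2 distance. The function names are hypothetical.

```python
import numpy as np

def distance(t_out, d_out):
    """d(T(X), D(X)); assumed here to be the mean per-sample L2 distance."""
    return float(np.mean(np.linalg.norm(t_out - d_out, axis=1)))

def cross_entropy(probs, labels):
    """Mean cross entropy of predicted probabilities against integer labels."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def dast_losses(t_out, d_out, labels, alpha=0.2):
    """Return (L_D, L_G, L_C) for one batch of generated samples.

    t_out: target-model outputs T(X), shape (batch, classes)
    d_out: substitute-model outputs D(X), shape (batch, classes)
    labels: labels n that the generator was asked to produce
    """
    d = distance(t_out, d_out)
    loss_d = d                                       # D imitates T
    loss_c = cross_entropy(d_out, labels)            # samples should carry label n
    loss_g = float(np.exp(-d)) + alpha * loss_c      # G pushes D and T apart
    return loss_d, loss_g, loss_c
```

When D and T agree exactly, `d` is 0 and the first term of `loss_g` reaches its maximum of 1, which is why minimizing it drives G toward samples on which the two models disagree.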
  10. Attack scenario • Label-only • Attackers can probe only the output hard label of the attacked model (e.g. [0, 0, …, 0, 1, 0, …, 0]) • Probability-only • Attackers can probe the output probabilities of the attacked model (e.g. [0.03, 0.1, …, 0.05, 0.7, 0.01, …, 0.04])
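
The difference between the two scenarios can be illustrated with a small helper that simulates what the attacker observes. The function name `query_target` and the scenario strings are hypothetical; the hard label is represented as a one-hot vector, matching the example on the slide.

```python
import numpy as np

def query_target(probs, scenario):
    """Simulate the attacker's view of the target model's output.

    probs: (batch, classes) probabilities produced by the attacked model
    scenario: "probability-only" or "label-only"
    """
    if scenario == "probability-only":
        # The attacker sees the full probability vector.
        return probs
    if scenario == "label-only":
        # The attacker sees only the predicted class, as a one-hot vector.
        return np.eye(probs.shape[1])[probs.argmax(axis=1)]
    raise ValueError(f"unknown scenario: {scenario}")
```

For example, with `probs = np.array([[0.1, 0.7, 0.2]])`, the label-only view is `[[0., 1., 0.]]` while the probability-only view is the input unchanged.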
  11. Adversarial Generator-Classifier Training [Figure] N: number of classes
  12. Adversarial Generator-Classifier Training [Figure] L_i generates samples with label i. N: number of classes
  13. Adversarial Generator-Classifier Training [Figure] Conv layers are shared by all L_i. N: number of classes
  14. Adversarial Generator-Classifier Training [Figure]
  15. Experiment on MNIST • Substitute model types: Pretrained (trained with the same dataset that the attacked model used), DaST-P (probability-only scenario), DaST-L (label-only scenario) [Table: attack success rate (%) by attack type and attack method]
  16. Experiment on MNIST DaST-P > Pretrained (> DaST-L) (DaST-P >) DaST-L > Pretrained • Attacked : 4 convs net • Substitute : 5 convs net Surprisingly, the attack success rate of DaST is higher than that of Pretrained.
  17. Experiment on MNIST • Attacked : 4 convs net • Substitute S/M/L : 3/4/5 convs net Large > Small ≧ Medium
  18. Experiment on CIFAR-10 DaST-P > Pretrained (> DaST-L) (DaST-P >) DaST-L > Pretrained • Attacked : VGG16 • Substitute : ResNet50
  19. Experiment on CIFAR-10 VGG13 > ResNet50 > ResNet18 • Attacked : VGG16 • Substitute : VGG16/ResNet18/ResNet50 A small model is better, unlike on MNIST
  20. Visualization
  21. Experiment on Microsoft Azure (online model) DaST-L > DaST-P > Pretrained • Attacked : unknown • Substitute : 5 convs net The low attack success rate of ‘Pretrained’ implies that the unknown model is very different from the substitute model.
  22. Visualization DaST generates ‘singular’ images because of the first term, $e^{-d(T(X), D(X))}$, of $L_G$
  23. Re-implementation part
  24. Model Architecture
  25. Model Architecture ・The paper does not state the number of layers, the parameters, or similar details
  27. Model Architecture ・The paper states only “3 (, 4, 5) convolutional layers”
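  28. Training: the reproduction experiment was run with the following settings ・α = 0.2 ・Dataset : MNIST ・Scenario : Non-Targeted, Probability-only/Label-only ・Optimizer : Adam (lr=0.0001) (not stated in the paper) ・# of samples : not stated in the paper, so training was repeated for a sufficient number of iterations ・Attack method : FGSM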
  29. Result (Prob-only) In the reproduction experiment, L_G and L_D decreased, but the accuracy on MNIST did not rise as high as in the paper. -> As a result, effective adversarial examples against the target model could not be generated (ASR did not rise as high as in the paper). [Plots over about 500 epochs: Acc_mnist, Acc_synth, and ASR; L_C; L_D]
  30. Result (Prob-only) [Plots over about 500 epochs: Acc_mnist, Acc_synth, and ASR; L_C; L_D] Acc_synth: accuracy of the substitute model on the generated images. Acc_mnist: accuracy of the substitute model on the MNIST test set (10,000 samples). ASR: attack success rate obtained with the substitute model. Plot annotations: training is unstable; accuracy plateaued; the loss plateaued quickly. In the reproduction experiment, L_G and L_D decreased, but the accuracy on MNIST did not rise as high as in the paper. -> As a result, effective adversarial examples against the target model could not be generated (ASR did not rise as high as in the paper).
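
The plotted metrics could be computed along the following lines. This is a sketch under assumptions: the slides do not define ASR precisely, so it is taken here to be the fraction of samples the target classifies correctly before the attack and misclassifies afterwards (a non-targeted attack). The function names are hypothetical.

```python
import numpy as np

def accuracy(pred, labels):
    """Fraction of predictions that match the labels (Acc_mnist, Acc_synth)."""
    return float(np.mean(pred == labels))

def attack_success_rate(clean_pred, adv_pred, labels):
    """Non-targeted ASR under the assumption stated above.

    clean_pred: target-model predictions on the clean inputs
    adv_pred: target-model predictions on the adversarial inputs
    labels: ground-truth labels
    """
    correct = clean_pred == labels
    if not correct.any():
        return 0.0  # nothing to attack if no clean sample was classified correctly
    return float(np.mean(adv_pred[correct] != labels[correct]))
```

Restricting the denominator to correctly classified samples is a design choice made for this sketch; counting every adversarial sample the target misclassifies would be an equally plausible reading of the slides.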
  31. Result (Label-only) In the label-only case as well, just as in the probability-only case, the substitute model's accuracy on the MNIST test set stalled at about 0.4 and training stopped progressing. [Plots over about 500 epochs: L_D; L_C; Acc_mnist, Acc_synth, and ASR]
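  32. Result: possible reasons why the values reported in the paper could not be reproduced • Difficulty of training • As in ordinary GAN training, this method is a minimax game between G and D (see slides 8 and 9). Indeed, as shown on the previous pages, training was unstable. • The real data (MNIST) is invisible to the substitute model, so accuracy is not guaranteed. In the experiments on the previous pages, the proposed L_D and L_G decreased properly, yet the accuracy on MNIST did not rise. From the above, the experimental results may not be reproducible from the information in the original paper alone, which omits details of the model hyperparameters and the training procedure. (Note: the parameters reported for the reproduction experiment are the best among several runs.) Code URL: https://github.com/mapooon/DaST_reimplement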
