
Lowering the bar: deep learning for side-channel analysis

Deep learning can help automate the signal analysis process in power side-channel analysis. We show how typical signal processing problems, such as noise reduction and re-alignment, are solved automatically by the deep learning network, and that we can break a lightly protected AES, an AES implementation with masking countermeasures, and a protected ECC implementation. These experiments indicate that where side-channel analysis previously depended heavily on the skills of a human analyst, the first steps are being taken toward deep-learning automation that lowers the attacker skill required for such attacks.

  1. Lowering the Bar: Deep Learning for Side Channel Analysis. Guilherme Perin, Baris Ege, Jasper van Woudenberg (@jzvw). Black Hat, August 9, 2018
  2. Before: leakage modeling, signal processing
  3. After
  4. Power / EM side channel analysis
  5. (image-only slide)
  6. Power analysis of some crypto algorithm
  7. Example (huge) leakage: data leakage vs. noise
  8. Signal processing (demo): raw trace vs. processed trace
  9. Misalignment (demo)
  10. AES-128 first round attack: the known input bytes i0..i15 are XORed with the unknown key bytes k0..k15 (key addition, x = i ⊕ k), and each result goes through the S-box (s = S-Box(x)). The leakage model turns the S-box output into a power prediction.
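To make slide 10's leakage model concrete, here is a minimal Python sketch (not code from the talk): for a known plaintext byte and a key-byte guess, the predicted power is the Hamming weight of the first-round S-box output. The S-box is rebuilt from its textbook definition so the snippet is self-contained; the names `SBOX`, `HW` and `power_prediction` are ours.

```python
import numpy as np

def _aes_sbox():
    """Build the standard AES S-box: multiplicative inverse in GF(2^8),
    followed by the affine transformation with constant 0x63."""
    def gmul(a, b):                      # multiplication in GF(2^8), polynomial 0x11B
        r = 0
        while b:
            if b & 1:
                r ^= a
            a = (a << 1) ^ (0x11B if a & 0x80 else 0)
            b >>= 1
        return r
    def ginv(a):                         # brute-force multiplicative inverse (0 maps to 0)
        return next((x for x in range(1, 256) if gmul(a, x) == 1), 0)
    def rotl8(x, n):                     # rotate an 8-bit value left by n positions
        return ((x << n) | (x >> (8 - n))) & 0xFF
    sbox = []
    for a in range(256):
        x = ginv(a)
        sbox.append(x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63)
    return np.array(sbox, dtype=np.uint8)

SBOX = _aes_sbox()
HW = np.array([bin(v).count("1") for v in range(256)], dtype=np.uint8)

def power_prediction(plaintext_byte, key_guess):
    """Leakage model: HW of the first-round S-box output for one key-byte guess."""
    return HW[SBOX[plaintext_byte ^ key_guess]]

# Example: predicted leakage for plaintext byte 0xA5 under key guess 0x3C.
print(power_prediction(0xA5, 0x3C))
```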
  11. Points of interest selection: correlation, T-test, or difference of means pick out the samples showing a statistical dependency between intermediate (key-related) data and power consumption, separating data leakage from noise.
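A hedged sketch of the point-of-interest selection on slide 11, using Pearson correlation between a leakage-model prediction and every sample index; the synthetic data and the function name `select_pois` are illustrative assumptions, not Riscure tooling.

```python
import numpy as np

def select_pois(traces, predictions, num_pois=5):
    """Rank sample indices by |Pearson correlation| between each sample
    and the predicted leakage (e.g. HW of an intermediate value)."""
    t = traces - traces.mean(axis=0)               # center each sample column
    p = predictions - predictions.mean()           # center the predictions
    cov = (t * p[:, None]).mean(axis=0)
    corr = cov / (traces.std(axis=0) * predictions.std() + 1e-12)
    return np.argsort(np.abs(corr))[::-1][:num_pois], corr

# Synthetic demo: 1000 traces of 200 samples, with leakage injected at sample 42.
rng = np.random.default_rng(0)
predictions = rng.integers(0, 9, size=1000).astype(float)   # HW of a byte: 0..8
traces = rng.normal(size=(1000, 200))
traces[:, 42] += 0.5 * predictions                           # the leaky sample
pois, corr = select_pois(traces, predictions)
print(pois)   # sample 42 should rank first
```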
  12. Concept of template analysis. Learn (profiling) phase: measure an open sample with known keys and inputs/ciphertexts and build templates from a leakage model. Attack (exploitation) phase: measure a closed sample with a fixed key and analyze its traces and inputs against the templates.
  13. Key recovery: key byte rank versus number of traces for AES key bytes 0-15.
  14. The actual process: setup, acquisition, processing, analysis.
  15. Deep learning background
  16. Deep learning: data with labels (pictures of dogs and cats)
  17. Deep learning: train a machine to classify the labeled data; the machine outputs cat/dog probabilities, an error function scores them, and the back-propagation algorithm updates the machine
  18. Deep learning: test the trained machine on new data; it again outputs cat/dog probabilities
  19. Deep learning: is classification accuracy good enough? If yes, we are done; if not, change parameters and retrain. Machine = deep neural network
  20. Convolutional Neural Networks (CNNs): input layer (size equal to the number of samples) → convolutional layers (feature extraction + encoding) → dense layers (classifiers) → output layer (size equal to the number of classes). The convolutional layers can detect features independently of their positions.
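A hedged Keras sketch of the CNN shape slide 20 describes (convolutional feature extractor followed by dense classifiers and a softmax output); the filter counts, kernel sizes and pooling are assumptions for illustration, not the exact networks used in the demos.

```python
import tensorflow as tf
from tensorflow.keras import layers

num_samples = 500   # trace length (input layer size)
num_classes = 9     # HW of a byte (output layer size)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(num_samples, 1)),
    # Convolutional layers: shift-invariant feature extraction + encoding
    layers.Conv1D(8, kernel_size=16, activation="relu", padding="same"),
    layers.AveragePooling1D(pool_size=2),
    layers.Conv1D(16, kernel_size=8, activation="relu", padding="same"),
    layers.AveragePooling1D(pool_size=2),
    layers.Flatten(),
    # Dense layers: the classifier part
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    # Output layer: one softmax probability per class
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```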
  21. Creating training/validation/test data sets: the samples of each trace are the features, and the leakage model (e.g. HW = 3, 4, 5, 7 of an intermediate byte) provides the label.
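The data-set construction on slide 21 could look roughly like this numpy sketch: trace samples are the features, and the Hamming weight of an intermediate byte (computed under the known profiling key) is the label. The array names, sizes and random placeholder data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
traces = rng.normal(size=(20_000, 500)).astype(np.float32)   # features (placeholder data)
intermediates = rng.integers(0, 256, size=20_000)            # e.g. S-box outputs under the known key

# HW leakage model: count the set bits of each intermediate byte -> classes 0..8.
labels = np.unpackbits(intermediates.astype(np.uint8)[:, None], axis=1).sum(axis=1)

# One-hot encode the 9 HW classes for a softmax / cross-entropy network.
one_hot = np.eye(9, dtype=np.float32)[labels]

# Split into training / validation / test sets (the talk used 90k/5k/5k; smaller here).
x_train, x_val, x_test = traces[:18_000], traces[18_000:19_000], traces[19_000:]
y_train, y_val, y_test = one_hot[:18_000], one_hot[18_000:19_000], one_hot[19_000:]
print(x_train.shape, y_train.shape)
```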
  22. Classification: the trained model takes a trace (samples) and outputs a softmax probability per HW class (Σ pᵢ = 1), e.g. 0.65, 0.15, 0.08, 0.05, ...; key enumeration then combines these output probabilities across traces (Bayes).
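Slide 22's key enumeration can be sketched as a Bayesian log-likelihood accumulation: for every key-byte guess, sum the log of the softmax probability the model assigns to the class that guess predicts, trace by trace. For brevity this sketch uses HW(plaintext ⊕ key) as the predicted class instead of the full S-box lookup, and the model outputs are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
n_traces, n_classes = 1000, 9
plaintexts = rng.integers(0, 256, size=n_traces)
# Placeholder for the trained model's softmax output, one row per attack trace.
probs = rng.dirichlet(np.ones(n_classes), size=n_traces)

hw = np.array([bin(v).count("1") for v in range(256)])

# Bayesian accumulation: for each key-byte guess, add up the log-probability of the
# class it predicts for every trace; the true key byte should rise to rank 0.
scores = np.zeros(256)
for key_guess in range(256):
    predicted_class = hw[plaintexts ^ key_guess]   # simplified target: no S-box lookup here
    scores[key_guess] = np.sum(np.log(probs[np.arange(n_traces), predicted_class] + 1e-40))

ranking = np.argsort(scores)[::-1]                 # best key-byte guess first
print("Best key-byte candidate:", hex(ranking[0]))
```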
  23. Deep learning on side channels in practice
  24. Step 1: define initial hyper-parameters (demo)
  25. Step 2: make sure it is capable of learning. Increase the number of training traces and observe the training and validation accuracy. Overfitting too fast (training accuracy 100%, validation accuracy low)? Then the neural network is too big for the number of traces and samples.
  26. Step 3: make it generalize. Make sure the training accuracy/recall is increasing (the NN is learning from its training set) and that the validation recall stays above the minimum threshold of 0.111 = 1/9 (9 classes for the HW of a byte), which means the model is generalizing.
  27. Step 3: make it generalize. Regularization techniques: L1 and L2 (penalty applied to the weights), dropout, data augmentation (more traces), early stopping. Intuition: a purely linear separation gives low training and validation accuracy, good regularization gives good training and validation accuracy, and overfitting gives high training accuracy but low validation accuracy.
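The regularization options listed on slide 27 map onto standard Keras mechanisms; a minimal sketch, with layer sizes and penalty strengths chosen for illustration rather than taken from the talk (data augmentation is sketched separately after slide 31).

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers, callbacks

model = tf.keras.Sequential([
    tf.keras.Input(shape=(500, 1)),
    layers.Conv1D(8, kernel_size=16, activation="relu", padding="same"),
    layers.Flatten(),
    # L2 penalty applied to the weights of the dense layer
    layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.3),                      # dropout regularization
    layers.Dense(9, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Early stopping: halt training once validation loss stops improving.
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=200, batch_size=256, callbacks=[early_stop])
```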
  28. Step 4: key recovery. In this analysis we only need slightly-above-coin-flip accuracy! (Plot: key byte rank versus number of traces.)
  29. Getting keys from the thingz!
  30. 30. 30 PiΓ±ata AES-128 with misalignment (demo) BLACKHAT 2018
  31. Bypassing misalignment with CNNs. Neural network: input layer > conv layer > 36 > 36 > 36 > output layer. Training/validation/test sets: 90,000/5,000/5,000 traces of 500 samples. Leakage model: HW of S-box output (round 1) → 9 classes. Use data augmentation as a regularization technique to improve generalization. Results for key byte 0: key byte rank versus number of traces.
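One way to implement the data augmentation slide 31 relies on is to add randomly shifted copies of each training trace, so the CNN sees the leakage at many offsets; the shift range and function name below are assumptions, not the exact augmentation used for the Piñata results.

```python
import numpy as np

def augment_with_shifts(traces, labels, copies=2, max_shift=20, seed=0):
    """Return the original traces plus `copies` randomly shifted versions,
    simulating the misalignment the CNN must become invariant to."""
    rng = np.random.default_rng(seed)
    out_x, out_y = [traces], [labels]
    for _ in range(copies):
        shifts = rng.integers(-max_shift, max_shift + 1, size=len(traces))
        shifted = np.array([np.roll(t, s) for t, s in zip(traces, shifts)])
        out_x.append(shifted)
        out_y.append(labels)
    return np.concatenate(out_x), np.concatenate(out_y)

# Example: grow a training set threefold with +/-20 sample shifts.
x = np.random.normal(size=(1000, 500)).astype(np.float32)
y = np.random.randint(0, 9, size=1000)
x_aug, y_aug = augment_with_shifts(x, y)
print(x_aug.shape, y_aug.shape)   # (3000, 500) (3000,)
```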
  32. Breaking protected ECC on Piñata. Supervised deep learning attack: Curve25519, Montgomery ladder, scalar blinding; messy signal; brute-force methods for ECC are needed if test accuracy < 100%; need to get (almost) all bits from one trace!
  33. Breaking protected ECC. Unsupervised/supervised horizontal attack: 60% success rate. Deep learning: 90% success rate. Deep learning + data augmentation: 99.4% success rate. Data augmentation: 25k → 200k (misaligned) traces. Network: input (4000 samples) > 3 conv layers (10 filters, ReLU) > 4 dense layers (100 neurons, tanh) > output (2 classes, softmax).
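A hedged Keras rendering of the slide 33 network (three conv layers with 10 filters and ReLU, four dense layers of 100 neurons with tanh, a 4000-sample input and a 2-class softmax output); kernel sizes and the optimizer are assumptions, since the slide does not specify them.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4000, 1)),                          # one scalar-bit window of the trace
    layers.Conv1D(10, kernel_size=32, activation="relu", padding="same"),
    layers.Conv1D(10, kernel_size=32, activation="relu", padding="same"),
    layers.Conv1D(10, kernel_size=32, activation="relu", padding="same"),
    layers.Flatten(),
    layers.Dense(100, activation="tanh"),
    layers.Dense(100, activation="tanh"),
    layers.Dense(100, activation="tanh"),
    layers.Dense(100, activation="tanh"),
    layers.Dense(2, activation="softmax"),                     # scalar bit = 0 or 1
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```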
  34. Breaking AES with first-order masking (demo). Target published in 2013 (http://www.dpacontest.org/v4/); 40k traces available; AES-256 (Atmel ATMega-163 smart card); countermeasure: Rotating S-box Masking (RSM).
  35. Breaking AES with first-order masking. Masking removes the relationship between power consumption (EM) and predictable data: the device only handles the shares x ⊕ m and y ⊕ m, at sample positions i and j. The classical fix is to combine the data, (x ⊕ m) ⊕ (y ⊕ m) = x ⊕ y, and to combine the samples, tᵢ × tⱼ; brute-forcing the pair (i, j) has quadratic complexity.
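For contrast with the deep learning approach, the classical second-order combination on slide 35 can be sketched as a centered product over all sample pairs (i, j), which is exactly where the quadratic complexity comes from; the window size and names are illustrative assumptions.

```python
import numpy as np

def second_order_combine(traces):
    """Centered product of every sample pair (i, j): the classical way to
    recombine the two shares x^m and y^m. Produces O(window^2) combined samples."""
    centered = traces - traces.mean(axis=0)
    n, w = centered.shape
    i, j = np.triu_indices(w)                  # all pairs with i <= j
    return centered[:, i] * centered[:, j]     # shape: (n_traces, w*(w+1)/2)

# A 200-sample window already yields 20,100 combined samples per trace:
# the quadratic blow-up the CNN-based attack avoids.
traces = np.random.normal(size=(500, 200))
combined = second_order_combine(traces)
print(combined.shape)
```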
  36. Breaking AES with first-order masking. Challenge: the training key equals the validation key. The correct key byte candidate shows good generalization (validation recall above 1/9); a wrong key byte candidate shows poor generalization.
  37. Breaking AES with first-order masking. Overfitting can be verified by checking where the NN is learning: for the correct key byte candidate the CNN learns from specific, leaky samples; for a wrong key byte candidate the CNN overfits because it cannot distinguish leaky samples from noise.
  38. Breaking AES with first-order masking. Neural network: input layer > conv layer > 50 > 50 > 50 > output layer. Training/validation/test sets: 36,000/2,000/2,000 traces. Leakage model: HW of S-box output (round 1) → 9 classes. Results for key byte 0: processing 8 traces is sufficient to recover the key.
  39. 1st cool thing: this shouldn't work
  40. 2nd cool thing: DL is up there with dozens of SCA research teams
  41. Wrapping up
  42. I want to learn more! deeplearningbook.org, introtodeeplearning.com, bookstores, and a forthcoming No Starch book by Colin & Jasper.
  43. Key takeaways: DL handles the art + science parts of SCA and scales; DL still requires network fiddling, so the bar is lowered but not yet at zero; automation is needed to put a dent in embedded insecurity.
  44. References:
  • http://www.deeplearningbook.org/
  • S. Haykin, "Neural Networks and Learning Machines"
  • H. Maghrebi et al., "Breaking Cryptographic Implementations Using Deep Learning Techniques", https://eprint.iacr.org/2016/921.pdf
  • R. Benadjila et al., "Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database", https://eprint.iacr.org/2018/053.pdf
  • E. Cagli et al., "Convolutional Neural Networks with Data Augmentation Against Jitter-Based Countermeasures", https://eprint.iacr.org/2017/740
  • Zhang et al., "Understanding Deep Learning Requires Rethinking Generalization", https://arxiv.org/abs/1611.03530
  • Keskar et al., "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima", https://arxiv.org/pdf/1609.04836.pdf
  • Shwartz-Ziv and Tishby, "Opening the Black Box of Deep Neural Networks via Information", https://arxiv.org/abs/1703.00810
  45. Challenge your security. Riscure B.V., Frontier Building, Delftechpark 49, 2628 XJ Delft, The Netherlands, phone +31 15 251 40 90, www.riscure.com | Riscure North America, 550 Kearny St., Suite 330, San Francisco, CA 94108, USA, phone +1 650 646 99 79, inforequest@riscure.com | Riscure China, Room 2030-31, No. 989, Changle Road, Shanghai 200031, China, phone +86 21 5117 5435, inforcn@riscure.com | Contact: jasper@riscure.com, @jzvw

Γ—