SLAM Lab
System-Level Architecture and Modeling Group
Lifecycle-Aware Power Side-Channel Malware
Detection
Alexander Cathis
Advisor: Dr. Andreas Gerstlauer
Advisor: Dr. Mohit Tiwari
October 22, 2024
Side-Channel
• Implementation-based medium that
leaks information
• Not a weakness of the algorithm
• "Flaw" of hardware implementation
• Electromagnetic, power, timing, etc.
• Broad and impactful information
Can be used for attack and defense
• Well-fitted for embedded systems
• Deploy on smart battery
• Orthogonal to other defenses
No HW/SW overhead for target device
2
Power-based detector flow [Wei et al. ‘19]
Key Challenges
• Power-based challenges
• Power signal is noisy
• Large characterization scope increases misclassification
• Attacker can produce power-mimicking malware [Wei et al. ‘19]
• Prior work limitations
• Evaluation on parallel task sets
• Inappropriate utilization of ML tools
• Lack of rigorous public datasets
➢ Did not evaluate for modern, complex systems
3
Detection for Modern Systems
• State awareness
• Any unique combination of executing tasks represents a distinct operating state
• Train a one-class classification (OCC) pipeline on each benign state
• Scaling to parallel task sets
• Each pipeline labels windows from its trained state as benign, all others as malicious
• With more tasks, add more pipelines
4
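The per-state scheme above can be sketched as a small ensemble of one-class detectors. This is a minimal illustration, not the paper's pipeline: each hypothetical `StateDetector` models only the window-mean power of its benign state and accepts windows within k standard deviations, and the power levels and k = 3 threshold are invented for the example.

```python
import random
from statistics import mean, stdev

random.seed(0)

class StateDetector:
    # One-class detector for a single benign operating state.
    # Illustrative stand-in for an OCC pipeline: it models the
    # window-mean power of its trained state and accepts windows
    # within k standard deviations of that mean.
    def __init__(self, windows, k=3.0):
        feats = [mean(w) for w in windows]
        self.mu = mean(feats)
        self.sigma = stdev(feats)
        self.k = k

    def is_benign(self, window):
        return abs(mean(window) - self.mu) <= self.k * self.sigma

class StateEnsemble:
    # A window is labeled benign if any trained state pipeline
    # accepts it; supporting more tasks means adding more pipelines.
    def __init__(self, detectors):
        self.detectors = detectors

    def is_benign(self, window):
        return any(d.is_benign(window) for d in self.detectors)

def sample_windows(level, n=30, size=16, noise=0.05):
    # Synthetic power windows around a nominal level (watts).
    return [[random.gauss(level, noise) for _ in range(size)]
            for _ in range(n)]

# Two benign states (e.g. idle vs. a compute task) and one anomaly.
ensemble = StateEnsemble([
    StateDetector(sample_windows(1.0)),   # state: idle
    StateDetector(sample_windows(5.0)),   # state: compute task
])
print(ensemble.is_benign([1.0] * 16))   # matches a trained state
print(ensemble.is_benign([9.0] * 16))   # matches no trained state
```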
Experimental Setup
5
Target Device: Portwell PCOM-C700 Type VII carrier board;
Portwell PCOM-B700G processor module;
8-core Intel Xeon D-1539 embedded-class processor
Power Sampling: Spliced 12 V CPU power rail;
Adafruit INA169 analog current sensor
Detector: Deployed on a Raspberry Pi 4;
Python implementation achieves 27 inferences per second
Features: Sampled the 12 V CPU power rail at 2 kHz;
for regression-based detectors, the input window was 1000 samples and the prediction window 3;
for other ML formulations, each sliding window was transformed into a feature vector
of statistical and bag-of-words features
Prior Works: Replicated representative works across ML formulations;
formulations include one-class classification, binary classification,
multiclass classification, ensemble of one-class classifiers, regression, and statistical tests;
a mix of non-deep and deep methods was evaluated
Benchmarks: Deployed 3 benign applications representing drone tasks (SHA-3, face
detection, autonomous drone path-finding);
3 microarchitectural attacks (Meltdown, Spectre, and an L1 cache covert channel)
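The feature-vector row above can be illustrated with a small sketch. The exact features are not specified in the slides, so the statistics chosen, the bin count, and the power range below are assumptions: the "bag-of-words" part is rendered here as a histogram of quantized power levels, one count per level "word".

```python
from statistics import mean, stdev

def window_features(window, n_bins=8, lo=0.0, hi=10.0):
    # Statistical features over the raw samples in the window.
    stats = [mean(window), stdev(window), min(window), max(window)]
    # Bag-of-words sketch: quantize each sample into one of n_bins
    # power-level "words" and count occurrences of each word.
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in window:
        idx = int((x - lo) / width)
        counts[min(max(idx, 0), n_bins - 1)] += 1
    return stats + counts

vec = window_features([1.0, 1.2, 5.1, 4.9, 1.1, 5.0, 1.0, 4.8])
print(len(vec))   # 4 statistics + 8 histogram counts = 12
```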
Performance Evaluation
• Characterize Operating Range
• S = 2ⁿ benign states for n concurrent applications
• 3 applications → 8 benign states → 64 comparisons
• Ensemble has limitations
• NOP insertion, low-power, power-mimicry, noise
➢ Power cannot detect everything
6
Evaluation of prior works
Evaluation of ensemble approaches
Evaluation against software behaviors
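The exponential growth of the characterization scope can be made concrete with a two-line helper (illustrative, matching the 3-application case on this slide):

```python
def characterization_scope(n_tasks):
    # With n concurrently schedulable tasks, every on/off combination
    # of tasks is its own operating state: S = 2^n. Each of the S
    # per-state pipelines is then evaluated against windows from each
    # of the S states, giving S * S comparisons.
    states = 2 ** n_tasks
    return states, states * states

print(characterization_scope(3))   # (8, 64): 8 benign states, 64 comparisons
```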
Lifecycle-Aware Detector Architecture
7
HMM Results
8
• Baseline Flow Evaluation
• Comparison detectors: density-based detector, state-based ensemble
Summary, Conclusions, and Future Work
9
✓ Lifecycle-Aware Power Side-Channel Malware Detection
• Can improve performance in face of noisy local detectors
• Future Work: Lifecycle Awareness for Cloud Security
• Cloud Access Control
• Attack flow generation
• Combining data sources → "multidomain" detector
Questions
Thank you!
10
Backup
11
Overview
• Side-Channel
• Power side-channel detection
• Scaling to modern, complex systems
• Detector architecture
• Experiments
Need for lifecycle awareness
• Lifecycle Awareness
• Lifecycle intuition
• Detector architecture
• Implementation details
• Experiments
Can be used in other domains
12
Side-Channel
• Implementation-based medium that
leaks information
• Not a weakness of the algorithm
• "Flaw" of hardware implementation
• Electromagnetic, power, timing, etc.
Broad attack vector
• Broad and impactful attack vector
• Meltdown, Spectre, SPA, DPA, etc.
Real-world threat
13
SPA (simple power analysis) attack exploiting a power trace
during modular exponentiation [Fujino et al. ‘17]
Power Side-Channel
• Used to steal encryption keys
• Attacks: SPA, DPA
• Defenses: Constant power, blinding,
randomized execution
Arms race
• Defense vector
• Utilize leaked information for defense
• Characterize standard operation with ML
• Raise alarm when deviations occur
Out-of-band detection mechanism
14
Power traces of benign and malicious workloads [Wei et al. ‘19]
Detection for Embedded Systems
• Out-of-band deployment
• Deploy on smart battery
• Orthogonal to other defenses
No HW/SW overhead for target device
15
Power-based detector flow [Wei et al. ‘19]
• Machine Learning
• Non-deep models have proven results
• Learns periodic behavior of embedded tasks
• Does not train on malware
Well-fitted for embedded systems
Detection for Modern Systems
• State awareness
• Any unique combination of executing tasks represents a distinct operating state
• Train a one-class classification (OCC) pipeline on each benign state
• Scaling to parallel task sets
• Each pipeline labels windows from its trained state as benign, all others as malicious
• With more tasks, add more pipelines
16
Operating Range
• Many attacks cannot be reliably detected
• Slowed-down attacks
• Low-power software-exploiting attacks
• Power-mimicking attacks
17
Detection against slowdown attacks
Detection against software actions
ATT&CK Matrix
18
ATT&CK Matrix Simplified
19
Initial Access Discovery Execution Persistence Exfiltration
Attack Lifecycles
20
Initial Access Discovery Execution Persistence Exfiltration
Detecting Techniques
21
Initial Access Discovery Execution Persistence Exfiltration
• Initial Access, Discovery, Persistence, Exfiltration
• Initial Access, Execution
• Initial Access, Execution, Exfiltration
Likelihood of Attack Lifecycle
22
Initial Access Discovery Execution Persistence Exfiltration
• Initial Access, Discovery, Persistence, Exfiltration
• Initial Access, Execution
• Initial Access, Execution, Exfiltration
• Exfiltration, Initial Access, Execution (???)
ROC-AUC
23
ROC-AUC
24
State Expansion
25
Action Classifier
26
• Supervised classifier
• Only trains on malware
➢ Poor performance
• Sequence processor
• Converts outputs of action classifier into
sequence without repeats
Lifecycle techniques
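The sequence-processor step above amounts to collapsing consecutive repeats, so that one long-running technique produces one symbol. A minimal sketch (the label strings are illustrative):

```python
def to_technique_sequence(window_labels):
    # Collapse per-window action-classifier outputs into an ordered
    # sequence of lifecycle techniques, dropping consecutive repeats.
    seq = []
    for label in window_labels:
        if not seq or seq[-1] != label:
            seq.append(label)
    return seq

print(to_technique_sequence([
    "InitialAccess", "InitialAccess", "Discovery",
    "Discovery", "Execution", "Exfiltration", "Exfiltration",
]))
# -> ["InitialAccess", "Discovery", "Execution", "Exfiltration"]
```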
Hidden Markov Model
27
Threat Score = exp( (1/l) · log Σ_X P(Y|X) P(X) )
(l = sequence length; the length-normalized likelihood of the observed
technique sequence Y, marginalized over hidden lifecycle states X)
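The score can be sketched with the standard HMM forward algorithm, which computes Σ_X P(Y|X)P(X). This assumes the slide's threat score is the length-normalized sequence likelihood; the two-stage lifecycle model and all probabilities below are illustrative, not the trained model.

```python
import math

def sequence_likelihood(obs, start, trans, emit):
    # Forward algorithm: marginalize over all hidden lifecycle
    # paths X to obtain P(Y) = sum_X P(Y|X) P(X).
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for y in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][y]
                 for s in range(n)]
    return sum(alpha)

def threat_score(obs, start, trans, emit):
    # exp((1/l) * log P(Y)): the per-step geometric mean of the
    # sequence likelihood, making scores comparable across lengths.
    return math.exp(math.log(sequence_likelihood(obs, start, trans, emit))
                    / len(obs))

# Toy 2-stage lifecycle (0 = early access, 1 = later stages) emitting
# 3 observable technique symbols; all probabilities are made up.
start = [0.9, 0.1]
trans = [[0.6, 0.4], [0.1, 0.9]]
emit = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
print(threat_score([0, 1, 2], start, trans, emit))
```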
HMM Results
28
Ensemble Detector Key Takeaways
• Multicore context is a challenge
• Detection rates decrease and do not scale well
• Must consider the worst-case between benign and malicious state
comparisons
• ML formulation is crucial
• Performance heavily relies on appropriate formulation of detection
problem
• Improper formulations have catastrophically failed in our experiments
• Basic models can outperform complex ones in the right formulation
29
