Presentation @ RECOSOC2019. Proposal of a reconfigurable Antivirus implemented on FPGA. The solution captures performance counters data via shared memory bus and classifies it as outlier or not using machine learning classifiers, such as SVM, Random Forest and MLP.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
The AV says: Your Hardware Definitions were Updated!
1. Introduction Malware REHAB Evaluation Final Remarks
The AV says: Your hardware definitions
were updated!
Marcus Botacin1, Lucas Galante2, Fabr´ıcio Ceschin1, Paulo
Santos3, Luigi Carro3, Paulo L´ıcio de Geus2, Andr´e Gr´egio1,
Marco Zanata1
1Federal University of Parana (UFPR-BR)
{mfbotacin, fjoceschin, gregio, mazalves}@inf.ufpr.br
2University of Campinas (UNICAMP-BR)
{galante, paulo}@lasca.ic.unicamp.br
3Federal University of Rio Grande do Sul (UFRGS-BR)
{pcssjunior, carro}@inf.ufrgs.br
The AV says: Your hardware definitions were updated! 1 / 35 ReCoSoC’19
2. Introduction Malware REHAB Evaluation Final Remarks
Who Am I?
Background
Computer Engineer (University of Campinas–Brazil).
Computer Science Master (University of Campinas–Brazil).
Computer Science PhD Candidate (Federal University of
Paran´a–Brazil).
Malware Analyst (Since 2012).
Research Interests
Malware Analysis & Detection.
Hardware-Assisted Security.
Security Co-Processors.
The AV says: Your hardware definitions were updated! 2 / 35 ReCoSoC’19
3. Introduction Malware REHAB Evaluation Final Remarks
The Problem
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 3 / 35 ReCoSoC’19
4. Introduction Malware REHAB Evaluation Final Remarks
The Problem
The Problem
Malware
Self-Modifying, Obfuscated Code.
Constantly evolving over time.
AntiViruses (AVs)
Runtime Monitoring.
Performance-Intensive Applications.
Require periodic updates.
The AV says: Your hardware definitions were updated! 4 / 35 ReCoSoC’19
5. Introduction Malware REHAB Evaluation Final Remarks
The Problem
The Challenges
An Alternative for AVs
Move AV to hardware.
No monitoring overhead.
Hardware AV drawbacks
Identify Low-Level Features.
Deploy Hardware Updates.
Insight
Use Reconfigurable Hardware.
The AV says: Your hardware definitions were updated! 5 / 35 ReCoSoC’19
6. Introduction Malware REHAB Evaluation Final Remarks
The Problem
Insight Support
Past
Hardware AVs with no update support.
Targeting embedded systems only.
Future
Source: http://tinyurl.com/y2yabdrp
The AV says: Your hardware definitions were updated! 6 / 35 ReCoSoC’19
7. Introduction Malware REHAB Evaluation Final Remarks
Background
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 7 / 35 ReCoSoC’19
8. Introduction Malware REHAB Evaluation Final Remarks
Background
Background
AV Operation Modes
Signature-Based.
Behavior-Based.
Profiling-Based.
Machine Learning Classifiers
Support Vector Machines (SVM).
Random Forest (RF).
Multi-Layer Perceptron (MLP).
The AV says: Your hardware definitions were updated! 8 / 35 ReCoSoC’19
9. Introduction Malware REHAB Evaluation Final Remarks
Background
Profiling-Based AV
1k
10k
100k
1M
10M
0 10 20 30 40 50 60
Classifier Separation
Branch−misses(#)
Elapsed time (s)
Perf stat branch−miss:
Branch−miss stat comparison between goodware and malware during 60 seconds.
bluetooth−wizard
dpkg−log−summary
3f5b5...f0
3c55...50
Figure: Malware Classification using low level features.
The AV says: Your hardware definitions were updated! 9 / 35 ReCoSoC’19
10. Introduction Malware REHAB Evaluation Final Remarks
Background
SVM Classifier
b
b
b
O
u
t
p
u
t
X2
X3
XI
n
p
u
t
X1
W
X
X
+
+
+
B
W3
W2
W1
B1
Figure: SVM. A hyperplane with maximum separation between two
classes is created and used to predict samples (left). The circuit
implemented (right) multiplies the input (xi ) by the learned parameters
(wi ) and adds b.
The AV says: Your hardware definitions were updated! 10 / 35 ReCoSoC’19
11. Introduction Malware REHAB Evaluation Final Remarks
Background
RF Classifier
I
n
p
u
t
W
<
<
0
O
u
t
p
u
t
X1
X2
W1
W2
Figure: Random Forest. A single decision tree (from the ensemble) is
shown (left) with the corresponding circuit of two nodes (right), with
comparators and a MUX deciding the decision path.
The AV says: Your hardware definitions were updated! 11 / 35 ReCoSoC’19
12. Introduction Malware REHAB Evaluation Final Remarks
Background
MLP Classifier
Hidden
Layer
Output
Input
N
e
u
r
o
n
O
u
t
p
u
t
X2
X3
XI
n
p
u
t
X1
W
X
X
+
+
+
B
W3
W2
W1
B1
𝜑
Figure: MLP. A feed-forward neural network composed by multiple
neurons (left), which circuit implementation (right) is similar to SVM,
but using an activation function to calculate the output.
The AV says: Your hardware definitions were updated! 12 / 35 ReCoSoC’19
13. Introduction Malware REHAB Evaluation Final Remarks
Architecture
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 13 / 35 ReCoSoC’19
14. Introduction Malware REHAB Evaluation Final Remarks
Architecture
REconfigurable, Hardware-Assisted Blocking (REHAB) of
Malware
FPGA-modelled.
Mechanism notifying software-AV via detection interrupts.
Runtime monitoring with no performance overhead.
System-wide or Per-Process monitoring modes.
Collects HPC data via memory-mapped registers.
Profiling-based detection.
Implement only the matching step of ML classifiers.
Updatable via software (AV compatible).
Updates the entire classifier and not only its weights.
The AV says: Your hardware definitions were updated! 14 / 35 ReCoSoC’19
15. Introduction Malware REHAB Evaluation Final Remarks
Architecture
REHAB Architecture
Figure: REHAB Architecture. CPU’s HPC data is used as feature for a
FPGA-based, reconfigurable ML classifier updatable via software.
The AV says: Your hardware definitions were updated! 15 / 35 ReCoSoC’19
16. Introduction Malware REHAB Evaluation Final Remarks
Architecture
REHAB Implementation
RAM_CLOCK
RAM_WR
RAM_ADDR[6..0]
RAM_DATA[4..0]
RAM_DATA_OUT[4..0]
7’h06..
5’h00..
Single_port_RAM_VHDL:uut
PRE
ENA
D Q
Support4[4]
D
PRE
ENA
Q
Support4[2]
DATAA
DATAB
OUT0
Y~[63..0]
MUX21
Y~[63..0]
PRE
D
ENA
CLRN
Q
<
temp
PRE
D
ENA
CLRN
Q Detection
LessThan1
LESS_THAN
B[63..0]
A[63..0]
63’ h00..
1’ h00..
CLRN
CLRN
SEL
Figure: Excerpt of a ML classifier implemented in FPGA. ML
parameters are loaded from an external memory at startup and can be
updated by software writes to the external RAM memory.
The AV says: Your hardware definitions were updated! 16 / 35 ReCoSoC’19
17. Introduction Malware REHAB Evaluation Final Remarks
Goals
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 17 / 35 ReCoSoC’19
18. Introduction Malware REHAB Evaluation Final Remarks
Goals
We demonstrate:
The need of a hardware accelerator.
The need of reconfigurable hardware.
Hardware requirements.
The AV says: Your hardware definitions were updated! 18 / 35 ReCoSoC’19
19. Introduction Malware REHAB Evaluation Final Remarks
Hardware Accelerator
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 19 / 35 ReCoSoC’19
20. Introduction Malware REHAB Evaluation Final Remarks
Hardware Accelerator
AV Checks Cost
Table: Execution Speedup per AV check. Hardware Accelerator is
essential for overhead elimination.
ML algorithm → SVM RF MLP
CPU 220µs 270µs 240µs
FPGA+Comm 124.5ns 111.2ns 158.9ns
Speedup 1.7k× 2.4k× 1.5k×
The AV says: Your hardware definitions were updated! 20 / 35 ReCoSoC’19
21. Introduction Malware REHAB Evaluation Final Remarks
Reconfigurable Hardware
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 21 / 35 ReCoSoC’19
22. Introduction Malware REHAB Evaluation Final Remarks
Reconfigurable Hardware
SVM
Table: SVM Classifier. 1000 iterations in a linear kernel results in the
best accuracy for the VirusTotal dataset.
Kernel/Iter (#) 1000 10000 100000
Poly 0.2960 0.2960 0.2960
Linear 0.8256 0.7952 0.8088
rbf 0.4793 0.4793 0.4793
Table: SVM Classifier. 1000 iterations in a linear kernel results in the
best accuracy for the VirusShare dataset.
Kernel/Iter (#) 1000 10000 100000
Poly 0.3644 0.4234 0.4234
Linear 0.7705 0.7353 0.7266
rbf 0.5001 0.4759 0.4759
The AV says: Your hardware definitions were updated! 22 / 35 ReCoSoC’19
23. Introduction Malware REHAB Evaluation Final Remarks
Reconfigurable Hardware
MLP
Table: MLP Classifier. Alpha as 100 with adam solver results in the best
accuracy for the VirusTotal dataset.
Solver/Alpha (#) 0.01 1 100 1000
sgd 0.4997 0.4997 0.4997 0.5003
adam 0.7098 0.7218 0.7433 0.7213
lbfgs 0.4997 0.4997 0.4997 0.4997
Table: MLP Classifier. Alpha as 1 with adam results in the best accuracy
for the VirusShare dataset.
Solver/Alpha (#) 0.01 1 100 1000
sgd 0.4999 0.4999 0.4929 0.4999
adam 0.7288 0.7614 0.6951 0.7067
lbfgs 0.4999 0.4999 0.4999 0.4997
The AV says: Your hardware definitions were updated! 23 / 35 ReCoSoC’19
24. Introduction Malware REHAB Evaluation Final Remarks
Reconfigurable Hardware
RF
Table: RF Classifier. 16 estimators and a max depth of 64 results in the
best accuracy for the VirusTotal dataset.
Depth/Est (#) 8 16 32 64 128
4 0.9240 0.9172 0.9178 0.9214 0.9199
8 0.9366 0.9366 0.9398 0.9434 0.9403
16 0.9377 0.9455 0.9408 0.9445 0.9429
32 0.9350 0.9439 0.9403 0.9460 0.9445
64 0.9392 0.9466 0.9445 0.9434 0.9445
Table: RF Classifier. 16 estimators and a max depth of 16 results in the
best accuracy for the VirusShare dataset.
Depth/Est (#) 8 16 32 64 128
4 0.9564 0.9569 0.9577 0.9601 0.958
8 0.9644 0.9642 0.9653 0.9644 0.9661
16 0.9626 0.9671 0.9655 0.9639 0.9671
32 0.9644 0.9642 0.965 0.9644 0.9661
64 0.962 0.965 0.9442 0.9653 0.9647
The AV says: Your hardware definitions were updated! 24 / 35 ReCoSoC’19
25. Introduction Malware REHAB Evaluation Final Remarks
Reconfigurable Hardware
Concept Drift
Table: Classifier’s concept drift. Whereas the MLP classifier best
scored in the VirusTotal dataset, the RandomForest classifier was the
best choice for the VirusShare dataset, thus showing the need of having
reconfigurable AV mechanisms.
Classifier/Dataset VirusShare VirusTotal
Random Forest 0.9144 0.6953
MLP 0.881 0.9738
SVM 0.9079 0.5728
The AV says: Your hardware definitions were updated! 25 / 35 ReCoSoC’19
26. Introduction Malware REHAB Evaluation Final Remarks
Hardware Requirements
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 26 / 35 ReCoSoC’19
27. Introduction Malware REHAB Evaluation Final Remarks
Hardware Requirements
Table: SVM Implementations.
Classifier Work LUTs/REGs/MULs/DSPs
This 520/196/5/20
SVM [NF16] 832/–/–/–
[MSKT] 748/–/–/–
Table: RF Implementations.
Classifier Work LUTs/REGs/MULs/DSPs
This 707/40/0-7.5K/240/0/0
RF [FBL09] 4k-24K/–/–/–
[NJSS17] 600-118K/–/–/–
Table: MLP Implementations.
Classifier Work LUTs/REGs/MULs/DSPs
This 170/89/5-11K/690/502/38
MLP [FBL09] 6.7K/5K/–/–
[eMEH14] 26.8K/4K/–/–
The AV says: Your hardware definitions were updated! 27 / 35 ReCoSoC’19
28. Introduction Malware REHAB Evaluation Final Remarks
Limitations
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 28 / 35 ReCoSoC’19
29. Introduction Malware REHAB Evaluation Final Remarks
Limitations
Limitations & Future Work
Limitations
Proof-of-Concept (PoC) for future developments.
Single Classifier.
Current AV detection drawbacks.
Future Work
Parallel Classifiers.
In-Place Learning.
Feedback Information for AV companies.
The AV says: Your hardware definitions were updated! 29 / 35 ReCoSoC’19
30. Introduction Malware REHAB Evaluation Final Remarks
Conclusion
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 30 / 35 ReCoSoC’19
31. Introduction Malware REHAB Evaluation Final Remarks
Conclusion
Conclusion
Malware are very dynamic pieces of code.
Malware classifiers present concept drift.
Antivirus are performance-intensive applications.
Reconfigurable Hardware as a promising alternative for
efficient and effective malware detection.
The AV says: Your hardware definitions were updated! 31 / 35 ReCoSoC’19
32. Introduction Malware REHAB Evaluation Final Remarks
Questions?
Topics
1 Introduction
The Problem
Background
2 Malware REHAB
Architecture
3 Evaluation
Goals
Hardware Accelerator
Reconfigurable Hardware
Hardware Requirements
4 Final Remarks
Limitations
Conclusion
Questions?
The AV says: Your hardware definitions were updated! 32 / 35 ReCoSoC’19
33. Introduction Malware REHAB Evaluation Final Remarks
Questions?
Contact
mfbotacin@inf.ufpr.br
The AV says: Your hardware definitions were updated! 33 / 35 ReCoSoC’19
34. Introduction Malware REHAB Evaluation Final Remarks
Questions?
References
Sami el Moukhlis, Abdessamad Elrharras, and Abdellatif Hamdoun, Fpga
implementation of artificial neural networks, 2014.
Antonyus Ferreira, Edna Barros, and Teresa Ludermir, Fpga design of a
mlp artificial neural network architecture,
http://www.lbd.dcc.ufmg.br/colecoes/sforum/2009/0046.pdf,
2009.
D. Mahmoodi, A. Soleimani, H. Khosravi, and M. Taghizadeh, Fpga
simulation of linear and nonlinear support vector machine.
Daniel H Noronha and Marcelo Fernandes, Implementac¸˜ao em fpga de
m´aquina de vetores de suporte (svm) para classificac¸˜ao e regress˜ao,
http://www.lbd.dcc.ufmg.br/colecoes/eniac/2016/004.pdf, 2016.
H. Nakahara, A. Jinguji, S. Sato, and T. Sasao, A random forest using a
multi-valued decision diagram on an fpga, ISMVL, 2017.
The AV says: Your hardware definitions were updated! 34 / 35 ReCoSoC’19