SlideShare a Scribd company logo
Motivation Challenges & Alternatives Experiments Concluding Remarks
Malware Variants Identification in Practice
Marcus Botacin1, Andr´e Gr´egio1, Paulo L´ıcio de Geus2
1Federal University of Paran´a (UFPR) – {mfbotacin, gregio}@inf.ufpr.br
2University of Campinas (Unicamp) – paulo@lasca.ic.unicamp.br
SBSEG 2019
Malware Variants Identification in Practice 1 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Agenda
1 Motivation
Motivation
2 Challenges & Alternatives
Challenge 1
Challenge 2
3 Experiments
Evaluation
4 Concluding Remarks
Limitations
Conclusions
Malware Variants Identification in Practice 2 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Motivation
Agenda
1 Motivation
Motivation
2 Challenges & Alternatives
Challenge 1
Challenge 2
3 Experiments
Evaluation
4 Concluding Remarks
Limitations
Conclusions
Malware Variants Identification in Practice 3 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Motivation
Malware at Scale!
Figure: Source: https://www.infosecurity-magazine.com/news/
360k-new-malware-samples-every-day/
Figure: Source: https://money.cnn.com/2015/04/14/technology/
security/cyber-attack-hacks-security/index.html
Malware Variants Identification in Practice 4 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Motivation
Current Approaches.
Figure: Function-based, Graph Modeling.
Malware Variants Identification in Practice 5 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 1
Agenda
1 Motivation
Motivation
2 Challenges & Alternatives
Challenge 1
Challenge 2
3 Experiments
Evaluation
4 Concluding Remarks
Limitations
Conclusions
Malware Variants Identification in Practice 6 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 1
Same-Behavior Function Replacement
CreateProcess
CreateFile
RegCreateKey
NtSuspendThread
Figure: Original
sample’s CG.
CreateThread
CreateFileTransacted
RegLoadKey
RtlWow64SuspendThread
Figure: Variant sample’s
CG.
Modularity
File System Change
Registry Change
Application Interference
Figure: Behavioral
graph from both
samples.
Malware Variants Identification in Practice 7 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 1
Behavioral Classification
Compression
Cryptography.
Debug.
Delay.
Environment.
Escalation.
Exfiltration.
Fingerprint.
File System.
Interference.
Internet.
Modularity.
Monitoring.
Registry.
Evidence Removal.
Side Effects.
System Changes.
Target Information.
Timing Attacks.
Malware Variants Identification in Practice 8 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 1
Behavior-based Graph.
Figure: Behavior-based graph for a given sample.
Malware Variants Identification in Practice 9 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 2
Agenda
1 Motivation
Motivation
2 Challenges & Alternatives
Challenge 1
Challenge 2
3 Experiments
Evaluation
4 Concluding Remarks
Limitations
Conclusions
Malware Variants Identification in Practice 10 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 2
Malware Embedding
Figure: Sample 1. The original sample.
Figure: Sample 2. Variant sample embedding the original one.
Malware Variants Identification in Practice 11 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 2
Matching Metrics
Definition
The similarity of two malware, represented as sets, A and B, of
vertices or edges of two graphs, is defined as:
Sim(A, B) =
|A ∩ B|
|A ∪ B|
(1)
Definition
The similarity of two malware, represented as sets, A and B, of
vertices or edges of two graphs, is defined as:
Sim(A, B) = max
|A ∩ B|
|B|
,
|B ∩ A|
|A|
(2)
Malware Variants Identification in Practice 12 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Challenge 2
Continence Results
Table: Continence of
Sample 1 in Sample
2.
CG A B C
I 0.66 0.52 0.64
J 0.75 0.49 0.50
K 0.42 0.80 0.44
Table: Continence of
Sample 2 in Sample
1.
CG A B C
I 0.57 0.56 0.43
J 0.33 0.51 0.44
K 0.76 0.65 0.44
Table: Maximum
continence of Sample
1 and Sample 2.
CG A B C
I 0.66 0.56 0.64
J 0.75 0.51 0.50
K 0.76 0.80 0.44
Malware Variants Identification in Practice 13 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
Agenda
1 Motivation
Motivation
2 Challenges & Alternatives
Challenge 1
Challenge 2
3 Experiments
Evaluation
4 Concluding Remarks
Limitations
Conclusions
Malware Variants Identification in Practice 14 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
The Whitelisting Effect.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
Mimail.a x Mimail.c Mimail.c x Mimail.e Mimail.f x Mimail.g
Evaluating the whitelist effect.
Without whitelist
With whitelist
Figure: Evaluating whitelisting effect. Similarity scores are higher when
using the whitelist-based approach.
Malware Variants Identification in Practice 15 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
Advantages of the Behavioral model.
0
0.1
0.2
0.3
0.4
0.5
Mimail.a x Klez.a Mimail.c x Klez.a Mimail.k x Klez.g
Comparing function−based to the behavior−based approaches.
Function−based
Behavior−based
Figure: Function vs. Behavior-based approaches. Scores are higher when
considering behavioral patterns.
Malware Variants Identification in Practice 16 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
Evaluating Metrics.
0
0.2
0.4
0.6
0.8
1
Mimail.a x Mimail.c Mimail.c x Mimail.e Mimail.f x Mimail.g
Evaluating the proposed metric.
Usual metric
Proposed metric
Figure: Proposed metric. Scores are higher when using it in comparison
to the usual one.
Malware Variants Identification in Practice 17 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
Solutions Comparison.
0.0%
20.0%
40.0%
60.0%
80.0%
100.0%
120.0%
H x Q I x Q J x Q L x Q M x Q
Mimail’s similarity.
Solution #1
Solution #2
Our solution
Figure: Mimail’s sample similarity. Our solution’s scores are higher when
compared to other ones.
Malware Variants Identification in Practice 18 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
Domain Transformation and Similarity Measures.
0
10
20
30
40
50
60
70
80
90
100
50% 60% 70% 80% 90%
Threshold evaluation
F.Mimail
F.Klez
F.Cross
B.Mimail
B.Klez
B.Cross
Figure: Threshold evaluation. This should be higher than 80% in order to
proper label the cross dataset.
Malware Variants Identification in Practice 19 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
Real World Experiments (1/2)
Table: Identified variants among unknown, wild-collected samples.
Family Sample Hash Label
1
A c2ef1aabb15c979e932f5ea1d214cbeb Generic vb.OBY
B 747b9fe5819a76529abc161bb449b8eb Generic vb.OBO
C 39a04a11234d931bfa60d68ba8df9021 Generic vb.OBL
2
A 96d13246971e4368b9ed90c6f996a884 Atros4.CENI
B e23588078ba6a5f5ca1c961a8336ec08 Atros4.CENI
C 31a2b6adc781328cb1d77e5debb318ff Atros4.CENI
Malware Variants Identification in Practice 20 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Evaluation
Real World Experiments (2/2)
10.0%
20.0%
30.0%
40.0%
50.0%
60.0%
70.0%
80.0%
90.0%
100.0%
1.a x 1.b 1.a x 1.c 1.b x 1.c 2.a x 2.b 2.a x 2.c 2.b x 2.c
Study case − Identified variants
ssdeep
Our solution
Coverage
Figure: Study case: variant identification. Our approach outperforms
others even on low coverage scenarios.
Malware Variants Identification in Practice 21 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Limitations
Agenda
1 Motivation
Motivation
2 Challenges & Alternatives
Challenge 1
Challenge 2
3 Experiments
Evaluation
4 Concluding Remarks
Limitations
Conclusions
Malware Variants Identification in Practice 22 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Limitations
Matching Complex Behaviors is Challenging!
Figure: DLL injection functions among other function calls.
Figure: Proposed DLL injection class.
Malware Variants Identification in Practice 23 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Conclusions
Agenda
1 Motivation
Motivation
2 Challenges & Alternatives
Challenge 1
Challenge 2
3 Experiments
Evaluation
4 Concluding Remarks
Limitations
Conclusions
Malware Variants Identification in Practice 24 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Conclusions
Conclusions
Challenges & Lessons
Anti-disassembly breaks CG extraction.
Transparent, dynamic tracing is a viable alternative.
Same-Function Replacement breaks malware clustering.
Behavior-based clustering is a viable alternative.
Dead code breaks similarity metrics.
Continence metric is a viable alternative.
Malware Variants Identification in Practice 25 / 26 SBSeg’19
Motivation Challenges & Alternatives Experiments Concluding Remarks
Conclusions
Questions & Comments.
Contact
mfbotacin@inf.ufpr.br
Additional Material
https://github.com/marcusbotacin/Malware.Variants
Malware Variants Identification in Practice 26 / 26 SBSeg’19

More Related Content

Similar to Malware Variants Identification in Practice

How do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideHow do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideMarcus Botacin
 
Đề thi mẫu 1(ISTQB)
Đề thi mẫu 1(ISTQB)Đề thi mẫu 1(ISTQB)
Đề thi mẫu 1(ISTQB)Jenny Nguyen
 
Question ISTQB foundation 1
Question ISTQB foundation 1Question ISTQB foundation 1
Question ISTQB foundation 1Jenny Nguyen
 
General Factor Factorial Design
General Factor Factorial DesignGeneral Factor Factorial Design
General Factor Factorial DesignNoraziah Ismail
 
5.Black Box Testing and Levels of Testing.ppt
5.Black Box Testing and Levels of Testing.ppt5.Black Box Testing and Levels of Testing.ppt
5.Black Box Testing and Levels of Testing.pptSyedAhmad732853
 
Testing Assumptions in repeated Measures Design using SPSS
Testing Assumptions in repeated Measures Design using SPSSTesting Assumptions in repeated Measures Design using SPSS
Testing Assumptions in repeated Measures Design using SPSSJ P Verma
 
Six Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementSix Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementIJERA Editor
 
LNCS 5050 - Bilevel Optimization and Machine Learning
LNCS 5050 - Bilevel Optimization and Machine LearningLNCS 5050 - Bilevel Optimization and Machine Learning
LNCS 5050 - Bilevel Optimization and Machine Learningbutest
 
Paper multi-modal biometric system using fingerprint , face and speech
Paper   multi-modal biometric system using fingerprint , face and speechPaper   multi-modal biometric system using fingerprint , face and speech
Paper multi-modal biometric system using fingerprint , face and speechAalaa Khattab
 
German credit score shivaram prakash
German credit score shivaram prakashGerman credit score shivaram prakash
German credit score shivaram prakashShivaram Prakash
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareAndrey Karpov
 
Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...
Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...
Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...CS, NcState
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxwilcockiris
 
Model validation strategies ftc 2018
Model validation strategies ftc 2018Model validation strategies ftc 2018
Model validation strategies ftc 2018Philip Ramsey
 
Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...
Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...
Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...Mirko Marras
 
Computer simulation technique the definitive introduction - harry perros
Computer simulation technique   the definitive introduction - harry perrosComputer simulation technique   the definitive introduction - harry perros
Computer simulation technique the definitive introduction - harry perrosJesmin Rahaman
 
Six Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementSix Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementRSIS International
 
Stochastic Modeling - Model Risk - Sampling Error - Scenario Reduction
Stochastic Modeling - Model Risk - Sampling Error - Scenario ReductionStochastic Modeling - Model Risk - Sampling Error - Scenario Reduction
Stochastic Modeling - Model Risk - Sampling Error - Scenario ReductionRon Harasym
 

Similar to Malware Variants Identification in Practice (20)

How do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideHow do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guide
 
Đề thi mẫu 1(ISTQB)
Đề thi mẫu 1(ISTQB)Đề thi mẫu 1(ISTQB)
Đề thi mẫu 1(ISTQB)
 
Question ISTQB foundation 1
Question ISTQB foundation 1Question ISTQB foundation 1
Question ISTQB foundation 1
 
General Factor Factorial Design
General Factor Factorial DesignGeneral Factor Factorial Design
General Factor Factorial Design
 
5.Black Box Testing and Levels of Testing.ppt
5.Black Box Testing and Levels of Testing.ppt5.Black Box Testing and Levels of Testing.ppt
5.Black Box Testing and Levels of Testing.ppt
 
Testing Assumptions in repeated Measures Design using SPSS
Testing Assumptions in repeated Measures Design using SPSSTesting Assumptions in repeated Measures Design using SPSS
Testing Assumptions in repeated Measures Design using SPSS
 
Six Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementSix Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality Management
 
LNCS 5050 - Bilevel Optimization and Machine Learning
LNCS 5050 - Bilevel Optimization and Machine LearningLNCS 5050 - Bilevel Optimization and Machine Learning
LNCS 5050 - Bilevel Optimization and Machine Learning
 
Paper multi-modal biometric system using fingerprint , face and speech
Paper   multi-modal biometric system using fingerprint , face and speechPaper   multi-modal biometric system using fingerprint , face and speech
Paper multi-modal biometric system using fingerprint , face and speech
 
German credit score shivaram prakash
German credit score shivaram prakashGerman credit score shivaram prakash
German credit score shivaram prakash
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
 
Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...
Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...
Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...
 
Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...
Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...
Promise 2011: "Local Bias and its Impacts on the Performance of Parametric Es...
 
Biostatistics in Bioequivalence
Biostatistics in BioequivalenceBiostatistics in Bioequivalence
Biostatistics in Bioequivalence
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
 
Model validation strategies ftc 2018
Model validation strategies ftc 2018Model validation strategies ftc 2018
Model validation strategies ftc 2018
 
Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...
Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...
Tutorial on Operationalizing Treatments against Bias: Challenges and Solution...
 
Computer simulation technique the definitive introduction - harry perros
Computer simulation technique   the definitive introduction - harry perrosComputer simulation technique   the definitive introduction - harry perros
Computer simulation technique the definitive introduction - harry perros
 
Six Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementSix Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality Management
 
Stochastic Modeling - Model Risk - Sampling Error - Scenario Reduction
Stochastic Modeling - Model Risk - Sampling Error - Scenario ReductionStochastic Modeling - Model Risk - Sampling Error - Scenario Reduction
Stochastic Modeling - Model Risk - Sampling Error - Scenario Reduction
 

More from Marcus Botacin

Machine Learning by Examples - Marcus Botacin - TAMU 2024
Machine Learning by Examples - Marcus Botacin - TAMU 2024Machine Learning by Examples - Marcus Botacin - TAMU 2024
Machine Learning by Examples - Marcus Botacin - TAMU 2024Marcus Botacin
 
Near-memory & In-Memory Detection of Fileless Malware
Near-memory & In-Memory Detection of Fileless MalwareNear-memory & In-Memory Detection of Fileless Malware
Near-memory & In-Memory Detection of Fileless MalwareMarcus Botacin
 
GPThreats-3: Is Automated Malware Generation a Threat?
GPThreats-3: Is Automated Malware Generation a Threat?GPThreats-3: Is Automated Malware Generation a Threat?
GPThreats-3: Is Automated Malware Generation a Threat?Marcus Botacin
 
[HackInTheBOx] All You Always Wanted to Know About Antiviruses
[HackInTheBOx] All You Always Wanted to Know About Antiviruses[HackInTheBOx] All You Always Wanted to Know About Antiviruses
[HackInTheBOx] All You Always Wanted to Know About AntivirusesMarcus Botacin
 
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change![Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!Marcus Botacin
 
Hardware-accelerated security monitoring
Hardware-accelerated security monitoringHardware-accelerated security monitoring
Hardware-accelerated security monitoringMarcus Botacin
 
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022Marcus Botacin
 
Extraindo Caracterı́sticas de Arquivos Binários Executáveis
Extraindo Caracterı́sticas de Arquivos Binários ExecutáveisExtraindo Caracterı́sticas de Arquivos Binários Executáveis
Extraindo Caracterı́sticas de Arquivos Binários ExecutáveisMarcus Botacin
 
On the Malware Detection Problem: Challenges & Novel Approaches
On the Malware Detection Problem: Challenges & Novel ApproachesOn the Malware Detection Problem: Challenges & Novel Approaches
On the Malware Detection Problem: Challenges & Novel ApproachesMarcus Botacin
 
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...Marcus Botacin
 
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...Marcus Botacin
 
Integridade, confidencialidade, disponibilidade, ransomware
Integridade, confidencialidade, disponibilidade, ransomwareIntegridade, confidencialidade, disponibilidade, ransomware
Integridade, confidencialidade, disponibilidade, ransomwareMarcus Botacin
 
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...Marcus Botacin
 
On the Security of Application Installers & Online Software Repositories
On the Security of Application Installers & Online Software RepositoriesOn the Security of Application Installers & Online Software Repositories
On the Security of Application Installers & Online Software RepositoriesMarcus Botacin
 
The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...
The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...
The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...Marcus Botacin
 
Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?
Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?
Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?Marcus Botacin
 
Towards Malware Decompilation and Reassembly
Towards Malware Decompilation and ReassemblyTowards Malware Decompilation and Reassembly
Towards Malware Decompilation and ReassemblyMarcus Botacin
 
Reverse Engineering Course
Reverse Engineering CourseReverse Engineering Course
Reverse Engineering CourseMarcus Botacin
 
Machine Learning for Malware Detection: Beyond Accuracy Rates
Machine Learning for Malware Detection: Beyond Accuracy RatesMachine Learning for Malware Detection: Beyond Accuracy Rates
Machine Learning for Malware Detection: Beyond Accuracy RatesMarcus Botacin
 

More from Marcus Botacin (20)

Machine Learning by Examples - Marcus Botacin - TAMU 2024
Machine Learning by Examples - Marcus Botacin - TAMU 2024Machine Learning by Examples - Marcus Botacin - TAMU 2024
Machine Learning by Examples - Marcus Botacin - TAMU 2024
 
Near-memory & In-Memory Detection of Fileless Malware
Near-memory & In-Memory Detection of Fileless MalwareNear-memory & In-Memory Detection of Fileless Malware
Near-memory & In-Memory Detection of Fileless Malware
 
GPThreats-3: Is Automated Malware Generation a Threat?
GPThreats-3: Is Automated Malware Generation a Threat?GPThreats-3: Is Automated Malware Generation a Threat?
GPThreats-3: Is Automated Malware Generation a Threat?
 
[HackInTheBOx] All You Always Wanted to Know About Antiviruses
[HackInTheBOx] All You Always Wanted to Know About Antiviruses[HackInTheBOx] All You Always Wanted to Know About Antiviruses
[HackInTheBOx] All You Always Wanted to Know About Antiviruses
 
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change![Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!
 
Hardware-accelerated security monitoring
Hardware-accelerated security monitoringHardware-accelerated security monitoring
Hardware-accelerated security monitoring
 
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
 
Extraindo Caracterı́sticas de Arquivos Binários Executáveis
Extraindo Caracterı́sticas de Arquivos Binários ExecutáveisExtraindo Caracterı́sticas de Arquivos Binários Executáveis
Extraindo Caracterı́sticas de Arquivos Binários Executáveis
 
On the Malware Detection Problem: Challenges & Novel Approaches
On the Malware Detection Problem: Challenges & Novel ApproachesOn the Malware Detection Problem: Challenges & Novel Approaches
On the Malware Detection Problem: Challenges & Novel Approaches
 
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...
 
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...
 
Integridade, confidencialidade, disponibilidade, ransomware
Integridade, confidencialidade, disponibilidade, ransomwareIntegridade, confidencialidade, disponibilidade, ransomware
Integridade, confidencialidade, disponibilidade, ransomware
 
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...
 
On the Security of Application Installers & Online Software Repositories
On the Security of Application Installers & Online Software RepositoriesOn the Security of Application Installers & Online Software Repositories
On the Security of Application Installers & Online Software Repositories
 
UMLsec
UMLsecUMLsec
UMLsec
 
The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...
The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...
The Internet Banking [in]Security Spiral: Past, Present, and Future of Online...
 
Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?
Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?
Análise do Malware Ativo na Internet Brasileira: 4 anos depois. O que mudou?
 
Towards Malware Decompilation and Reassembly
Towards Malware Decompilation and ReassemblyTowards Malware Decompilation and Reassembly
Towards Malware Decompilation and Reassembly
 
Reverse Engineering Course
Reverse Engineering CourseReverse Engineering Course
Reverse Engineering Course
 
Machine Learning for Malware Detection: Beyond Accuracy Rates
Machine Learning for Malware Detection: Beyond Accuracy RatesMachine Learning for Malware Detection: Beyond Accuracy Rates
Machine Learning for Malware Detection: Beyond Accuracy Rates
 

Recently uploaded

Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Alison B. Lowndes
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityScyllaDB
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backElena Simperl
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
 
Introduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationIntroduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationZilliz
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsPaul Groth
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesThousandEyes
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Product School
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀DianaGray10
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...CzechDreamin
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...Product School
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...Product School
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...CzechDreamin
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutesconfluent
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2DianaGray10
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...Sri Ambati
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaRTTS
 

Recently uploaded (20)

Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Introduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationIntroduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG Evaluation
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 

Malware Variants Identification in Practice

  • 1. Motivation Challenges & Alternatives Experiments Concluding Remarks Malware Variants Identification in Practice Marcus Botacin1, Andr´e Gr´egio1, Paulo L´ıcio de Geus2 1Federal University of Paran´a (UFPR) – {mfbotacin, gregio}@inf.ufpr.br 2University of Campinas (Unicamp) – paulo@lasca.ic.unicamp.br SBSEG 2019 Malware Variants Identification in Practice 1 / 26 SBSeg’19
  • 2. Motivation Challenges & Alternatives Experiments Concluding Remarks Agenda 1 Motivation Motivation 2 Challenges & Alternatives Challenge 1 Challenge 2 3 Experiments Evaluation 4 Concluding Remarks Limitations Conclusions Malware Variants Identification in Practice 2 / 26 SBSeg’19
  • 3. Motivation Challenges & Alternatives Experiments Concluding Remarks Motivation Agenda 1 Motivation Motivation 2 Challenges & Alternatives Challenge 1 Challenge 2 3 Experiments Evaluation 4 Concluding Remarks Limitations Conclusions Malware Variants Identification in Practice 3 / 26 SBSeg’19
  • 4. Motivation Challenges & Alternatives Experiments Concluding Remarks Motivation Malware at Scale! Figure: Source: https://www.infosecurity-magazine.com/news/ 360k-new-malware-samples-every-day/ Figure: Source: https://money.cnn.com/2015/04/14/technology/ security/cyber-attack-hacks-security/index.html Malware Variants Identification in Practice 4 / 26 SBSeg’19
  • 5. Motivation Challenges & Alternatives Experiments Concluding Remarks Motivation Current Approaches. Figure: Function-based, Graph Modeling. Malware Variants Identification in Practice 5 / 26 SBSeg’19
  • 6. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 1 Agenda 1 Motivation Motivation 2 Challenges & Alternatives Challenge 1 Challenge 2 3 Experiments Evaluation 4 Concluding Remarks Limitations Conclusions Malware Variants Identification in Practice 6 / 26 SBSeg’19
  • 7. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 1 Same-Behavior Function Replacement CreateProcess CreateFile RegCreateKey NtSuspendThread Figure: Original sample’s CG. CreateThread CreateFileTransacted RegLoadKey RtlWow64SuspendThread Figure: Variant sample’s CG. Modularity File System Change Registry Change Application Interference Figure: Behavioral graph from both samples. Malware Variants Identification in Practice 7 / 26 SBSeg’19
  • 8. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 1 Behavioral Classification Compression Cryptography. Debug. Delay. Environment. Escalation. Exfiltration. Fingerprint. File System. Interference. Internet. Modularity. Monitoring. Registry. Evidence Removal. Side Effects. System Changes. Target Information. Timing Attacks. Malware Variants Identification in Practice 8 / 26 SBSeg’19
  • 9. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 1 Behavior-based Graph. Figure: Behavior-based graph for a given sample. Malware Variants Identification in Practice 9 / 26 SBSeg’19
  • 10. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 2 Agenda 1 Motivation Motivation 2 Challenges & Alternatives Challenge 1 Challenge 2 3 Experiments Evaluation 4 Concluding Remarks Limitations Conclusions Malware Variants Identification in Practice 10 / 26 SBSeg’19
  • 11. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 2 Malware Embedding Figure: Sample 1. The original sample. Figure: Sample 2. Variant sample embedding the original one. Malware Variants Identification in Practice 11 / 26 SBSeg’19
  • 12. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 2 Matching Metrics Definition The similarity of two malware, represented as sets, A and B, of vertices or edges of two graphs, is defined as: Sim(A, B) = |A ∩ B| |A ∪ B| (1) Definition The similarity of two malware, represented as sets, A and B, of vertices or edges of two graphs, is defined as: Sim(A, B) = max |A ∩ B| |B| , |B ∩ A| |A| (2) Malware Variants Identification in Practice 12 / 26 SBSeg’19
  • 13. Motivation Challenges & Alternatives Experiments Concluding Remarks Challenge 2 Continence Results Table: Continence of Sample 1 in Sample 2. CG A B C I 0.66 0.52 0.64 J 0.75 0.49 0.50 K 0.42 0.80 0.44 Table: Continence of Sample 2 in Sample 1. CG A B C I 0.57 0.56 0.43 J 0.33 0.51 0.44 K 0.76 0.65 0.44 Table: Maximum continence of Sample 1 and Sample 2. CG A B C I 0.66 0.56 0.64 J 0.75 0.51 0.50 K 0.76 0.80 0.44 Malware Variants Identification in Practice 13 / 26 SBSeg’19
  • 14. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation Agenda 1 Motivation Motivation 2 Challenges & Alternatives Challenge 1 Challenge 2 3 Experiments Evaluation 4 Concluding Remarks Limitations Conclusions Malware Variants Identification in Practice 14 / 26 SBSeg’19
  • 15. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation The Whitelisting Effect. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Mimail.a x Mimail.c Mimail.c x Mimail.e Mimail.f x Mimail.g Evaluating the whitelist effect. Without whitelist With whitelist Figure: Evaluating whitelisting effect. Similarity scores are higher when using the whitelist-based approach. Malware Variants Identification in Practice 15 / 26 SBSeg’19
  • 16. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation Advantages of the Behavioral model. 0 0.1 0.2 0.3 0.4 0.5 Mimail.a x Klez.a Mimail.c x Klez.a Mimail.k x Klez.g Comparing function−based to the behavior−based approaches. Function−based Behavior−based Figure: Function vs. Behavior-based approaches. Scores are higher when considering behavioral patterns. Malware Variants Identification in Practice 16 / 26 SBSeg’19
  • 17. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation Evaluating Metrics. 0 0.2 0.4 0.6 0.8 1 Mimail.a x Mimail.c Mimail.c x Mimail.e Mimail.f x Mimail.g Evaluating the proposed metric. Usual metric Proposed metric Figure: Proposed metric. Scores are higher when using it in comparison to the usual one. Malware Variants Identification in Practice 17 / 26 SBSeg’19
  • 18. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation Solutions Comparison. 0.0% 20.0% 40.0% 60.0% 80.0% 100.0% 120.0% H x Q I x Q J x Q L x Q M x Q Mimail’s similarity. Solution #1 Solution #2 Our solution Figure: Mimail’s sample similarity. Our solution’s scores are higher when compared to other ones. Malware Variants Identification in Practice 18 / 26 SBSeg’19
  • 19. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation Domain Transformation and Similarity Measures. 0 10 20 30 40 50 60 70 80 90 100 50% 60% 70% 80% 90% Threshold evaluation F.Mimail F.Klez F.Cross B.Mimail B.Klez B.Cross Figure: Threshold evaluation. This should be higher than 80% in order to proper label the cross dataset. Malware Variants Identification in Practice 19 / 26 SBSeg’19
  • 20. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation Real World Experiments (1/2) Table: Identified variants among unknown, wild-collected samples. Family Sample Hash Label 1 A c2ef1aabb15c979e932f5ea1d214cbeb Generic vb.OBY B 747b9fe5819a76529abc161bb449b8eb Generic vb.OBO C 39a04a11234d931bfa60d68ba8df9021 Generic vb.OBL 2 A 96d13246971e4368b9ed90c6f996a884 Atros4.CENI B e23588078ba6a5f5ca1c961a8336ec08 Atros4.CENI C 31a2b6adc781328cb1d77e5debb318ff Atros4.CENI Malware Variants Identification in Practice 20 / 26 SBSeg’19
  • 21. Motivation Challenges & Alternatives Experiments Concluding Remarks Evaluation Real World Experiments (2/2) 10.0% 20.0% 30.0% 40.0% 50.0% 60.0% 70.0% 80.0% 90.0% 100.0% 1.a x 1.b 1.a x 1.c 1.b x 1.c 2.a x 2.b 2.a x 2.c 2.b x 2.c Study case − Identified variants ssdeep Our solution Coverage Figure: Study case: variant identification. Our approach outperforms others even on low coverage scenarios. Malware Variants Identification in Practice 21 / 26 SBSeg’19
  • 22. Motivation Challenges & Alternatives Experiments Concluding Remarks Limitations Agenda 1 Motivation Motivation 2 Challenges & Alternatives Challenge 1 Challenge 2 3 Experiments Evaluation 4 Concluding Remarks Limitations Conclusions Malware Variants Identification in Practice 22 / 26 SBSeg’19
  • 23. Motivation Challenges & Alternatives Experiments Concluding Remarks Limitations Matching Complex Behaviors is Challenging! Figure: DLL injection functions among other function calls. Figure: Proposed DLL injection class. Malware Variants Identification in Practice 23 / 26 SBSeg’19
  • 24. Motivation Challenges & Alternatives Experiments Concluding Remarks Conclusions Agenda 1 Motivation Motivation 2 Challenges & Alternatives Challenge 1 Challenge 2 3 Experiments Evaluation 4 Concluding Remarks Limitations Conclusions Malware Variants Identification in Practice 24 / 26 SBSeg’19
  • 25. Motivation Challenges & Alternatives Experiments Concluding Remarks Conclusions Conclusions Challenges & Lessons Anti-disassembly breaks CG extraction. Transparent, dynamic tracing is a viable alternative. Same-Function Replacement breaks malware clustering. Behavior-based clustering is a viable alternative. Dead code breaks similarity metrics. Continence metric is a viable alternative. Malware Variants Identification in Practice 25 / 26 SBSeg’19
  • 26. Motivation Challenges & Alternatives Experiments Concluding Remarks Conclusions Questions & Comments. Contact mfbotacin@inf.ufpr.br Additional Material https://github.com/marcusbotacin/Malware.Variants Malware Variants Identification in Practice 26 / 26 SBSeg’19