Automatic classification of web-based malware
Alexander Sevtsov
Malware Analyst
Kaspersky Lab
Web-based malware
Chain of events
2
Compromised website → Gate (redirect) → Landing Page of Exploit Kit
Machine learning
Learning problems:
  Supervised Learning: Classification, Regression
  Unsupervised Learning: Clustering
3
Statement of the Problem
To develop an automatic classification (detection) system for web-based malware using supervised learning.
4
System Diagram for the Automatic Classification
Collecting Files
Extracting Features
Data Analysis
Classifier selection
5
Collecting Files
Samples: Clean, Malicious
6
Extracting Features
F1
F2
F3
F4
…
FN
7
Extracting Features
Feature  Description
F1       Using an external script: <script src="http://
F2       The number of script blocks
F3       Using constant obfuscation: 'http' + '://' + 'ex' + 'ample.c' + 'om'
F4       Using a meta redirect: <meta http-equiv="refresh" content="0; url=http://example.com/">
F5       Setting "location.href" for the document
F6       Using hidden attributes: "display:none", "visibility:hidden", "position:absolute; left: -10000;" (Sweet Orange EK)
F7       Checking the cookies associated with the current document (Fiesta EK)
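The features in the table above could be approximated with simple pattern matching over the page source. The following is a minimal sketch, not the extractor used in the talk: the regular expressions, the `extract_features` function and the `SAMPLE` page are all hypothetical and only illustrate how F1 to F7 might be turned into a feature vector.

```python
import re

# Hypothetical patterns approximating features F1-F7; they are assumptions,
# not the rules used by the original system.
EXTERNAL_SCRIPT = re.compile(r'<script[^>]+src\s*=\s*["\']?https?://', re.I)
SCRIPT_BLOCK = re.compile(r'<script\b', re.I)
CONCAT = re.compile(r'''(["'])[^"']*\1\s*\+\s*(["'])[^"']*\2''')
META_REFRESH = re.compile(r'<meta[^>]+http-equiv\s*=\s*["\']?refresh', re.I)
LOCATION_HREF = re.compile(r'location\.href\s*=(?!=)', re.I)
HIDDEN = re.compile(
    r'display\s*:\s*none|visibility\s*:\s*hidden|left\s*:\s*-\d{3,}', re.I
)
COOKIE = re.compile(r'document\.cookie', re.I)


def extract_features(html: str) -> dict:
    """Return a feature dictionary for one HTML document."""
    return {
        "F1": bool(EXTERNAL_SCRIPT.search(html)),  # external script
        "F2": len(SCRIPT_BLOCK.findall(html)),     # number of script blocks
        "F3": bool(CONCAT.search(html)),           # string-literal concatenation
        "F4": bool(META_REFRESH.search(html)),     # meta refresh redirect
        "F5": bool(LOCATION_HREF.search(html)),    # assignment to location.href
        "F6": bool(HIDDEN.search(html)),           # hidden-element styling
        "F7": bool(COOKIE.search(html)),           # access to document.cookie
    }


# Made-up page used only to exercise the function.
SAMPLE = (
    '<html><script src="http://example.com/a.js"></script>'
    '<script>var u = "ht" + "tp"; location.href = u;</script>'
    '<div style="display:none"></div></html>'
)
features = extract_features(SAMPLE)
```

Regular expressions are easy to evade and will also fire on benign pages, which is why these values are treated as inputs to a classifier rather than as verdicts on their own.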
8
Extracting Features
CLASS    F1     F2  F3    F4  …  FN
Clean    False  0   351   5   …  0
Malware  True   1   4300  25  …  1
Malware  False  1   2542  4   …  0
Malware  False  0   301   1   …  1
Clean    True   1   1502  90  …  1
…
Clean    True   1   556   2   …  0
9
Data Analysis
10
Dataset: Training, Testing
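Splitting the dataset into training and testing parts can be sketched as a seeded shuffle followed by a cut. The slides do not state the split ratio, so the 30% test fraction, the `train_test_split` helper and the placeholder rows below are assumptions for illustration only.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle rows reproducibly and return (training, testing) lists."""
    rng = random.Random(seed)      # fixed seed keeps the split repeatable
    shuffled = list(rows)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder samples: (label, sample id). Not real data.
rows = [("Clean" if i % 2 else "Malware", i) for i in range(10)]
training, testing = train_test_split(rows)
```

The testing part is held back until the classifier has been fitted, so that the reported error is measured on samples the model has not seen.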
Data Analysis
Dimensionality Reduction: Principal Component Analysis, Random Projections
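Of the two reduction techniques named here, a random projection is the simpler to show: each feature vector is multiplied by a random Gaussian matrix that maps N features onto fewer components. This is a minimal pure-Python sketch under assumed settings (two output components, a fixed seed); the function names are hypothetical and the input rows reuse the illustrative values from the feature table, with True/False written as 1/0.

```python
import math
import random


def random_projection_matrix(n_features, n_components, seed=0):
    """Build an n_features x n_components Gaussian projection matrix."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)  # usual scaling for Gaussian projections
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
        for _ in range(n_features)
    ]


def project(rows, matrix):
    """Multiply each row vector by the projection matrix."""
    n_components = len(matrix[0])
    return [
        [sum(x * matrix[i][j] for i, x in enumerate(row)) for j in range(n_components)]
        for row in rows
    ]


# Illustrative rows taken from the feature table (F1, F2, F3, F4, FN).
rows = [
    [0, 0, 351, 5, 0],
    [1, 1, 4300, 25, 1],
    [0, 1, 2542, 4, 0],
]
matrix = random_projection_matrix(n_features=5, n_components=2)
reduced = project(rows, matrix)
```

Principal Component Analysis differs in that the projection directions are computed from the data (the directions of largest variance) instead of being drawn at random.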
11
Data Analysis
CLASS    F1     F2  F3    F4  …  FN
Clean    False  0   351   5   …  0
Malware  True   1   4300  25  …  1
Malware  False  1   2542  4   …  0
Malware  False  0   301   1   …  1
Clean    True   1   1502  90  …  1
…
Clean    True   1   556   2   …  0
12
Classifier selection
Output
13
Classifier selection
Analyze a confusion matrix
           malicious  clean
malicious  174        27
clean      44         41
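The usual summary metrics can be read off this matrix. The slide does not say which axis holds the actual class and which the predicted one, so the sketch below assumes rows are actual classes and columns are predictions, with "malicious" as the positive class. If the orientation is the other way round, precision and recall swap.

```python
# Counts from the confusion matrix, ASSUMING rows = actual, columns = predicted.
tp = 174  # actual malicious, predicted malicious
fn = 27   # actual malicious, predicted clean
fp = 44   # actual clean, predicted malicious
tn = 41   # actual clean, predicted clean

total = tp + fn + fp + tn
accuracy = (tp + tn) / total   # share of all samples classified correctly
precision = tp / (tp + fp)     # share of "malicious" verdicts that are right
recall = tp / (tp + fn)        # share of malicious samples that are caught

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# prints: accuracy=0.752 precision=0.798 recall=0.866
```

Under that assumption the weak spot is the clean class: 44 of the 85 clean samples are flagged as malicious, which accuracy alone does not reveal.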
14
Classifier selection
Machine Learning Algorithms:
  Naive Bayes Classifier
  Support Vector Machines
  Logistic Regression
  Nearest Neighbors Classification
  Random Forest
  …
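As one concrete instance from this list, nearest neighbors classification can be sketched in a few lines: a new sample takes the majority label of the k closest training samples. The training rows below reuse the illustrative values from the feature table (True/False written as 1/0); the `knn_predict` helper, the choice of k and the query vectors are assumptions, and this is not necessarily the classifier the talk selected.

```python
import math
from collections import Counter

# Illustrative training rows from the feature table: (label, [F1, F2, F3, F4, FN]).
TRAIN = [
    ("Clean",   [0, 0, 351, 5, 0]),
    ("Malware", [1, 1, 4300, 25, 1]),
    ("Malware", [0, 1, 2542, 4, 0]),
    ("Malware", [0, 0, 301, 1, 1]),
    ("Clean",   [1, 1, 1502, 90, 1]),
    ("Clean",   [1, 1, 556, 2, 0]),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[1], query))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical query vectors.
label_a = knn_predict(TRAIN, [1, 1, 4000, 20, 1], k=3)
label_b = knn_predict(TRAIN, [0, 0, 350, 5, 0], k=1)
```

The distance here is computed on raw values, so the large F3 column dominates it; scaling the features first would normally be needed before comparing this classifier with the others on the list.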
15
Script Insertion
16
Using an external script
17
Constant Obfuscation
18
Using Meta-Redirect
19
Using Hidden Attributes
20
Checking document cookies
21
Finding Exploit Kits
Clean MX
Scumware
Malware don't need Coffee
Users
Threatglass by Barracuda
22
Questions?
23
