October 1 – 4 2019
A Multi-Level Ransomware Detection
Framework using Natural Language
Processing and Machine Learning
By: Subash Poudyal, Dipankar Dasgupta,
Zahid Akhtar, Kishor Datta Gupta
About Me
• Subash Poudyal @subash_spp
• Security Researcher at the Center for Information Assurance (CfIA)
• PhD student at the University of Memphis
Purpose
• Ransomware detection framework
• Multi-level (DLL, function call, assembly) feature mining
• Uses NLP and machine learning approaches
• Apache Spark for feature processing
• PE parser and the Linux objdump tool for reverse engineering
• N-gram probability and Term Frequency-Inverse Document Frequency (TF-IDF) from NLP
Background Information
• Ransomware is evolving and causing damage.
• Advanced malware: it encrypts your data and demands a ransom in bitcoin for anonymity
[9]
Some recent incidents
• Mar. 2019: A Georgia county pays a whopping $400,000 to get rid of a ransomware infection
• June 2019: The city councils of Riviera Beach and Lake City, Florida, agree to pay $600,000 and $500,000, respectively, to get their data back
• July 2019: Attack on City Power, the electricity utility of Johannesburg, South Africa
• August 2019: 22 municipalities/local governments in Texas hit, $2.5 million demanded; a community hospital in Washington State hit, $1 million demanded
[5,6,8,7,2]
Damage caused by Ransomware
[3]
Map of Ransomware detection
Ransomware detections across organizations in the USA, January to August 2019 [10]
Ransomware attack steps
• Phishing email
• Run payload and download code
• Generate key and encrypt
• Communicate with C&C
• Select different file types for encryption
• Display message on the victim's desktop
Previous work on malware detection
• Canzanese et al. [11] analyzed system call traces using an n-gram language model and TF-IDF for malware detection
• Zhang et al. [27] used n-grams of opcode sequences for ransomware family classification
• Yuki et al. [12] proposed ransomware detection using API calls and an SVM
• Poudyal et al. [12] used DLL and assembly instruction frequencies for ransomware detection
• Difference: each of these studies one level or another, whereas we deal with all three levels
Hypothesis
• We can detect ransomware with improved accuracy (98.59% accuracy and 0.03 false positive rate, FPR)
  - By reverse engineering and mining multi-level features
  - By leveraging NLP and ML techniques
• Previous approaches have each adopted a single-level approach
Details of DLL Hierarchy
Details of Function calls Hierarchy
DLL and Function call level code segment
of Locky ransomware
Assembly level code segment of Locky
ransomware
The Proposed Multi-level Framework
The Proposed Framework
• The DLL tracker analyzes the DLLs of a given binary using the detection engine
• The function call tracker analyzes the function calls of a binary
• The assembly instruction tracker works similarly to the DLL and function call trackers
  - detection counter checked
• Action engine
• Passive analyzer
Workflow of Detector Engine
• Reverse-engineer the binary using a PE parser and objdump
• Multi-level mining using three different extractors:
  1. DLL extractor
  2. Function call extractor
  3. Assembly instruction extractor
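
The three extractors can be illustrated with a minimal sketch that parses objdump text output. This is not the authors' implementation: the function names `extract_imports` and `extract_mnemonics` are hypothetical, and the line layouts they expect (a `DLL Name:` header followed by import rows in the output of `objdump -p`, and tab-separated `address: bytes instruction` rows in the output of `objdump -d`) are assumptions about GNU objdump's output for PE files.

```python
import re

# Assumed layout of an import row in `objdump -p`: "<hex vma> <hint> <function name>"
IMPORT_ROW = re.compile(r"^[0-9a-fA-F]+\s+\d+\s+(\S+)$")


def extract_imports(objdump_p_text):
    """Map each imported DLL (lower-cased) to the function names listed under it."""
    imports = {}
    current_dll = None
    for line in objdump_p_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("DLL Name:"):
            current_dll = stripped.split(":", 1)[1].strip().lower()
            imports.setdefault(current_dll, [])
        elif not stripped:
            # A blank line is assumed to end the current DLL's import table.
            current_dll = None
        elif current_dll is not None:
            match = IMPORT_ROW.match(stripped)
            if match:
                imports[current_dll].append(match.group(1))
    return imports


def extract_mnemonics(objdump_d_text):
    """Return the instruction mnemonics, in order, from a disassembly listing."""
    mnemonics = []
    for line in objdump_d_text.splitlines():
        parts = line.split("\t")
        # Instruction rows are assumed to be "<addr>:\t<raw bytes>\t<mnemonic operands>".
        if len(parts) >= 3 and parts[0].strip().endswith(":"):
            tokens = parts[2].split()
            if tokens:
                mnemonics.append(tokens[0])
    return mnemonics
```

The DLL level is the set of keys of the returned mapping, the function call level is its values, and the assembly level is the mnemonic sequence. Each can then be treated as a token stream for the NLP steps.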
NLP Schemes
• Proven useful in recommendation systems, text classification, speech recognition, and so on
• N-gram generator: produces the unique set of n-gram sequences
• Markov assumption: consider only the immediately preceding N-1 words
• N-gram probability: relative frequency on a training corpus
• TF-IDF: term frequency × inverse document frequency
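
Both schemes can be sketched in a few lines of plain Python. This is a minimal illustration, not the Spark code used in the experiments. The function names are hypothetical, and the TF-IDF weight uses the textbook form tf × log(N / df), whereas Spark's MLlib applies a smoothed IDF, so the exact values would differ.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return every n-gram (as a tuple) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_probabilities(tokens, n):
    """Relative-frequency estimate: count(n-gram) / count(its (n-1)-token prefix)."""
    grams = ngrams(tokens, n)
    gram_counts = Counter(grams)
    prefix_counts = Counter(gram[:-1] for gram in grams)
    return {
        gram: count / prefix_counts[gram[:-1]]
        for gram, count in gram_counts.items()
    }


def tf_idf(documents):
    """documents: list of term lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights
```

Here a "document" would be one binary and a "term" one n-gram of DLL names, function calls or assembly mnemonics. A term that occurs in every document gets weight zero, which is what makes TF-IDF favor n-grams that discriminate between samples.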
Experimental Setup
• Dataset: VirusTotal and the open-source malware repository theZoo
• 292 ransomware binaries and the same number of benign executables
• Apache Spark cluster configuration:
  - 4 data nodes and 1 name node, each with 16 GB RAM and 8 cores
  - Ubuntu 16.04.3 operating system with 1 TB disk
  - Hadoop 2.7.3
  - Spark 2.3
• Experimental programs written in Python, using the MLlib library from PySpark
Experiment
• Feature generation using an n-gram language model
Experiment (cont.)
• Logistic regression accuracy for n-gram TF-IDF features at each individual level and at the combined level
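
For reference, the two reported metrics (accuracy and FPR) can be computed from predicted and true labels as in the sketch below. The function name `accuracy_and_fpr` and the label convention (1 = ransomware, 0 = benign) are assumptions made for illustration. The slides do not show how the authors computed them.

```python
def accuracy_and_fpr(y_true, y_pred, positive=1):
    """Return (accuracy, false positive rate) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # ransomware correctly flagged
            else:
                fp += 1  # benign file wrongly flagged
        else:
            if truth == positive:
                fn += 1  # ransomware missed
            else:
                tn += 1  # benign file correctly passed
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # FPR is the share of benign samples that were flagged as ransomware.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr
```

With a balanced dataset such as the one described in the experimental setup, accuracy alone is informative, but the FPR is reported separately because flagging benign executables has its own operational cost.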
Trigram analysis
Impact & Broader Contributions
• Provides a new, effective approach to ransomware detection
• Implements multi-level features for improved detection
• Provides background for further analysis of multi-level relation mapping
Conclusion
• An efficient multi-level ransomware detection framework (98.59% accuracy and 0.03 FPR)
• Leverages reverse engineering, data mining, NLP, and supervised ML techniques
• Practical implementation is feasible
Future Work & Remaining Questions
• Multi-level analysis leveraging deep learning techniques on a larger dataset
• Performance comparison between relevant techniques
• Effect of code obfuscation techniques on machine learning detection
• We welcome collaboration with industry or universities on ransomware research
Thanks
Questions and comments?
Subash Poudyal
@subash_spp
connect.subash@gmail.com
References
[1] https://www.techspot.com/news/79119-jackson-county-government-gives-hackers-pays-400000.html
[2] https://healthitsecurity.com/news/hackers-demand-1m-in-grays-harbor-ransomware-attack
[3] https://heimdalsecurity.com/blog/cyber-security-threats-types/
[4] https://Malwarebytes.com
[5] https://www.zdnet.com/article/georgia-county-pays-a-whopping-400000-to-get-rid-of-a-ransomware-infection/
[6] https://www.zdnet.com/article/second-florida-city-pays-giant-ransom-to-ransomware-gang-in-a-week/
[7] https://www.npr.org/2019/08/20/752695554/23-texas-towns-hit-with-ransomware-attack-in-new-front-of-cyberassault
[8] https://www.msspalert.com/cybersecurity-breaches-and-attacks/ransomware/city-power-johannesburg-south-africa/
[9] https://www.techspot.com/news/79119-jackson-county-government-gives-hackers-pays-400000.html
[10] https://blog.malwarebytes.com/ransomware/2019/05/ransomware-isnt-just-a-big-city-problem/
[11] R. Canzanese, S. Mancoridis, and M. Kam. System call-based detection of malicious processes. In 2015 IEEE International Conference on Software Quality, Reliability and Security, pages 119–124. IEEE, 2015.
[12] S. Poudyal, K. P. Subedi, and D. Dasgupta. A framework for analyzing ransomware using machine learning. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI), pages 1692–1699. IEEE, 2018.

学校原版美国波士顿大学毕业证学历学位证书原版一模一样学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
 
Casting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdfCasting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdf
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
Textile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdfTextile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdf
 
Engineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdfEngineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdf
 
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.pptUnit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
Unit-III-ELECTROCHEMICAL STORAGE DEVICES.ppt
 
Recycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part IIRecycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part II
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
 
Engine Lubrication performance System.pdf
Engine Lubrication performance System.pdfEngine Lubrication performance System.pdf
Engine Lubrication performance System.pdf
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball playEric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 

Multi level ransomware analysis MALCON 2019 conference

  • 1. October 1 – 4 2019. A Multi-Level Ransomware Detection Framework using Natural Language Processing and Machine Learning. By: Subash Poudyal, Dipankar Dasgupta, Zahid Akhtar, Kishor Datta Gupta
  • 2. About Me
      • Subash Poudyal, @subash_spp
      • Security Researcher at the Center for Information Assurance (CfIA)
      • PhD student at the University of Memphis
  • 3. Purpose
      • Ransomware detection framework
      • Multi-level (DLL, function call, assembly) feature mining
      • Use of NLP and machine learning approaches
      • Apache Spark for feature processing
      • PE parser and the Linux objdump tool
      • N-gram probability and Term Frequency-Inverse Document Frequency (TF-IDF) from NLP
  • 4. Background Information
      • Ransomware is evolving and causing damage.
      • Advanced malware: encrypts your data and asks for ransom in bitcoins for anonymity [9]
  • 5. Some recent incidents [5,6,8,7,2]
      • Mar. 2019: Georgia county pays a whopping $400,000 to get rid of a ransomware infection
      • June 2019: the Florida city councils of Riviera Beach and Lake City agreed to pay $600,000 and $500,000, respectively, to get their data back
      • July 2019: attack on the City Power grid in Johannesburg, South Africa
      • August 2019: 22 municipalities/local governments in Texas hit, with a demand of $2.5M; community hospital in Washington, $1 million demanded
  • 6. Damage caused by Ransomware [3]
  • 7. Map of Ransomware detection [10]: ransomware detections across organizations in the USA from Jan-Aug 2019
  • 8. Ransomware attack steps
      • Phishing email
      • Run payload and download code
      • Generate key and encrypt
      • Communicate with C&C
      • Select different file types for encryption
      • Display message on victim desktop
  • 9. Previous work on Malware detection
      • Canzanese et al. [11] analyzed system call traces using an n-gram language model and TF-IDF for malware detection
      • Zhang et al. [27] used n-grams of opcode sequences for ransomware family classification
      • Yuki et al. [12] proposed ransomware detection using API calls and SVM
      • Poudyal et al. [12] used DLL and assembly instruction frequencies for ransomware detection
      • Difference: these works each study one level or another, whereas we deal with three levels
  • 10. Hypothesis
      • We can detect ransomware with improved accuracy (98.59% accuracy and 0.03 FPR)
      • By reverse engineering and mining multi-level features
      • By leveraging NLP and ML techniques
      • The previous approaches have each adopted a single approach
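The two figures quoted in the hypothesis are standard confusion-matrix metrics. As a minimal sketch of how they are usually computed (the counts below are hypothetical and are not the results from this work):

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign samples that were wrongly flagged as ransomware."""
    return fp / (fp + tn)


# Hypothetical counts for illustration only.
tp, tn, fp, fn = 90, 95, 5, 10
print(accuracy(tp, tn, fp, fn))     # 0.925
print(false_positive_rate(fp, tn))  # 0.05
```

A low FPR matters here because benign executables that are flagged as ransomware would be blocked or quarantined unnecessarily.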
  • 11. Details of DLL Hierarchy
  • 12. Details of Function calls Hierarchy
  • 13. DLL and Function call level code segment of Locky ransomware
  • 14. Assembly level code segment of Locky ransomware
  • 15. The Proposed Multi-level Framework
  • 16. The Proposed Framework
      • DLL tracker analyzes DLLs of a given binary using the detection engine
      • Function call tracker analyzes function calls of a binary
      • Assembly instruction tracker works similarly to the DLL and function call trackers
          • detection counter checked
      • Action engine
      • Passive analyzer
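The slide does not spell out how the three trackers, the detection counter, the action engine and the passive analyzer interact. One plausible reading, offered purely as an assumption, is that each tracker votes, the counter tallies the votes, and a threshold decides where the sample is routed. In the sketch below, `make_keyword_tracker`, `route_sample`, the threshold of 2 and the keyword lists are all hypothetical; real trackers would wrap trained classifiers rather than keyword checks.

```python
from typing import Callable, Dict, List

# A tracker inspects the features mined at one level and returns True
# if that level looks malicious.
Tracker = Callable[[List[str]], bool]


def make_keyword_tracker(suspicious: List[str]) -> Tracker:
    """Hypothetical stand-in for a trained per-level classifier."""
    suspicious_set = set(suspicious)

    def tracker(features: List[str]) -> bool:
        return any(feature in suspicious_set for feature in features)

    return tracker


def route_sample(
    features_by_level: Dict[str, List[str]],
    trackers: Dict[str, Tracker],
    threshold: int = 2,
) -> str:
    """Count how many levels flag the sample, then pick a destination."""
    detection_counter = 0
    for level, tracker in trackers.items():
        if tracker(features_by_level.get(level, [])):
            detection_counter += 1
    if detection_counter >= threshold:
        return "action_engine"
    return "passive_analyzer"


# Illustrative keyword lists only; not indicators taken from the slides.
TRACKERS = {
    "dll": make_keyword_tracker(["crypt32.dll"]),
    "function_call": make_keyword_tracker(["CryptEncrypt"]),
    "assembly": make_keyword_tracker(["xor"]),
}

sample = {
    "dll": ["kernel32.dll", "crypt32.dll"],
    "function_call": ["CreateFileW", "CryptEncrypt"],
    "assembly": ["push", "mov", "call"],
}
print(route_sample(sample, TRACKERS))  # action_engine
```

Two of the three levels flag the sample, so the assumed threshold routes it to the action engine; a sample flagged at fewer levels would go to the passive analyzer.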
  • 17. Workflow of Detector Engine
  • 18. Workflow of Detector Engine
      • Reverse-engineer using PE parser and objdump
      • Multi-level mining using three different extractors:
          1. DLL extractor
          2. Function call extractor
          3. Assembly instruction extractor
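As a rough illustration of the assembly instruction extractor, the sketch below pulls the instruction mnemonics out of text in the layout that `objdump -d` prints (address, raw bytes, then mnemonic and operands, separated by tabs). The sample listing and the function name `extract_mnemonics` are hypothetical, and the extractor used in this work may differ.

```python
import re
from typing import List

# Hypothetical excerpt in the layout produced by `objdump -d`.
SAMPLE_DISASSEMBLY = (
    "  401000:\t55                   \tpush   %ebp\n"
    "  401001:\t89 e5                \tmov    %esp,%ebp\n"
    "  401003:\t83 ec 18             \tsub    $0x18,%esp\n"
    "  401006:\te8 15 00 00 00       \tcall   401020 <_start+0x20>\n"
    "  40100b:\tc9                   \tleave\n"
    "  40100c:\tc3                   \tret\n"
)

# address ':' TAB hex bytes TAB mnemonic (captured)
INSTRUCTION_LINE = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t(\S+)")


def extract_mnemonics(disassembly: str) -> List[str]:
    """Return the instruction mnemonics in listing order."""
    mnemonics = []
    for line in disassembly.splitlines():
        match = INSTRUCTION_LINE.match(line)
        if match:  # skips section headers, labels and blank lines
            mnemonics.append(match.group(1))
    return mnemonics


print(extract_mnemonics(SAMPLE_DISASSEMBLY))
# ['push', 'mov', 'sub', 'call', 'leave', 'ret']
```

The resulting mnemonic sequence is the kind of token stream that the n-gram schemes on the following slides operate on.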
  • 19. NLP Schemes
      • Proved useful in recommendation systems, text classification, speech recognition and so on
      • N-gram generator: produces the unique set of n-gram sequences
      • Markov assumption: only the immediately preceding N-1 words are considered
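A minimal sketch of an n-gram generator over a token sequence, assuming the tokens are opcode mnemonics (they could equally be DLL names or function calls). The helper names `ngrams` and `unique_ngrams` and the token list are hypothetical.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous length-n windows over the token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unique_ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """The unique set of n-gram sequences."""
    return set(ngrams(tokens, n))


# Hypothetical opcode sequence for illustration.
tokens = ["push", "mov", "sub", "mov", "call", "push", "mov"]
print(len(ngrams(tokens, 2)))         # 6
print(len(unique_ngrams(tokens, 2)))  # 5
```

The bigram `("push", "mov")` occurs twice in the sequence, which is why the unique set is one smaller than the full list.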
  • 20. NLP Schemes
      • N-gram probability: relative frequency on a training corpus
      • TF-IDF: term frequency × inverse document frequency
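Both schemes can be sketched in a few lines of plain Python. The n-gram probability below is the usual relative-frequency estimate, count(n-gram) / count(its first N-1 tokens). For TF-IDF the slides do not say which IDF variant is used, so the sketch assumes the unsmoothed form log(N / df); the function names and the sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def ngram_counts(tokens: List[str], n: int) -> Counter:
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_probability(tokens: List[str], ngram: Tuple[str, ...]) -> float:
    """Relative-frequency estimate of the last token given the previous N-1."""
    n = len(ngram)
    numerator = ngram_counts(tokens, n)[ngram]
    if n == 1:
        denominator = len(tokens)  # unigram: divide by corpus size
    else:
        denominator = ngram_counts(tokens, n - 1)[ngram[:-1]]
    return numerator / denominator if denominator else 0.0


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Per-document weights: (count / doc length) * log(N / document frequency)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical training sequence and documents for illustration.
corpus = ["push", "mov", "sub", "mov", "call", "push", "mov"]
print(ngram_probability(corpus, ("push", "mov")))  # 1.0

docs = [
    ["kernel32.dll", "advapi32.dll", "kernel32.dll"],
    ["kernel32.dll", "user32.dll"],
]
print(tf_idf(docs)[0]["kernel32.dll"])  # 0.0, the term appears in every document
```

A term that occurs in every document gets an IDF of zero under this variant, so only terms that discriminate between samples keep a non-zero weight.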
  • 21. Experimental Setup
      • Dataset: VirusTotal and the open-source malware repository theZoo
      • 292 ransomware-only binaries and the same number of benign executables
      • Apache Spark cluster configuration:
          • 4 data nodes
          • 1 name node, each with 16 GB RAM and 8 cores
          • Ubuntu 16.04.3 operating system with 1 TB disk
          • Hadoop version 2.7.3
          • Spark 2.3
      • Experimental programs written in Python, using the MLlib library from PySpark
  • 22. Experiment
      • Feature generation using N-gram language model
  • 23. Experiment Cont.
  • 24. Experiment Cont.
  • 25. Logistic regression accuracy for N-gram TF-IDF at multi and combined level
  • 26. Trigram analysis
  • 27. Impact & Broader Contributions
      • Provides a new, effective approach to ransomware detection
      • Implements multi-level features for improved detection
      • Provides background for further analysis in multi-level relation mapping
  • 28. Conclusion
      • An efficient multi-level ransomware detection framework (98.59% accuracy and 0.03 FPR)
      • Leveraged reverse engineering, data mining, NLP and supervised ML techniques
      • Practical implementation is feasible
  • 29. Future Work & Remaining Questions
      • Multi-level analysis leveraging deep learning techniques using a larger dataset
      • Performance comparison between relevant techniques
      • Effect of code obfuscation techniques on machine learning detection
      • We welcome any collaboration with industry or universities on ransomware research
  • 30. Thanks. Questions and comments? Subash Poudyal, @subash_spp, connect.subash@gmail.com
  • 31. References
      [1] https://www.techspot.com/news/79119-jackson-county-government-gives-hackers-pays-400000.html
      [2] https://healthitsecurity.com/news/hackers-demand-1m-in-grays-harbor-ransomware-attack
      [3] https://heimdalsecurity.com/blog/cyber-security-threats-types/
      [4] https://Malwarebytes.com
      [5] https://www.zdnet.com/article/georgia-county-pays-a-whopping-400000-to-get-rid-of-a-ransomware-infection/
      [6] https://www.zdnet.com/article/second-florida-city-pays-giant-ransom-to-ransomware-gang-in-a-week/
      [7] https://www.npr.org/2019/08/20/752695554/23-texas-towns-hit-with-ransomware-attack-in-new-front-of-cyberassault
      [8] https://www.msspalert.com/cybersecurity-breaches-and-attacks/ransomware/city-power-johannesburg-south-africa/
      [9] https://www.techspot.com/news/79119-jackson-county-government-gives-hackers-pays-400000.html
      [10] https://blog.malwarebytes.com/ransomware/2019/05/ransomware-isnt-just-a-big-city-problem/
      [11] R. Canzanese, S. Mancoridis, and M. Kam. System call-based detection of malicious processes. In 2015 IEEE International Conference on Software Quality, Reliability and Security, pages 119–124. IEEE, 2015.
      [12] S. Poudyal, K. P. Subedi, and D. Dasgupta. A framework for analyzing ransomware using machine learning. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI), pages 1692–1699. IEEE, 2018.