Natural Language Processing Lab
M2020064
Danbi Cho
Content
1. Background
2. Introduction
3. Idea
4. Model
5. Experiments
6. Summary
#Kookmin_University #Natural_Language_Processing_lab. 1
Background
1) WAF(Web Application Firewall)
: Detecting and Blocking Web attacks
(ex: SQL Injection, Cross-Site Scripting(XSS))
- Signature-based detection
: one-to-one response method
- Rule-based detection
: signature + AI method
https://m.blog.naver.com/pentamkt/222084745829
Background
2) Zero-day attack
: An attack that exploits a previously unknown vulnerability in computer software
(ex: spear phishing)
3) Whitelist
: Trusted target list
(opposite of a blacklist)
Background
4) Recurrent Neural Network (RNN, LSTM)
https://ratsgo.github.io/natural%20language%20processing/2017/03/09/rnnlstm/
Introduction
> Why is a zero-day attack difficult to detect?
1) It has not been previously seen
> Most supervised approaches are inappropriate
2) It can be carried out by a single malicious HTTP request
> Contextual information is not helpful
3) It is very rare within a large number of Web requests
> Collective and statistical information is not effective
ZeroWall
- Unsupervised approach
(works in a pipeline with an existing WAF)
- Detects a zero-day Web attack hidden in an individual Web request
Introduction
> What do we want?
1) WAF detects known attacks effectively
> Filter out known attacks
2) ZeroWall detects unknown attacks ignored by WAF rules
> Report new attack patterns to operators and security engineers to update WAF rules
Idea
1) HTTP request: a string that follows the HTTP protocol
> Consider an HTTP request as one sentence in the HTTP request language
2) Most requests are benign; malicious requests are rare
> Train a kind of language model on historical logs to learn this language from benign requests
[Figure: historical Web logs train a language model. A request the model can understand is benign; a request it cannot understand is malicious.]
Idea : Self-Translate Machine
> How to learn this ‘Hyper-Text’ language?
: Use a neural machine translation model to train a Self-Translate Machine
- Encoder: encode the original request into one representation
- Decoder: decode it back into the original request
Idea : Self-Translate Machine
> How to quantify the self-translation quality(anomaly score)?
(Translation Quality = Anomaly Score)
An attack detection problem = A machine translation quality assessment problem
: use machine translation metrics
[Figure: historical Web logs train the Self-Translate Machine. A good translation of a request indicates a benign request; a bad translation indicates a malicious one.]
Idea : Self-Translate Machine
Translation Quality = Anomaly Score
: Use BLEU (Bilingual Evaluation Understudy)
(Malicious Score = 1-BLEU_score)
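$$\mathrm{BLEU} = \mathrm{BP} \times \exp\left( \sum_{n=1}^{N} w_n \log p_n \right)$$
$$\mathrm{BP} = \begin{cases} 1, & \text{if } c > r \\ e^{1 - r/c}, & \text{if } c \le r \end{cases}$$
$$p_n = \frac{\sum_{n\text{-gram} \in \mathrm{Candidate}} \mathrm{Count}_{\mathrm{clip}}(n\text{-gram})}{\sum_{n\text{-gram} \in \mathrm{Candidate}} \mathrm{Count}(n\text{-gram})}$$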
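The BLEU definition on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the ZeroWall implementation: it assumes a single reference (the original token sequence), uniform weights w_n = 1/N with N = 4, and no smoothing, so any n-gram order with zero matches gives BLEU = 0. Here c is the candidate (translated) length and r is the reference (original) length. The function names and the example tokens are made up for the sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(reference, candidate, max_n=4):
    """BLEU of one candidate against one reference, with w_n = 1 / max_n."""
    if not candidate:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        total = sum(cand_counts.values())
        # Count_clip: a candidate n-gram is credited at most as often
        # as it appears in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # log(0) is undefined; this sketch uses no smoothing
        log_sum += math.log(clipped / total) / max_n
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1.0 - r / c)  # brevity penalty
    return bp * math.exp(log_sum)


def malicious_score(original, translated, max_n=4):
    """Malicious Score = 1 - BLEU_score, as stated on the slide."""
    return 1.0 - bleu(original, translated, max_n)


original = ["get", "item", "id", "<num>", "page", "<num>"]
perfect = list(original)
degraded = ["get", "item", "id", "<num>", "page", "select"]
print(malicious_score(original, perfect))
print(malicious_score(original, degraded))
```

A request that the model reproduces exactly scores 0, and the score grows as the recovered sequence drifts from the original. Sequences shorter than N tokens always score 1 in this unsmoothed sketch, so a real deployment would need a smoothing choice that the slides do not specify.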
Model: ZeroWall
> Offline Periodic Retraining
: build and update vocabulary and re-train the model
Model: ZeroWall
1) Building Vocabulary
[Figure: raw log ⇨ bag of words ⇨ filtering (stop words, variables) ⇨ vocabulary]
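The vocabulary-building step can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration and does not come from the slides: the word-splitting rule, the stop-word list, the pattern used to recognise variables (numbers and long hexadecimal identifiers), and the minimum-count cutoff that treats rarely seen words as variables.

```python
import re
from collections import Counter

# Hypothetical settings for this sketch; the slides do not give these values.
STOP_WORDS = {"http", "https", "www"}
VARIABLE_PATTERN = re.compile(r"^(\d+|[0-9a-f]{16,})$")  # numbers, long hex ids
MIN_COUNT = 2  # words seen fewer times are treated as variables


def split_words(request):
    """Split a raw request line into lowercase words."""
    return [w for w in re.split(r"[^0-9A-Za-z_]+", request.lower()) if w]


def build_vocabulary(raw_logs, min_count=MIN_COUNT):
    """Bag of words over the logs, then filter stop words and variables."""
    counts = Counter()
    for line in raw_logs:
        counts.update(split_words(line))
    return {
        word
        for word, count in counts.items()
        if count >= min_count
        and word not in STOP_WORDS
        and not VARIABLE_PATTERN.match(word)
    }


logs = [
    "GET /item?id=123&page=2",
    "GET /item?id=456&page=3",
    "GET /search?q=shoes",
]
print(sorted(build_vocabulary(logs)))
```

With these toy logs, only the words that recur across requests survive, while one-off values such as search terms and numeric identifiers are dropped.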
Model: ZeroWall
2) Parsing
[Figure: parsing a request with the vocabulary]
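A minimal sketch of parsing a request into a token sequence with a given vocabulary follows. It assumes that every word outside the vocabulary is mapped to one placeholder token; the placeholder name `<unk>`, the splitting rule, and the example vocabulary are hypothetical and not taken from the slides.

```python
import re

UNKNOWN = "<unk>"  # hypothetical placeholder for out-of-vocabulary words


def parse_request(request, vocabulary):
    """Turn a raw request into a token sequence over the vocabulary."""
    words = [w for w in re.split(r"[^0-9A-Za-z_]+", request.lower()) if w]
    return [w if w in vocabulary else UNKNOWN for w in words]


vocabulary = {"get", "item", "id", "page"}
print(parse_request("GET /item?id=789&page=1", vocabulary))
```

Collapsing variable values into one token keeps the sequence focused on request structure, which is what the translation model is meant to learn.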
Model: ZeroWall
> Online Detecting
: Detect anomalies in real-time requests for manual investigation
Model: ZeroWall
1) Parsing: same as parsing in offline training
2) Translation
Model: ZeroWall
3) Detection
4) Investigation
(1) BLEU metric: compare the original sequence (token sequence) with the translated sequence (recovered token sequence)
(2) Threshold [Malicious score larger than the threshold? Yes ⇨ Go to step 3; No ⇨ Benign]
(3) Check whitelist [Not in whitelist? Yes ⇨ Go to step 4; No ⇨ Benign]
(4) Investigation [True attacks ⇨ Update WAF/IDS; False alarms ⇨ Update whitelist rules]
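The decision flow of steps (1) to (4) can be sketched as a small Python function. This is an illustrative outline under stated assumptions, not the deployed system: `translate` stands in for the trained Self-Translate Machine, whitelist rules are modelled as plain predicates over the token sequence, and `mismatch_ratio` is a simple stand-in score used here instead of 1 - BLEU so that the example stays self-contained. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Tokens = Sequence[str]


@dataclass
class Verdict:
    label: str    # "benign" or "investigate"
    score: float  # malicious score of the request


def mismatch_ratio(original: Tokens, translated: Tokens) -> float:
    """Stand-in score (not BLEU): share of positions where sequences differ."""
    longest = max(len(original), len(translated))
    if longest == 0:
        return 0.0
    differing = sum(a != b for a, b in zip(original, translated))
    differing += abs(len(original) - len(translated))
    return differing / longest


def detect(
    tokens: Tokens,
    translate: Callable[[Tokens], Tokens],
    score_fn: Callable[[Tokens, Tokens], float],
    threshold: float,
    whitelist: Sequence[Callable[[Tokens], bool]],
) -> Verdict:
    # (1) Score the original sequence against its self-translation.
    score = score_fn(tokens, translate(tokens))
    # (2) A score at or below the threshold means benign.
    if score <= threshold:
        return Verdict("benign", score)
    # (3) Requests matched by a whitelist rule are treated as benign.
    if any(rule(tokens) for rule in whitelist):
        return Verdict("benign", score)
    # (4) Everything else is handed over for manual investigation.
    return Verdict("investigate", score)


def garbled(tokens: Tokens) -> Tokens:
    """Toy translator that fails to recover any token."""
    return ["<unk>"] * len(tokens)


print(detect(["eval", "base64"], garbled, mismatch_ratio, 0.5, []))
```

After investigation, a confirmed attack would feed a WAF/IDS rule update, and a false alarm would add a rule to `whitelist`, which is how the whitelist grows over time.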
Experiments
> Data
- 8 real-world traces from an Internet company
- Over 1.4 billion requests in a week
> Overview
- Captured 28 different types of zero-day attacks
- These account for 141,583 zero-day attack requests in total
# B2M: Ratio of Benign to Malicious (in WAF)
# B2Z: Ratio of Benign to Zero-Day
Experiments
> Baseline & labels
1) Unsupervised Approaches
> SAE (stacked auto-encoder), HMM (hidden Markov model), and DFA (deterministic finite automaton)
> Use data filtered by WAF as the training set
2) Supervised Approaches
> CNN, RNN, and DT (decision tree)
> Use all data (allowed/dropped) as the training set and WAF results as labels
Experiments
> Evaluation Results
Experiments
> Zero-Day case
- This attack is detected by ZeroWall, CNN, and RNN
- WAFs are usually based on keywords, e.g., eval, request, select, and execute
- ZeroWall is based on the “understanding” of benign requests
- The structure of this zero-day attack request is more like a programming language
ZeroWall
CNN and RNN
Experiments
> Whitelist
- To mitigate False Alarms
- The number of whitelist rules refers to how many whitelist rules are added each day, based on the FPs labeled on that day
(No rules are applied on 0602 since it is the first day of the testing set)
- The results show that the whitelist reduces the number of FPs with low overhead
(The number of rules is very small)
- Based on these results, we believe ZeroWall is practical for real-world deployment
Summary
1) Present a zero-day Web attack detection system, ZeroWall
- Augments existing signature-based WAFs
- Uses an Encoder-Decoder Network to learn patterns from normal requests
- Uses a Self-Translate Machine & BLEU metric
2) Deployed in the wild
- Over 1.4 billion requests
- Captured 28 different types of zero-day attacks
(over 100K zero-day attack requests)
- Low overhead
Thank You.

ZeroWall: Detecting Zero-Day Web Attacks through Encoder-Decoder Recurrent Neural Networks

  • 2. Content 1. Background 2. Introduce 3. Idea 4. Model 5. Experiments 6. Summary #Kookmin_University #Natural_Language_Processing_lab. 1
  • 3. Background 1) WAF(Web Application Firewall) : Detecting and Blocking Web attacks (ex: SQL Injection, Cross-Site Scripting(XSS)) - Signature-based detection : one-to-one response method - Rule-based detection : signature + AI method #Kookmin_University #Natural_Language_Processing_lab. 2 https://m.blog.naver.com/pentamkt/222084745829
  • 4. Background 2) Zero-day attack : Attacks vulnerability in computer software (ex: spear phishing) 3) Whitelist : Trusted target list ( blacklist) #Kookmin_University #Natural_Language_Processing_lab. 3
  • 5. Background 4) Recurrent Neural Network (RNN, LSTM). Source: https://ratsgo.github.io/natural%20language%20processing/2017/03/09/rnnlstm/
  • 6. Background 4) Recurrent Neural Network (RNN, LSTM). Source: https://ratsgo.github.io/natural%20language%20processing/2017/03/09/rnnlstm/
  • 7. Introduce > Why is a zero-day attack difficult to detect? 1) It has not been seen before ⇨ most supervised approaches are inappropriate 2) It can be carried out by a single malicious HTTP request ⇨ contextual information is not helpful 3) It is very rare within a large number of Web requests ⇨ collective and statistical information is not effective. ZeroWall - Unsupervised approach (works in a pipeline with an existing WAF) - Detects a zero-day Web attack hidden in an individual Web request
  • 8. Introduce > What do we want? 1) The WAF detects known attacks effectively ⇨ filter out known attacks 2) ZeroWall detects unknown attacks ignored by the WAF rules ⇨ report new attack patterns to operators and security engineers so that they can update the WAF rules
  • 9. Idea 1) An HTTP request is a string that follows the HTTP protocol ⇨ consider an HTTP request as one sentence in the HTTP request language 2) Most requests are benign and malicious requests are rare ⇨ train a kind of language model on historical logs, so that it learns this language from benign requests [Figure: historical Web logs train the language model; a request the model can understand is benign, a request it cannot understand is malicious]
  • 10. Idea: Self-Translate Machine > How to learn this ‘Hyper-Text’ language? Use a neural machine translation model to train a Self-Translate Machine - Encoder: encodes the original request into one representation - Decoder: decodes it back into a request
  • 11. Idea: Self-Translate Machine > How to quantify the self-translation quality (anomaly score)? Translation quality = anomaly score, so an attack detection problem becomes a machine translation quality assessment problem: use machine translation metrics [Figure: historical Web logs train the Self-Translate Machine; a request with a good translation is benign, a request with a bad translation is malicious]
  • 12. Idea: Self-Translate Machine. Translation quality = anomaly score: use BLEU (Bilingual Evaluation Understudy), with Malicious Score = 1 - BLEU_score. $\mathrm{BLEU} = \mathrm{BP} \times \exp\left(\sum_{n=1}^{N} w_n \log p_n\right)$, where $\mathrm{BP} = \begin{cases} 1, & \text{if } c > r \\ e^{1 - r/c}, & \text{if } c \le r \end{cases}$ and $p_n = \dfrac{\sum_{n\text{-gram} \in \mathrm{Candidate}} \mathrm{Count}_{\mathrm{clip}}(n\text{-gram})}{\sum_{n\text{-gram} \in \mathrm{Candidate}} \mathrm{Count}(n\text{-gram})}$
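As a minimal sketch of the BLEU-based score on slide 12, the Python below computes single-reference BLEU over token sequences and then Malicious Score = 1 - BLEU. Here c is the candidate (translated) length and r is the reference (original) length, as in standard BLEU. The uniform weights w_n = 1/N, the default N = 4, the lack of smoothing, and the function names are assumptions made for illustration; the slides do not state which settings ZeroWall uses.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(reference, candidate, max_n=4):
    """Single-reference BLEU with uniform weights w_n = 1 / max_n (assumed)."""
    if not candidate:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        total = sum(cand_counts.values())
        # Clipped count: a candidate n-gram is credited at most as often
        # as it occurs in the reference.
        clipped = sum(min(k, ref_counts[g]) for g, k in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: a zero precision gives BLEU = 0
        log_sum += math.log(clipped / total) / max_n
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)  # brevity penalty
    return bp * math.exp(log_sum)


def malicious_score(original, translated, max_n=4):
    """Malicious Score = 1 - BLEU(original, translated)."""
    return 1.0 - bleu(original, translated, max_n)
```

A request that is translated back perfectly scores 0, and a translation that shares no tokens with the original scores 1. Note that without smoothing, any sequence shorter than `max_n` tokens also scores 1, so a practical version would likely need a smaller N or a smoothing scheme for short requests.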
  • 14. Model: ZeroWall > Offline periodic retraining: build and update the vocabulary and re-train the model [Figure: offline retraining pipeline, steps 1 to 3]
  • 15. Model: ZeroWall 1) Building the vocabulary: raw log ⇨ bag of words ⇨ filtering (stop words, variables) ⇨ vocabulary
  • 16. Model: ZeroWall 2) Parsing [Figure: parsing a request with the vocabulary]
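The vocabulary-building and parsing steps on slides 15 and 16 can be sketched roughly as follows. The slides only name the stages (raw log, bag of words, filtering of stop words and variables, vocabulary), so everything concrete here is an assumption: the delimiter-based tokenizer, the `STOP_WORDS` set, treating rare or purely numeric tokens as variables, and the `<unk>` placeholder for out-of-vocabulary tokens are all hypothetical.

```python
import re
from collections import Counter

STOP_WORDS = {"http", "https", "www"}  # hypothetical stop-word list
UNKNOWN = "<unk>"                      # hypothetical out-of-vocabulary token


def tokenize(request):
    """Lowercase a raw request and split it on non-alphanumeric delimiters."""
    return [t for t in re.split(r"[^A-Za-z0-9_]+", request.lower()) if t]


def build_vocabulary(raw_log, min_count=2):
    """Bag of words over the log, minus stop words and likely variables.

    Assumption: a token seen fewer than min_count times, or made only of
    digits, is treated as a variable value and filtered out.
    """
    counts = Counter(tok for line in raw_log for tok in tokenize(line))
    return {
        tok for tok, n in counts.items()
        if n >= min_count and tok not in STOP_WORDS and not tok.isdigit()
    }


def parse(request, vocabulary):
    """Map a request to a token sequence, replacing unknown tokens."""
    return [t if t in vocabulary else UNKNOWN for t in tokenize(request)]
```

For example, with a log of a few benign search requests, `parse` keeps the recurring structural tokens and collapses everything else, including an injected script payload, to the placeholder. The same `parse` function would be reused unchanged in online detection, matching the slide's note that online parsing equals offline parsing.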
  • 17. Model: ZeroWall > Online detection: detect anomalies in real-time requests for manual investigation [Figure: online detection pipeline, steps 1 to 4]
  • 18. Model: ZeroWall 1) Parsing: same as parsing in offline training 2) Translation
  • 19. Model: ZeroWall 3) Detection 4) Investigation (1) BLEU metric: compare the original sequence (token sequence) with the translated sequence (recovered token sequence) (2) Threshold [Larger than the threshold? Yes ⇨ go to step 3; No ⇨ benign] (3) Check whitelist [Not in whitelist? Yes ⇨ go to step 4; No ⇨ benign] (4) Investigation [True attacks ⇨ update WAF/IDS; false alarms ⇨ update whitelist rules]
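The decision flow on slide 19 (threshold, then whitelist, then investigation) might look like the sketch below. It assumes the malicious score has already been computed, models the whitelist as a set of exact token tuples, and records WAF/IDS updates in a plain list; these representations and the function names are hypothetical simplifications, since the slides do not describe the form of the whitelist rules.

```python
def detect(tokens, score, threshold, whitelist):
    """Return 'benign' or 'investigate' for one parsed request."""
    if score <= threshold:          # step 2: score not larger than threshold
        return "benign"
    if tuple(tokens) in whitelist:  # step 3: matches a whitelist rule
        return "benign"
    return "investigate"            # step 4: hand over to an operator


def record_investigation(tokens, is_true_attack, waf_updates, whitelist):
    """Feed the operator's verdict back into the pipeline.

    True attacks are queued as WAF/IDS updates; false alarms become
    whitelist rules so the same request is not reported again.
    """
    if is_true_attack:
        waf_updates.append(tuple(tokens))
    else:
        whitelist.add(tuple(tokens))
```

Once a false alarm has been added to the whitelist, the same token sequence is classified as benign on the next pass even though its score still exceeds the threshold.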
  • 20. Experiments > Data - 8 real-world traces from an Internet company - Over 1.4 billion requests in a week > Overview - Captured 28 different types of zero-day attacks - These account for 141,583 zero-day attack requests in total. Table notes: B2M = ratio of benign to malicious (in WAF); B2Z = ratio of benign to zero-day
  • 21. Experiments > Baselines & labels 1) Unsupervised approaches: SAE (stacked auto-encoder), HMM and DFA (deterministic finite automata); use the data filtered by the WAF as the training set 2) Supervised approaches: CNN, RNN and DT; use all data (allowed/dropped) as the training set and the WAF results as labels
  • 22. Experiments > Evaluation Results
  • 23. Experiments > Zero-day case - This attack is detected by ZeroWall, CNN, and RNN - WAFs are usually based on keywords, e.g., eval, request, select, and execute - ZeroWall is based on an “understanding” of benign requests - The structure of this zero-day attack request is more like a programming language [Figure: results for ZeroWall and for CNN and RNN]
  • 24. Experiments > Whitelist - Purpose: to mitigate false alarms - The number of whitelist rules refers to how many whitelist rules are added each day, based on the FPs labeled on that day (no rules are applied on 0602, since it is the first day of the testing set) - The results show that the whitelist reduces the number of FPs with low overhead (the number of rules is very small) - Based on these results, we believe ZeroWall is practical for real-world deployment
  • 25. Summary 1) Presents ZeroWall, a zero-day Web attack detection system - Augments existing signature-based WAFs - Uses an encoder-decoder network to learn patterns from normal requests - Uses a Self-Translate Machine and the BLEU metric 2) Deployed in the wild - Over 1.4 billion requests - Captured 28 different types of zero-day attacks (over 100K zero-day attack requests) - Low overhead