Oral presentation of the Seq2Seq model for real-time intrusion detection at the 31st Annual IEEE Canadian Conference on Electrical and Computer Engineering (CCECE 2018).
Abstract:
Network intrusions can be modeled as anomalies in network traffic in which the order of packets and their attributes deviate from regular traffic. Algorithms that predict the next sequence of events from previous sequences are a promising avenue for detecting such anomalies. In this paper, we present a novel multi-attribute model that predicts a network packet sequence from previous packets using a sequence-to-sequence (Seq2Seq) encoder-decoder model. The model is trained on an attack-free dataset to learn the normal sequence of packets in TCP connections and is then used to detect anomalous packets in TCP traffic. We show that on the DARPA 1999 dataset, the proposed multi-attribute Seq2Seq model detects anomalous raw TCP packets that are part of intrusions with 97% accuracy. It can also detect selected intrusions in real time with 100% accuracy, and it outperforms existing algorithms based on recurrent neural network models such as LSTM.
1. Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detection in Network Traffic
Gobinath Loganathan∗ , Jagath Samarabandu† and Xianbin Wang‡
Department of Electrical and Computer Engineering
The University of Western Ontario
London, Ontario, N6A 5B9, Canada
∗ lgobinat@uwo.ca, † jagath@uwo.ca, ‡ xianbin.wang@uwo.ca
3. Introduction
● Network Intrusion - Intentional violation of expected behavior or protocol rules
● A network rule can be defined as a sequence of packets
● Network Intrusion → Anomalous sequential order of packets
Image Credits: https://www.cisco.com/c/en/us/about/press/internet-protocol-journal/back-issues/table-contents-34/syn-flooding-attacks.html
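As a minimal illustration of a rule expressed as a packet sequence, the sketch below checks whether a connection opens with the TCP three-way handshake. Reducing each packet to a flag label and the name `follows_handshake` are assumptions made for this example, not part of the presented system.

```python
# The TCP three-way handshake written as an expected sequence of packets.
# Each packet is reduced to a flag label for illustration only.
HANDSHAKE = ["SYN", "SYN-ACK", "ACK"]


def follows_handshake(flags):
    """Return True if the connection starts with SYN, SYN-ACK, ACK."""
    return flags[:len(HANDSHAKE)] == HANDSHAKE


normal = ["SYN", "SYN-ACK", "ACK", "PSH-ACK", "FIN"]
syn_flood = ["SYN", "SYN", "SYN", "SYN"]

normal_ok = follows_handshake(normal)
flood_ok = follows_handshake(syn_flood)
```

A hand-written rule like this covers one pattern only; the model presented here instead learns the expected order from attack-free traffic.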
4. Problem
● Flattened datasets do not capture the sequential relationship
○ KDD 1999 [1] contains 42 attributes
■ Duration: length (number of seconds) of the connection
■ Count: number of connections to the same host in the past two seconds
● An Intrusion Detection System should not wait until a connection completes
5. Bontemps’ Solution
● Look for anomalous sequences in a stream of packets
● Train a machine learner using legitimate traffic
○ Anomalies are defined based on prediction error
● Bontemps et al. used Long Short-Term Memory (LSTM) model [4]
○ Trained an LSTM Recurrent Neural Network (RNN) using normal traffic from DARPA 1999
○ 100% True Positives with 63 False Positives
○ Evaluated on the Neptune DoS attack only
6. Sequence To Sequence (Seq2Seq) Model
● An encoder-decoder model developed by Luong et al. using LSTM [2]
Image Credits: https://www.tensorflow.org/tutorials/seq2seq
7. Seq2Seq Model for Intrusion Detection
● Consider a connection C = {y1, y2, y3, y4… yn}
● Encoder input I1 = {y1, y2, y3}
● Decoder input I2 = {EOC,y4,y5,y6… yn}
● Decoder output: O = {y4’, y5’, y6’… yn’, EOC’}
● Prediction error: E = Diff(O, {y4,y5,y6… yn,EOC})
● Attack: IF E > Threshold
[Figure: Seq2Seq model taking inputs I1 and I2 and producing output O]
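The split into encoder input, decoder input and target, together with the threshold test, can be sketched as follows. The slide does not define Diff, so this sketch assumes it is the fraction of mismatched positions; the function names and the example threshold are hypothetical, and the model's prediction is written by hand in place of a trained network.

```python
EOC = "EOC"  # end-of-connection marker, as on the slide


def split_connection(packets, encoder_len=3):
    """Return (I1, I2, target) for one connection C = {y1 ... yn}."""
    encoder_input = packets[:encoder_len]   # I1 = {y1, y2, y3}
    rest = packets[encoder_len:]            # {y4 ... yn}
    decoder_input = [EOC] + rest            # I2 = {EOC, y4 ... yn}
    target = rest + [EOC]                   # {y4 ... yn, EOC}
    return encoder_input, decoder_input, target


def prediction_error(predicted, target):
    """Assumed Diff: fraction of positions where the prediction is wrong."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    mismatches = sum(p != t for p, t in zip(predicted, target))
    return mismatches / len(target)


def is_attack(predicted, target, threshold):
    """Flag the connection when the error E exceeds the threshold."""
    return prediction_error(predicted, target) > threshold


connection = ["y1", "y2", "y3", "y4", "y5", "y6"]
enc, dec, tgt = split_connection(connection)

# Hand-written stand-in for the decoder output O of a trained model.
predicted = ["y4", "yX", "y6", EOC]
error = prediction_error(predicted, tgt)
flagged = is_attack(predicted, tgt, threshold=0.2)  # threshold is illustrative
```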
8. Seq2Seq Model for Intrusion Detection
● Neural Machine Translation (NMT)
○ Sequence: A sentence (Meaningfully ordered words)
○ Element: A word (1 dimension)
○ Encoding: One-hot encoding; ideally, the vector size equals the number of words in the language
● Intrusion Detection
○ Sequence: A connection (Meaningfully ordered packets)
○ Element: A network packet (multi-dimension)
○ Encoding: One-hot
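One plausible way to one-hot encode a multi-attribute packet is to encode each attribute separately and concatenate the results. The sketch below assumes that approach. The attribute names and vocabularies are hypothetical, since the slides do not list which attributes were selected.

```python
# Hypothetical attribute vocabularies; the slides do not name the real ones.
VOCAB = {
    "flags": ["SYN", "SYN-ACK", "ACK", "FIN", "RST"],
    "direction": ["client_to_server", "server_to_client"],
}


def one_hot(value, vocabulary):
    """Return a list with a single 1 at the index of value."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(value)] = 1
    return vec


def encode_packet(packet):
    """Concatenate the one-hot vectors of every attribute in VOCAB order."""
    encoded = []
    for name, vocabulary in VOCAB.items():
        encoded.extend(one_hot(packet[name], vocabulary))
    return encoded


syn = {"flags": "SYN", "direction": "client_to_server"}
encoded_syn = encode_packet(syn)
```

Unlike a word in NMT, which needs one vector, a packet here yields one segment per attribute, so the vector length is the sum of the vocabulary sizes.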
9. Methodology
● Built a multi-attribute seq2seq model for intrusion detection
● Trained the model using attack-free TCP traffic from the DARPA 1999 dataset [3]
○ Packets were split into connections
○ Connections with fewer than 4 packets were ignored
○ Connections with more than 60 packets were truncated to 60 (96.96% of connections had fewer than 60)
○ Connections with 4 to 59 packets were padded with empty packets
○ Selected attributes were encoded into one-hot vectors
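The preprocessing steps above can be sketched as below. This is an illustration under assumptions: packets arrive already tagged with a connection identifier, and an empty packet is represented by `None`. Neither detail is specified in the slides, and all names are hypothetical.

```python
from collections import defaultdict

MIN_PACKETS = 4    # shorter connections are ignored
MAX_PACKETS = 60   # longer connections are truncated
EMPTY = None       # assumed stand-in for an "empty packet"


def group_by_connection(packets):
    """Group (connection_id, packet) pairs into per-connection lists."""
    connections = defaultdict(list)
    for conn_id, packet in packets:
        connections[conn_id].append(packet)
    return connections


def prepare(connection):
    """Return a fixed-length connection, or None if it is too short."""
    if len(connection) < MIN_PACKETS:
        return None
    trimmed = connection[:MAX_PACKETS]
    return trimmed + [EMPTY] * (MAX_PACKETS - len(trimmed))


def prepare_all(packets):
    """Apply grouping, filtering, truncation and padding to a packet stream."""
    prepared = {}
    for conn_id, connection in group_by_connection(packets).items():
        fixed = prepare(connection)
        if fixed is not None:
            prepared[conn_id] = fixed
    return prepared


stream = (
    [("a", i) for i in range(3)]     # too short, dropped
    + [("b", i) for i in range(5)]   # padded to 60
    + [("c", i) for i in range(70)]  # truncated to 60
)
result = prepare_all(stream)
```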
10. Test A - Batch Processing
● Dataset: DARPA 1999 pcaps → Packets between same source and destination
● Model determines the end of connection
○ Decoder reached EOC
○ Reached Ⲧ number of packets (100 in our case)
● Hypothesis
○ Model reached the limit Ⲧ → Sequence has no connection or connection has more than Ⲧ packets
○ High accuracy → Packets follow the standard flow
○ Ⲧ packets and Low accuracy → Anomaly
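The hypothesis above can be read as a small decision rule, sketched below. The cutoff separating high from low accuracy is a hypothetical parameter, because the slide gives no value for Test A. The case of low accuracy before the limit is reached is not covered on the slide, so the sketch labels it "undecided".

```python
T_LIMIT = 100  # packet limit from the slide


def accuracy(predicted, actual):
    """Fraction of actual packets that were predicted correctly."""
    if not actual:
        return 0.0
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def classify(predicted, actual, accuracy_cutoff):
    """Label one sequence; accuracy_cutoff is an assumed parameter."""
    acc = accuracy(predicted, actual)
    reached_limit = len(actual) >= T_LIMIT
    if acc >= accuracy_cutoff:
        return "normal"      # packets follow the standard flow
    if reached_limit:
        return "anomaly"     # limit reached with low accuracy
    return "undecided"       # case not described on the slide


label_bad = classify(["a"] * 100, ["b"] * 100, accuracy_cutoff=0.5)
label_good = classify(["a"] * 5, ["a"] * 5, accuracy_cutoff=0.5)
```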
12. Test B - Real-time Processing
● Dataset: DARPA 1999 pcaps → Packets between same source and destination
● The system raises an alarm if the average accuracy of predicted packets is < 12.25%
● Result:
○ Attacks: Neptune and Port Scan
○ Anomalous packets: 97.02% detection ratio and 0.07% False Alarms
○ Attack detection: 100% true positives with 1 false alarm
■ LSTM RNN by Bontemps et al. gives 100% TP and 63 FP for the Neptune attack in DARPA 1999 [4]
Attack detection: 100% TP & Anomalous packet detection: 90% TP
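The alarm rule can be sketched as a running average compared against the 12.25% threshold. How the average is taken is not stated on the slide. This sketch assumes a cumulative average over all packets seen so far for one source and destination pair, with each packet's accuracy given as a value between 0 and 1. The class name is hypothetical.

```python
ALARM_THRESHOLD = 0.1225  # 12.25% from the slide


class RealTimeMonitor:
    """Tracks the average prediction accuracy of one packet stream."""

    def __init__(self, threshold=ALARM_THRESHOLD):
        self.threshold = threshold
        self.total = 0.0
        self.count = 0

    def observe(self, packet_accuracy):
        """Add one packet's accuracy; return True if the alarm fires."""
        self.total += packet_accuracy
        self.count += 1
        return self.total / self.count < self.threshold


monitor = RealTimeMonitor()
# One well-predicted packet followed by packets the model gets wrong.
accuracies = [0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
alarms = [monitor.observe(a) for a in accuracies]
```

Because the decision is made per packet, the alarm can fire before the connection completes, which is the point of the real-time test.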
13. Conclusion
● Multi-attribute Seq2Seq model for real-time intrusion detection
● Select “Ⲧ” based on average number of packets per connection in your network
● Progress:
○ Trained the model on UDP packets
○ Built an Intrusion Detection System (IDS) using the proposed model
14. References
1. University of California, “KDD Cup 1999 Data,” May 2018. [Online]. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
2. M.-T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” in Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2015, pp. 1412–1421. [Online]. Available: http://aclweb.org/anthology/D15-1166
3. R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, “The 1999 DARPA off-line intrusion detection evaluation,” Comput. Netw., vol. 34, no. 4, pp. 579–595, 2000. [Online]. Available: http://dx.doi.org/10.1016/S1389-1286(00)00139-0
4. L. Bontemps, V. L. Cao, J. McDermott, and N. Le-Khac, “Collective anomaly detection based on long short term memory recurrent neural network,” 2017. [Online]. Available: http://arxiv.org/abs/1703.09752
15. Acknowledgement
● We gratefully acknowledge our financial supporters
○ Western Engineering
○ Natural Sciences and Engineering Research Council of Canada (NSERC)