Intrusion detection using data mining



Published in: Education



  1. By Shishir Shandilya (0610101041), Rajesh Ghildiyal (06180101036), and Balbeer Singh Rawat (06180101006), under the guidance of Mr. Ajit Singh
  2. Problem Definition
     - An intrusion detection system (IDS) is an important part of the security management system for computers and networks. It tries to detect break-ins or break-in attempts.
     - Approaches to the solution:
       - Signature-based
       - Anomaly-based
  3. Types of Intrusion Detection
     - Classification I
       - Real-time
       - After-the-fact (offline)
     - Classification II
       - Host-based
       - Network-based
  4. Approaches to IDS
     - Signature-based
       - Concept: model well-known attacks and use these known patterns to identify intrusions.
       - Pros and cons: specific to known attacks; cannot extend to unknown intrusion patterns (false negatives).
     - Anomaly-based
       - Concept: model the normal behavior of the system and flag deviations from the normal pattern as intrusions.
       - Pros and cons: ordinary changes, for example in traffic, may lead to a higher number of false alarms.
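The contrast between the two approaches can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the pattern strings, the connection-rate baseline, and the three-standard-deviation rule are hypothetical and do not come from the slides.

```python
from statistics import mean, stdev

# Hypothetical signature list: byte patterns of already-known attacks.
SIGNATURES = {"GET /etc/passwd", "' OR 1=1 --"}

def signature_alert(payload):
    """Signature-based: alert only when a known attack pattern is present."""
    return any(sig in payload for sig in SIGNATURES)

def anomaly_alert(value, baseline, k=3.0):
    """Anomaly-based: alert when a measurement deviates from the baseline
    of normal observations by more than k standard deviations."""
    mu, sigma = mean(baseline), stdev(baseline)
    return abs(value - mu) > k * sigma

# Hypothetical "normal" connections-per-second measurements.
normal_conn_rates = [10, 12, 11, 9, 13, 10, 11]

known_attack = signature_alert("GET /etc/passwd HTTP/1.0")   # known pattern
novel_attack = signature_alert("GET /new-exploit HTTP/1.0")  # unknown pattern
busy_period = anomaly_alert(20, normal_conn_rates)           # unusual but possibly harmless
quiet_period = anomaly_alert(11, normal_conn_rates)          # within the normal range
```

The unknown pattern passes the signature check unnoticed (a false negative), while an unusually busy but harmless period trips the anomaly check (a false alarm), which mirrors the trade-off listed on the slide.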
  5. Approaches for IDS
     - Network-based
       - Installed on network switches.
       - Detect some attacks that host-based systems do not, e.g. DoS and fragmented packets.
     - Host-based
       - Installed locally on host machines.
  6. Recommended Approach
     - Neither approach alone provides a complete solution.
     - A hybrid approach: HIDS on local machines together with a powerful NIDS on the switches.
  7. Attack Simulation
     - Types of attacks
       - NIDS: SYN-flood attack
       - HIDS: ssh daemon attack
  8. NIDS: Data Preprocessing
     - Input data
       - tcpdump trace
       - Huge
       - One data record per packet
     - Features extracted (using Perl scripts)
       - Content-based: group records and construct new features corresponding to a single connection.
       - Time-based: add time-window-based information to the connection records (parameter: time window).
       - Connection-based: add connection-window-based information (parameter: time window).
  9. Preprocessing on tcpdump
     - From the tcpdump data we extracted the following fields:
       - src_ip, dst_ip
       - src_port, dst_port
       - num_packets_src_dest / num_packets_dest_src
       - num_ack_src_dst / num_ack_dst_src
       - num_bytes_src_dst / num_bytes_dst_src
       - num_retransmit_src_dst / num_retransmit_dst_src
       - num_pushed_src_dst / num_pushed_dst_src
       - num_syn_src_dst / num_syn_dst_src
       - num_fin_src_dst / num_fin_dst_src
       - connection status
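Grouping per-packet records into connection records with counters like those listed above could look roughly like the following. This is a minimal sketch in Python, not the authors' Perl scripts. The packet dictionary layout (`length`, `flags`) is an assumption, the counter names use the `src_dst` / `dst_src` spelling throughout, and retransmission counts and connection status are left out because they need sequence-number and state tracking.

```python
from collections import defaultdict

def build_connection_records(packets):
    """Aggregate per-packet records into one record per connection.

    Each packet is a dict with assumed keys: src_ip, src_port, dst_ip,
    dst_port, length, and flags (a set such as {"SYN", "ACK"}).
    The first packet seen for a connection fixes the src -> dst direction.
    """
    records = {}
    for p in packets:
        fwd = (p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"])
        rev = (p["dst_ip"], p["dst_port"], p["src_ip"], p["src_port"])
        if fwd in records:
            key, direction = fwd, "src_dst"
        elif rev in records:
            key, direction = rev, "dst_src"
        else:
            key, direction = fwd, "src_dst"
            records[key] = defaultdict(int)
        rec = records[key]
        rec["num_packets_" + direction] += 1
        rec["num_bytes_" + direction] += p["length"]
        # Map TCP flags to the counter names used on the slide.
        for flag, name in (("SYN", "syn"), ("ACK", "ack"),
                           ("FIN", "fin"), ("PSH", "pushed")):
            if flag in p["flags"]:
                rec["num_" + name + "_" + direction] += 1
    return records

# Hypothetical three-packet TCP handshake between two hosts.
packets = [
    {"src_ip": "10.0.0.1", "src_port": 1234, "dst_ip": "10.0.0.2",
     "dst_port": 22, "length": 60, "flags": {"SYN"}},
    {"src_ip": "10.0.0.2", "src_port": 22, "dst_ip": "10.0.0.1",
     "dst_port": 1234, "length": 60, "flags": {"SYN", "ACK"}},
    {"src_ip": "10.0.0.1", "src_port": 1234, "dst_ip": "10.0.0.2",
     "dst_port": 22, "length": 52, "flags": {"ACK"}},
]
records = build_connection_records(packets)
rec = records[("10.0.0.1", 1234, "10.0.0.2", 22)]
```

Keying on the 4-tuple and checking the reversed tuple lets both directions of one TCP connection land in the same record.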
  10. Preprocessing on tcpdump (cont.)
     - Time-window-based features
       - Count_src / count_dst
       - Count_serv_src / count_serv_dest
     - Connection-window-based features
       - Count_src1 / count_dst1
       - Count_serv_src1 / count_serv_dest1
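The slides do not define these counters, so the sketch below rests on an assumed reading: `count_src` and `count_dst` count earlier connections with the same source or destination address, and the `serv` variants additionally require the same destination port. The time-window version looks back a fixed number of seconds; the connection-window version looks back a fixed number of connections. The record layout and parameter names are hypothetical.

```python
def window_counts(conns, i, time_window=None, conn_window=None):
    """Count earlier connections that share attributes with conns[i].

    conns is a list of dicts sorted by "time", with assumed keys
    time, src_ip, dst_ip, dst_port. Pass time_window (seconds) for the
    time-based features or conn_window (number of preceding connections)
    for the connection-based ones.
    """
    cur = conns[i]
    if time_window is not None:
        prev = [c for c in conns[:i] if cur["time"] - c["time"] <= time_window]
    else:
        prev = conns[max(0, i - conn_window):i]
    same_src = [c for c in prev if c["src_ip"] == cur["src_ip"]]
    same_dst = [c for c in prev if c["dst_ip"] == cur["dst_ip"]]
    return {
        "count_src": len(same_src),
        "count_dst": len(same_dst),
        "count_serv_src": sum(c["dst_port"] == cur["dst_port"] for c in same_src),
        "count_serv_dest": sum(c["dst_port"] == cur["dst_port"] for c in same_dst),
    }

# Hypothetical connection records, ordered by start time.
conns = [
    {"time": 0.0, "src_ip": "A", "dst_ip": "X", "dst_port": 80},
    {"time": 1.0, "src_ip": "A", "dst_ip": "X", "dst_port": 80},
    {"time": 5.0, "src_ip": "B", "dst_ip": "X", "dst_port": 22},
    {"time": 5.5, "src_ip": "A", "dst_ip": "X", "dst_port": 80},
]
by_time = window_counts(conns, 3, time_window=2.0)
by_conn = window_counts(conns, 3, conn_window=3)
```

A SYN flood would typically show up as a sharp rise in the destination-side counts within a short time window, which is presumably why such features were added to the connection records.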
  11. NIDS: Data Mining Technique
     - Outlier detection
       - Clustering-based approach (K-means)
         - Inputs: outlier threshold, preprocessed dataset
       - K-NN-based approach
         - Inputs: distance threshold, preprocessed dataset
     - Results
       - Clustering did not give good results, because of limited data.
       - K-NN raised alarms.
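The slides do not say which K-NN scoring rule was used. One common variant, assumed here, flags a record as an outlier when the distance to its k-th nearest neighbour exceeds the distance threshold. The function name and the sample points are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def knn_outliers(points, k, distance_threshold):
    """Return the indices of points whose distance to their k-th nearest
    neighbour is greater than distance_threshold.

    points is a list of equal-length numeric tuples with more than k entries.
    """
    outliers = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if dists[k - 1] > distance_threshold:
            outliers.append(i)
    return outliers

# Hypothetical 2-feature records: four similar ones and one far away.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (50, 1)]
flagged = knn_outliers(points, k=2, distance_threshold=5.0)
```

This brute-force version compares every pair of records, so it suits small samples only. Features on very different scales (byte counts versus flag counts, for example) would normally be normalized first so that one feature does not dominate the distance.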
  12. HIDS: Data Preprocessing
     - Input data
       - "strace" system call logs for a particular process (sshd)
       - One data record per system call
       - Sliding-window size for grouping
     - Features extracted (using Perl scripts)
       - Slide the window over the trace to generate the possible sequences of system calls.
  13. HIDS: Data Preprocessing (cont.)
     - Example trace: a d f g a e d a e b s d e a
     - Sequences produced by sliding a window of size 4 over the trace:
       a d f g
       d f g a
       f g a e
       g a e d
       a e d a
       e d a e
       d a e b
       a e b s
       e b s d
       b s d e
       s d e a
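The window listing above can be reproduced with a few lines of Python. This is a sketch rather than the authors' Perl script; the function name is hypothetical, and the window size of 4 is read off the example.

```python
def sliding_windows(trace, size):
    """Return every run of `size` consecutive system calls in the trace."""
    return [tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)]

# The example trace from the slide, one letter per system call.
trace = "a d f g a e d a e b s d e a".split()
windows = sliding_windows(trace, 4)

for w in windows:
    print(" ".join(w))
```

A trace of 14 calls with a window of 4 yields 14 - 4 + 1 = 11 sequences, matching the listing.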
  14. Data Mining Technique Used
     - Learning to predict system calls
       - Predict the i-th system call for each test record <p1, p2, p3>.
       - Done using classification (decision trees).
     - Anomaly detection
       - Use the misclassification score to detect anomalies.
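The idea of scoring a trace by how often the next system call is mispredicted can be sketched as follows. To keep the example dependency-free, a simple lookup table (the most frequent next call for each `<p1, p2, p3>` prefix) stands in for the decision tree classifier named on the slide, so this is a simplification and not the authors' model. All function names and the test windows are hypothetical.

```python
from collections import Counter, defaultdict

def train_predictor(windows):
    """Learn the most frequent next call for each prefix (p1, p2, p3).

    A lookup table used here as a stand-in for a decision tree.
    """
    table = defaultdict(Counter)
    for *prefix, target in windows:
        table[tuple(prefix)][target] += 1
    return {prefix: counts.most_common(1)[0][0]
            for prefix, counts in table.items()}

def misclassification_score(model, windows):
    """Fraction of windows whose last call differs from the prediction.

    Prefixes never seen during training count as mispredictions.
    """
    if not windows:
        return 0.0
    misses = sum(model.get(tuple(w[:-1])) != w[-1] for w in windows)
    return misses / len(windows)

# Train on windows of size 4 taken from a trace of normal behaviour.
normal = "a d f g a e d a e b s d e a".split()
train = [tuple(normal[i:i + 4]) for i in range(len(normal) - 3)]
model = train_predictor(train)

# Hypothetical test windows: two follow the learned pattern, two do not.
test = [("a", "d", "f", "g"), ("d", "f", "g", "x"),
        ("x", "y", "z", "a"), ("g", "a", "e", "d")]
score_normal = misclassification_score(model, train)
score_test = misclassification_score(model, test)
```

A trace would then be flagged as anomalous when its score exceeds a chosen threshold; the slides do not state which threshold was used.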
  15. Literature Survey
     - Types of attacks (host-based and network-based)
     - Techniques
       - Association rules and frequent episode rules over host-based and network-based data
       - Outlier detection using clustering
       - Classification
  16. Future Work
     - NIDS: make the threshold distance a configurable parameter of the K-means algorithm used.
     - HIDS: try out meta-learning algorithms for classification.
     - A small user interface for configuring parameters.
  17. References
     - W. Lee, S. Stolfo, K. Mok, "Mining in a Data-flow Environment: Experience in Network Intrusion Detection".
     - W. Lee, S. Stolfo, K. Mok, "Mining Audit Data to Build Intrusion Detection Models".
     - W. Lee, S. Stolfo, "Data Mining Approaches for Intrusion Detection".
     - A. Lazarevic, A. Ozgur, L. Ertoz, J. Srivastava, V. Kumar, "A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection".
     - Min Qin, Kai Hwang, "Anomaly Intrusion Detection by Internet Datamining of Traffic Episodes".
     - Kristopher Kendall, "A Database of Computer Attacks for the Evaluation of Intrusion Detection Systems", thesis.