Amm Icict 12 2005

This is a presentation I gave at a conference organized by ITI. It covers my Master's work: adapting data mining techniques for intrusion detection.

  1. AN ARCHITECTURE FOR MINING THE EGYPTIAN E-GOVERNMENT NETWORK TRAFFIC FOR INTRUSION DETECTION
     Prepared by: Mervat M. Fahmy
     Supervised by: Dr. A. M. Riad, Dr. M. A. Sharkawy
     Mansoura University, Faculty of Computer & Information Sciences, Information Systems Dept.
     06/07/09
  2. Agenda
     • Intrusion detection systems
       • Components
       • Measures of Performance
     • Data mining as a means for intrusion detection
     • Our IDS architecture (MEGNTID)
       • The Egyptian E-Government environment
       • The MEGNTID architecture
       • The clustering algorithm
       • MEGNTID anticipated benefits
     • Proof of Concept implementation & To Do List
  3. Intrusion Detection Systems
     • What are Intrusion Detection Systems?
     • Software or hardware products that automate the process of monitoring and analyzing the events occurring in a computer or network system for signs of intrusions, which are ‘attempts to compromise confidentiality, integrity, or availability of the system.’
  4. Components of Intrusion Detection Systems
     IDS components:
     • Data sources
       • Network Monitoring: capture and analyze network packets using sensors placed at various points in a network, and report attacks to a management console.
       • Host Monitoring: operate on data collected from individual computer systems (OS audit trails, system logs, etc.).
       • Application Monitoring: analyze the events occurring within a software application that are registered in the application’s transaction log files.
     • Analysis Approach
       • Misuse Detection: analyzing system activity to find events that match predefined patterns (signatures) describing known attacks.
       • Anomaly Detection: identifying abnormal behavior (anomalies) on a host or network. The main assumption is ‘attacks are different from legitimate activity and can be detected by systems that identify these differences’.
     • Response Mechanism
       • Active
       • Passive
  5. Components of intrusion detection systems
     • The analysis approach is the core of the IDS.

     |                   | Misuse Detection          | Anomaly Detection            |
     |-------------------|---------------------------|------------------------------|
     | Basic Approach    | Attack Signatures         | Normal Profiles              |
     | Correct Detection | High for recorded attacks | Depends on sensitivity level |
     | Wrong Detection   | Low                       | High                         |
     | Missed Attacks    | Can be high               | Depends on sensitivity level |
  6. Measures of Performance
     • IDSs have three main performance measures:
       • Detection Rate
       • False Positives Rate
       • False Negatives Rate
     [Figure labels: All Data Instances, Detected Intrusions, Actual Intrusions, Correct Intrusions, False Positives, False Negatives]
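The three measures on slide 6 can be computed from per-instance labels. The sketch below is only an illustration under assumed definitions, since the slides give no formulas: detection rate is correctly detected intrusions over actual intrusions, false positive rate is false alarms over normal instances, and false negative rate is missed intrusions over actual intrusions. The function name `ids_rates` and the sample data are hypothetical.

```python
def ids_rates(actual, detected):
    """Return (detection_rate, false_positive_rate, false_negative_rate).

    `actual` and `detected` are equal-length sequences of booleans,
    where True means "intrusion" and False means "normal".
    The definitions used here are assumptions, not taken from the slides.
    """
    if len(actual) != len(detected):
        raise ValueError("actual and detected must have the same length")

    pairs = list(zip(actual, detected))
    tp = sum(1 for a, d in pairs if a and d)          # correct intrusions
    fp = sum(1 for a, d in pairs if not a and d)      # false alarms
    fn = sum(1 for a, d in pairs if a and not d)      # missed intrusions
    tn = sum(1 for a, d in pairs if not a and not d)  # correctly ignored

    intrusions = tp + fn
    normals = fp + tn
    detection_rate = tp / intrusions if intrusions else 0.0
    false_positive_rate = fp / normals if normals else 0.0
    false_negative_rate = fn / intrusions if intrusions else 0.0
    return detection_rate, false_positive_rate, false_negative_rate


# Illustrative labels only: 3 actual intrusions among 8 instances.
actual = [True, True, False, False, False, True, False, False]
detected = [True, False, True, False, False, True, False, False]
dr, fpr, fnr = ids_rates(actual, detected)
```

Under these definitions the detection rate and the false negative rate always sum to 1, so reporting both mainly serves readability.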
  7. Agenda (repeated from slide 2)
  8. Data Mining as a means for Intrusion Detection
     • Network intrusion detection problem: finding attacks (rare incidents) among a large volume of traffic data
     • Data mining digs into large datasets to find hidden (rare) interesting patterns
       • A reasonable match!
     • Data mining techniques are mainly used for anomaly detection.
       • Clustering techniques are gaining popularity in anomaly detection.
     • Recent works deployed data mining classification techniques for misuse detection.
       • Problem? Adaptability to new categories of behavior.
  9. Data Mining as a means for Intrusion Detection
     But deploying data mining for NIDSs suffers from:
     • High false alarm rates, because data mining techniques are used mainly for anomaly detection
     • The need for extensive training over attack-free, correctly labeled data instances
     • The tendency to focus on efficient detection rather than on discovering new knowledge about intrusions
  10. Agenda (repeated from slide 2)
  11. The Egyptian e-government as an environment
      [Figure labels: Citizens, Investors, Organizations, Experts, Internet & Telephony, Service Brokers, Security System, E-Government’s Private Network, Ministries & Public Bodies, National Databases]
  12. The Egyptian E-Government environment
      • The Egyptian e-gov network looks similar to the majority of corporate networks:
      • A central point of entry to networked local sites.
      • MEGNTID’s design follows this hierarchy.
      • MEGNTID has high sensitivity to abnormal traffic patterns.
      • It is designed for environments where strong security measures are enforced.
      • This makes it applicable to other corporate networks with the same network architecture and the same security policy.
  13. MEGNTID Architecture
      • The Normal Profiles Building Phase
        • Construct classifiers that can discriminate and predict the local systems' behaviors.
        • Implemented in each local site.
      [Figure labels: Sensor, Traffic Data, Preprocessing and Feature Extraction Engine, Formatted Traffic Data, Relevant Attributes, Traffic Database, Correctly Labeled Data Instances, Local Profile Classification Engine, Correctly Classified Data Instances, Profile Rules Building Engine, Profile Rules, Local System Profile Matching Engine]
  14. MEGNTID Architecture
      • The Real-Time Building Phase
      [Figure labels, grouped by layer:
       Local Layer of Intrusion Detection System (Local Site 1 ... Local Site n): Local Profile Matching Engine, Local Anomaly Detection Engine, Local Response Engine
       Global Layer of Intrusion Detection System: Global Known-Attacks Detection Engine, Global Anomaly Analysis Engine, Global Response Engine
       Other components: Sensor, Preprocessing Engine, Normal Profile Database, Attacks Database
       Data flows: Traffic Data, Unlabeled Traffic Data, Formatted Traffic Data, Normal Data Records, Normal Records, Anomalous Data Records, High-Ranked Anomalies, Highly Suspicious Anomalies, Known Attacks, New Attacks Rules, New Normal Profile Rules]
  15. The clustering algorithm
      [Figure label: Normal Clusters]
  16. The clustering algorithm
      [Figure labels: Updated Cluster, Attack Clusters, New Bursty Attack, New Stealthy Attack, New Normal Cluster]
  17. The clustering algorithm
      • Attack clusters are ‘forced’, since attack instances with the same class are clustered together regardless of the distance function.
      • When a new data instance is labeled normal, it is placed in one of the normal clusters if possible.
      • Data instances that don’t fit into normal clusters are matched against attack clusters.
      • Data instances that don’t fit into attack clusters form their own new clusters.
      • The rate and duration of a cluster's formation determine whether it represents normal or intrusive behavior.
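The assignment order described on slide 17 can be sketched roughly as follows. This is not the MEGNTID implementation: it assumes numeric feature vectors, Euclidean distance to a cluster centroid, and a fixed distance threshold, none of which the slides specify. The names `Cluster` and `assign` are hypothetical, letting later instances join an existing "new" cluster is an added assumption, and the final step (judging a new cluster by its rate and duration of formation) is left out because the slides do not describe the rule.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    kind: str                     # "normal", "attack", or "new"
    label: Optional[str] = None   # attack class, used by forced attack clusters
    members: List[List[float]] = field(default_factory=list)

    def centroid(self) -> List[float]:
        n = len(self.members)
        return [sum(col) / n for col in zip(*self.members)]

    def fits(self, point: List[float], threshold: float) -> bool:
        # Assumed similarity test: Euclidean distance to the centroid.
        return math.dist(self.centroid(), point) <= threshold


def assign(point, clusters, threshold, attack_label=None):
    """Place `point` in a cluster and return that cluster."""
    # Forced attack clusters: same attack class, same cluster, distance ignored.
    if attack_label is not None:
        for c in clusters:
            if c.kind == "attack" and c.label == attack_label:
                c.members.append(point)
                return c
        forced = Cluster("attack", attack_label, [point])
        clusters.append(forced)
        return forced

    # Otherwise try normal clusters first, then attack clusters,
    # then previously formed new clusters (assumption).
    for kind in ("normal", "attack", "new"):
        for c in clusters:
            if c.kind == kind and c.fits(point, threshold):
                c.members.append(point)
                return c

    # Nothing fits: the instance starts its own new cluster.
    fresh = Cluster("new", None, [point])
    clusters.append(fresh)
    return fresh


clusters = [
    Cluster("normal", None, [[0.0, 0.0], [1.0, 1.0]]),
    Cluster("attack", "portscan", [[10.0, 10.0]]),
]
first = assign([0.5, 0.6], clusters, threshold=2.0)
second = assign([10.5, 10.0], clusters, threshold=2.0)
third = assign([50.0, 50.0], clusters, threshold=2.0)
forced = assign([1.0, 99.0], clusters, threshold=2.0, attack_label="portscan")
```

Checking normal clusters before attack clusters mirrors the order of the bullets; a real system would also record when each member arrived so that bursty and stealthy clusters can be told apart.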
  18. MEGNTID Anticipated Benefits
      MEGNTID should provide the following:
      • Low false alarm rate: as the volume of data to be analyzed is reduced, the probability of wrongly classifying data as intrusions is reduced.
      • High detection rate: known intrusions will be detected as early as possible, and anomalous data is not flagged unless it is highly suspicious.
      • Continuous adaptation to the changing environment and to new intrusion techniques.
      • Clusters can be studied to define the characteristics of new attacks and their rate of change over time.
  19. Agenda (repeated from slide 2)
  20. Proof of Concept implementation
      • We use the DARPA 1999 dataset.
      • This dataset contains two weeks of attack-free traffic and three weeks of mixed traffic.
      • Week 1 Monday outside dump is used to compute the normal profile.
      • Week 4 Monday outside dump is used to train the detection module.
      • Week 5 Monday outside dump is used to test MEGNTID and fine-tune its parameters.
  21. Proof of Concept implementation
      • To format the data for the mining task, we extended the functionality of the JpcapDumper sensor.
      • We filtered the dump files to keep only TCP connections.
      • Only packet header information is analyzed.
      • Week 1 Monday dump contains 1247366 packets.
      • Week 4 Monday dump contains 1167662 packets, of which 11819 are attack instances.
      • Week 5 Monday dump contains 1273177 packets.
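As a quick arithmetic check on the Week 4 Monday figures from slide 21, the attack instances make up about 1% of the packets, which shows how rare the target class is in the training dump. The variable names below are illustrative only.

```python
# Figures quoted on slide 21 for the Week 4 Monday dump.
week4_packets = 1_167_662
week4_attacks = 11_819

week4_normal = week4_packets - week4_attacks
attack_share = week4_attacks / week4_packets

print(f"normal packets: {week4_normal}")      # 1155843
print(f"attack share:   {attack_share:.2%}")  # 1.01%
```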
  22. Proof of Concept implementation
      • The relevant features selected by attribute information gain are: source and destination addresses, source and destination ports, packet length, packet flags, time since last packet, and window size (totaling 21).
      • The normal profile is built using the Apriori algorithm, resulting in 37 effective rules.
      • We have collected 399 attack signatures from the Internet and formatted them as if-then rules.
        • Ex. If ((src_Port=5400) and (dst_Port>=1024) and (dst_Port<=65535) and (SYN_Flag=true) and (ACK_Flag=true)) then Class=bladerunner_trojan
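The example signature on slide 22 can be expressed as a predicate over packet header fields. The sketch below is a hypothetical encoding: the dictionary keys (`src_port`, `dst_port`, `syn`, `ack`), the `RULES` list, and the `classify` function are assumptions made for illustration, and only the single rule shown on the slide is included.

```python
# Each rule pairs a class label with a predicate over a packet-header dict.
RULES = [
    (
        "bladerunner_trojan",
        lambda p: (
            p["src_port"] == 5400
            and 1024 <= p["dst_port"] <= 65535
            and p["syn"]
            and p["ack"]
        ),
    ),
]


def classify(packet, rules=RULES):
    """Return the label of the first matching rule, or None if none match."""
    for label, predicate in rules:
        if predicate(packet):
            return label
    return None


match = classify({"src_port": 5400, "dst_port": 2048, "syn": True, "ack": True})
low_port = classify({"src_port": 5400, "dst_port": 80, "syn": True, "ack": True})
no_ack = classify({"src_port": 5400, "dst_port": 2048, "syn": True, "ack": False})
```

Keeping the rules as data rather than hard-coded branches makes it straightforward to load a larger signature set, such as the 399 signatures mentioned above, from a file.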
  23. Proof of Concept implementation
      • We used the attack instances from the training dump to build an initial set of 7 clusters.
      • We want to see whether the structure and number of the clusters change during testing.
      • The algorithm has parameters that need to be preset (similarity threshold, short and long time windows).
      • We chose to set the threshold to 2.32σ, as it was proven that 99% of cluster instances stay within a circle with this radius (Yu Guan et al., IEEE, 2003).
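The 2.32σ value on slide 23 is numerically close to the one-sided 99% quantile of a standard normal distribution (about 2.326), which is one way to read the 99% figure. The sketch below checks that quantile with Python's standard library and applies a 2.32σ radius test to a cluster. Taking σ as the root-mean-square distance of members from the centroid is an assumption made here; the slides do not say how σ is computed, and `cluster_sigma` and `within_threshold` are hypothetical names.

```python
import math
from statistics import NormalDist

# One-sided 99% quantile of N(0, 1), approximately 2.326.
z99 = NormalDist().inv_cdf(0.99)


def centroid_of(members):
    return [sum(col) / len(members) for col in zip(*members)]


def cluster_sigma(members):
    """Root-mean-square distance of members from the centroid (assumed sigma)."""
    center = centroid_of(members)
    mean_sq = sum(math.dist(center, m) ** 2 for m in members) / len(members)
    return math.sqrt(mean_sq)


def within_threshold(point, members, k=2.32):
    """True if `point` lies within k * sigma of the cluster centroid."""
    return math.dist(centroid_of(members), point) <= k * cluster_sigma(members)


# Illustrative cluster: centroid (1, 1), sigma = sqrt(2), radius about 3.28.
members = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
inside = within_threshold([1.0, 4.0], members)   # distance 3.0
outside = within_threshold([1.0, 5.0], members)  # distance 4.0
```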
  24. To Do List
      • Try different values for the short and long time windows until reaching a reasonable performance.
      • Study the new clusters and see if their data instances are logically related (are the clusters logically valid?).
      • Link the system modules together in an integrated environment.
      • Record results and compare them to other systems.
  25. Thank you.
