Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

To use the concept of Data Mining and machine learning concept for Cyber security and Intrusion detection


Published on

Regarding to survey on intrusion detection in cyber security using Data mining and machine learning techniques

Published in: Engineering
  • Login to see the comments

To use the concept of Data Mining and machine learning concept for Cyber security and Intrusion detection

  1. 1. Presented By : Nishant D. Mehta Class : TE-A Roll No. : TA1326 Subject : Seminar & Technical Communication Laboratory Guidance : Prof. Ashwini Taksal To Use The Concept of Data Mining and Machine Learning Methods for Cyber Security & Intrusion DetectionMonday, December 5, 2016 1 J.S.P.M.’s Imperial College of Engineering & Research Department of Computer Engineering
  2. 2. Monday, December 5, 2016 2 Agenda  Abstract  Introduction  Objectives  Literature Review  Overview And Study  Conclusion  Future Scope  References
  3. 3. Abstract Monday, December 5, 2016 3 The main focus of this project is on survey of machine learning (ML) and data mining (DM) methods for cyber analytics in support of intrusion detection. The data are so important in ML/DM approaches, some well-known cyber data sets used in ML/DM are described. Discussion of challenges for using ML/DM for cyber security is presented, and some recommendations on when to use a given methods are provided.
  4. 4. Monday, December 5, 2016 4 Introduction  Due to the proliferation of high-speed Internet access, more and more organizations are becoming vulnerable to potential cyber attacks, such as network intrusions.  The ML/DM methods are described, as well as several applications of each method to cyber intrusion detection problems also stated.
  5. 5. Objective Monday, December 5, 2016 5 The main focus is on cyber intrusion detection as it applies to wired networks. With a wired network, an adversary must pass through several layers of defense at firewalls and operating systems, or gain physical access to the network. However, a wireless network can be targeted at any node, so it is naturally more vulnerable to malicious attacks than a wired network. The ML and DM methods covered in this seminar are fully applicable to the intrusion and misuse detection problems in both wired and wireless networks.
  6. 6. Literature Survey Monday, December 5, 2016 6 Sr. No. Name of the author Year of Project Research Paper Name Disadvantages 1. T. T. T. Nguyen, G. Armitage 2008 “A survey of techniques for internet traffic classification using machine learning “ Focused only on IP Flows and Cyber Data. Methods Are not introduced for intrusion detection. 2. P. Garcia-Teodoro , J. Diaz-Verdejo , G. Maciá- Fernández and E. Vázquez 2009 “Anomaly-based network intrusion detection: Techniques, systems and challenges ” Does not present a full set of state-of- the-art machine learning methods.
  7. 7. Literature Survey Monday, December 5, 2016 7 Sr. No. Name of the author Year of Project Research Paper Name Disadvantages 3. A. Sperotto G. Schaffrath R. Sadre, C. Morariu Pras and B. Stiller 2010 “An overview of IP flow- based intrusion detection ” There is no explanation of the technical details of the individual methods 4. S. X. Wu and W. Banzhaf 2010 “The use of computational intelligence in intrusion detection systems: A review” Only Computational Intelligence methods are described, major ML/DM methods such as clustering , decision trees , and rule mining are not included
  8. 8. What is Cyber Crime ? Monday, December 5, 2016 8  Crime committed using a computer and the internet to steal data or information.  The computer used as an object or subject of crime..  Malicious programs.  Illegal imports.  Computer Vandalism.
  9. 9. What is Cyber Security ? Monday, December 5, 2016 9  Set of technologies and processes designed computers, networks and data from unauthorized access, change or destruction.  Composed of computer security system and network security systems.
  10. 10. Continue… Monday, December 5, 2016 10  Cyber Security includes : • Firewall. • Antivirus Software. • Intrusion Detection System (IDS).
  11. 11. Intrusion Detection System Monday, December 5, 2016 11 Fig : Intrusion Detection System
  12. 12. Continue… Monday, December 5, 2016 12  There are three main types of cyber analytics for supporting IDS : 1) Misuse Based. 2) Anomaly Based. 3) Hybrid.
  13. 13. Continue… Monday, December 5, 2016 13  Misuse Based Detection • Designed to detect known attacks by using signatures of those attacks. • Effective detecting known type of attacks without generating false alarms. • Frequent manual updating of data is required. • Cannot detect Novel (Zero-day) attacks.
  14. 14. Continue… Monday, December 5, 2016 14  Anomaly Based Detection • Identifies the anomalies from normal behavior • Able to detect Zero-Day Attack • Profiles of normal activity are customized for every system  Hybrid Detection • Combination of misuse and anomaly detection. • Increases the detection rate and decreases the false alarm generation.
  15. 15. What is Machine Learning And Data Mining ? Monday, December 5, 2016 15  Machine Learning : • Introduced in 1960’s • It gives ability to computers to learn without being explicitly programmed. • Need of goal from domain • There should be three phases :- 1. Training, 2. Validation and 3. Testing.  Data Mining : • Introduced in 1980’s • Focused on discovery of previously unknown and important properties in data. • Used for extracting patterns from data
  16. 16. CRISP- DM Model Monday, December 5, 2016 16 Fig : CRISP-DM Model
  17. 17. Monday, December 5, 2016 17 Cyber Security Data Sets For DM & ML The Cyber Security data sets for DM and ML are given below : a) Packet Level Data b) Netflow Data c) Public Data Sets
  18. 18. Packet Level Data Monday, December 5, 2016 18  Protocols are used for transmission of packet through network.  The network packets are transmitted and received at the physical interface.  Packets are captured by API in computers called as pcap.  For Linux it is Libpcap and for windows it is WinPCap.  Ethernet port have payload called as IP payload.
  19. 19. NetFlow Data Monday, December 5, 2016 19  Introduced as a router feature by Cisco.  Version 5 defines unidirectional flow of packets.  The packet attributes are : ingress interface, source IP address, destination IP address, IP protocol, source port, destination port and type of services.  Netflow includes compressed and preprocessed packets.
  20. 20. Public Data Set Monday, December 5, 2016 20  The Defense Advance Research Projects Agency (DARPA) in 1998 and 1999 data sets are mostly used.  This Data Set has basic features captured by pcap.  DARPA defines four types of attacks in 1998 : DOS Attack, U2R Attack, R2LAttack, Probe or Scan.
  21. 21. Monday, December 5, 2016 21 Cyber Security Methods For DM & ML  Artificial Neural Networks (ANN)  Association Rules & Fuzzy Association Rules  Bayesian Network  Clustering  Decision Tree  Ensemble Learning  Evolutionary Computation  Hidden Markov Model  Inductive Learning  Nalve Bayes  Sequential Pattern Mining  Support Vector Machine
  22. 22. Monday, December 5, 2016 22 Artificial Neural Network  Network of Neurons  Output of one node is input to other.  ANN can be used as a multi-category classifier of intrusion detection  Data processing stage used to select 9 features: protocol ID, source port, destination port, source address, destination address, ICMP type, ICMP code, raw data length and raw data.
  23. 23. Association Rule & Fuzzy Association Rule Monday, December 5, 2016 23
  24. 24. Monday, December 5, 2016 24 Fig : Bayesian Network Bayesian Network
  25. 25. Hidden Markov Model Monday, December 5, 2016 25 Fig : Hidden Markov Model
  26. 26. Conclusion Monday, December 5, 2016 26  This seminar describes the literature review of ML and DM methods used for Cyber Security. Different ML and DM techniques in the cyber domain can be used for both Misuse Detection and Anomaly Detection.  There are some peculiarities of the cyber problem that make ML and DM methods more difficult to use.  They are especially related to how often the model needs to be retrained.
  27. 27. References Monday, December 5, 2016 27  A. Mukkamala, and A. Sung, and A. Abraham, “Cyber security challenges: designing efficient intrusion detection systems and antivirus tools,” Vemuri, V. Rao, Enhancing Computer Security with Smart Technology.(Auerbach, 2006) (2005), pp. 125–163  T. T. T. Nguyen, and G. Armitage, “A survey of techniques for internet traffic classification using machine learning,” IEEE Communications Surveys & Tutorials, no. 4, 2008, pp. 56–76  P. Garcia-Teodoro, J. Diaz-Verdejo, G. Maciá- Fernández, and E. Vázquez, “Anomaly-based network intrusion detection: Techniques, systems and challenges,” Computers & security 28, no. 1, 2009, pp. 18–28
  28. 28. References Monday, December 5, 2016 28  S. X. Wu, and W. Banzhaf, “The use of computational intelligence in intrusion detection systems: A review,” Applied Soft Computing 10, no. 1, 2010, pp. 1–35  Y. Zhang, L. Wenke, and Yi-An Huang, “Intrusion detection techniques for mobile wireless networks,” Wireless Networks 9.5, 2003, pp. 545-556.
  29. 29. Questions ? Monday, December 5, 2016 29
  30. 30. Thank You !!! Monday, December 5, 2016 30