Presentation at CDIS Spring Conference 2024.
The ubiquity and evolving nature of cyber attacks is of growing concern to industry and society. In response, the automation of security processes and functions is the focus of many current research efforts. In this talk we will present a framework for automated network intrusion response, in which we model the interaction between an attacker and a defender as a partially observed Markov game. Within this framework, reinforcement learning enables the controlled evolution of attack and defense strategies towards a Nash equilibrium through the process of self-play. To realize and experiment with the self-play process on a practical IT infrastructure, we have developed a software platform for creating digital twins, which provide two key functions for our framework: (i) a safe and realistic test environment; and (ii) a tool for evaluation that enables closed-loop learning of security strategies.
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Kim Hammar
We study automated intrusion response and formulate the interaction between an attacker and a defender on an IT infrastructure as a stochastic game where attack and defense strategies evolve through reinforcement learning and self-play. Direct application of reinforcement learning to any non-trivial instantiation of this game is impractical due to the exponential growth of the state and action spaces with the number of components in the infrastructure. We propose a decompositional approach to deal with this challenge and prove that under assumptions generally met in practice the game decomposes into a) additive subgames on the workflow-level that can be optimized independently; and b) subgames on the component-level that satisfy the optimal substructure property. We further show that the optimal defender strategies on the component-level exhibit threshold structures. To solve the decomposed game we develop Decompositional Fictitious Self-Play (\dfsp), an efficient fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that \dfsp outperforms a state-of-the-art algorithm for our use case. To evaluate the learned strategies, we deploy them in a a virtual IT infrastructure in which we run real network intrusions and real response actions. From our experimental investigation we conclude that our approach can produce effective defender strategies for a practical IT infrastructure.
Learning Optimal Intrusion Responses via DecompositionKim Hammar
We study automated intrusion response and formulate the interaction between an attacker and a defender on an IT infrastructure as a stochastic game where attack and defense strategies evolve through reinforcement learning and self-play. Direct application of reinforcement learning to any non-trivial instantiation of this game is impractical due to the exponential growth of the state and action spaces with the number of components in the infrastructure. We propose a decompositional approach to deal with this challenge and prove that under assumptions generally met in practice, the game decomposes into a) additive subgames on the workflow-level that can be optimized independently; and b) subgames on the component-level that satisfy the optimal substructure property. We further show that the optimal defender strategies on the component-level exhibit threshold structures. To solve the decomposed game we develop Decompositional Fictitious Self-Play (\dfsp), an efficient fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that \dfsp outperforms a state-of-the-art algorithm for our use case. To evaluate the learned strategies, we deploy them in a a virtual IT infrastructure in which we run real network intrusions and real response actions. From our experimental investigation we conclude that our approach can produce effective defender strategies for a practical IT infrastructure.
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Kim Hammar
We study automated intrusion response and formulate the interaction between an attacker and a defender on an IT infrastructure as a stochastic game where attack and defense strategies evolve through reinforcement learning and self-play. Direct application of reinforcement learning to any non-trivial instantiation of this game is impractical due to the exponential growth of the state and action spaces with the number of components in the infrastructure. We propose a decompositional approach to deal with this challenge and prove that under assumptions generally met in practice, the game decomposes into a) additive subgames on the workflow-level that can be optimized independently; and b) subgames on the component-level that satisfy the optimal substructure property. We further show that the optimal defender strategies on the component-level exhibit threshold structures. To solve the decomposed game we develop Decompositional Fictitious Self-Play (\dfsp), an efficient fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that \dfsp outperforms a state-of-the-art algorithm for our use case. To evaluate the learned strategies, we deploy them in a a virtual IT infrastructure in which we run real network intrusions and real response actions. From our experimental investigation we conclude that our approach can produce effective defender strategies for a practical IT infrastructure.
The document discusses a framework for automated intrusion response using reinforcement learning. It involves creating a digital twin of the target infrastructure, learning defender strategies through simulation, and evaluating strategies. The goal is to develop self-learning systems that can optimize intrusion response over time as attacks evolve.
—We present a novel emulation system for creating
high-fidelity digital twins of IT infrastructures. The digital twins
replicate key functionality of the corresponding infrastructures
and allow to play out security scenarios in a safe environment.
We show that this capability can be used to automate the process
of finding effective security policies for a target infrastructure. In
our approach, a digital twin of the target infrastructure is used
to run security scenarios and collect data. The collected data is
then used to instantiate simulations of Markov decision processes
and learn effective policies through reinforcement learning, whose
performances are validated in the digital twin. This closed-loop
learning process executes iteratively and provides continuously
evolving and improving security policies. We apply our approach
to an intrusion response scenario. Our results show that the
digital twin provides the necessary evaluative feedback to learn
near-optimal intrusion response policies.
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...Kim Hammar
1) The document describes a framework for using reinforcement learning and simulation to automatically learn near-optimal intrusion responses for large-scale IT infrastructures.
2) A key challenge is the high sample and computational complexity of scaling reinforcement learning to large infrastructures.
3) The framework addresses this by decomposing the infrastructure into additive subgames and exploiting the optimal substructure property to learn intrusion responses through scalable decomposition methods.
Self-learning systems for cyber securityKim Hammar
The document discusses using reinforcement learning and game theory to develop self-learning systems for cybersecurity. It describes challenges like evolving attacks and complex infrastructure. The approach models network attacks and defense as games and uses reinforcement learning to learn effective security policies. The work focuses on intrusion prevention, using emulation to create models of real systems and identify dynamics models for simulations. This allows training policies through reinforcement learning to automate security tasks and adapt to changing threats.
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Kim Hammar
We study automated intrusion response and formulate the interaction between an attacker and a defender on an IT infrastructure as a stochastic game where attack and defense strategies evolve through reinforcement learning and self-play. Direct application of reinforcement learning to any non-trivial instantiation of this game is impractical due to the exponential growth of the state and action spaces with the number of components in the infrastructure. We propose a decompositional approach to deal with this challenge and prove that under assumptions generally met in practice the game decomposes into a) additive subgames on the workflow-level that can be optimized independently; and b) subgames on the component-level that satisfy the optimal substructure property. We further show that the optimal defender strategies on the component-level exhibit threshold structures. To solve the decomposed game we develop Decompositional Fictitious Self-Play (\dfsp), an efficient fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that \dfsp outperforms a state-of-the-art algorithm for our use case. To evaluate the learned strategies, we deploy them in a a virtual IT infrastructure in which we run real network intrusions and real response actions. From our experimental investigation we conclude that our approach can produce effective defender strategies for a practical IT infrastructure.
Learning Optimal Intrusion Responses via DecompositionKim Hammar
We study automated intrusion response and formulate the interaction between an attacker and a defender on an IT infrastructure as a stochastic game where attack and defense strategies evolve through reinforcement learning and self-play. Direct application of reinforcement learning to any non-trivial instantiation of this game is impractical due to the exponential growth of the state and action spaces with the number of components in the infrastructure. We propose a decompositional approach to deal with this challenge and prove that under assumptions generally met in practice, the game decomposes into a) additive subgames on the workflow-level that can be optimized independently; and b) subgames on the component-level that satisfy the optimal substructure property. We further show that the optimal defender strategies on the component-level exhibit threshold structures. To solve the decomposed game we develop Decompositional Fictitious Self-Play (\dfsp), an efficient fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that \dfsp outperforms a state-of-the-art algorithm for our use case. To evaluate the learned strategies, we deploy them in a a virtual IT infrastructure in which we run real network intrusions and real response actions. From our experimental investigation we conclude that our approach can produce effective defender strategies for a practical IT infrastructure.
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Kim Hammar
We study automated intrusion response and formulate the interaction between an attacker and a defender on an IT infrastructure as a stochastic game where attack and defense strategies evolve through reinforcement learning and self-play. Direct application of reinforcement learning to any non-trivial instantiation of this game is impractical due to the exponential growth of the state and action spaces with the number of components in the infrastructure. We propose a decompositional approach to deal with this challenge and prove that under assumptions generally met in practice, the game decomposes into a) additive subgames on the workflow-level that can be optimized independently; and b) subgames on the component-level that satisfy the optimal substructure property. We further show that the optimal defender strategies on the component-level exhibit threshold structures. To solve the decomposed game we develop Decompositional Fictitious Self-Play (\dfsp), an efficient fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that \dfsp outperforms a state-of-the-art algorithm for our use case. To evaluate the learned strategies, we deploy them in a a virtual IT infrastructure in which we run real network intrusions and real response actions. From our experimental investigation we conclude that our approach can produce effective defender strategies for a practical IT infrastructure.
The document discusses a framework for automated intrusion response using reinforcement learning. It involves creating a digital twin of the target infrastructure, learning defender strategies through simulation, and evaluating strategies. The goal is to develop self-learning systems that can optimize intrusion response over time as attacks evolve.
—We present a novel emulation system for creating
high-fidelity digital twins of IT infrastructures. The digital twins
replicate key functionality of the corresponding infrastructures
and allow to play out security scenarios in a safe environment.
We show that this capability can be used to automate the process
of finding effective security policies for a target infrastructure. In
our approach, a digital twin of the target infrastructure is used
to run security scenarios and collect data. The collected data is
then used to instantiate simulations of Markov decision processes
and learn effective policies through reinforcement learning, whose
performances are validated in the digital twin. This closed-loop
learning process executes iteratively and provides continuously
evolving and improving security policies. We apply our approach
to an intrusion response scenario. Our results show that the
digital twin provides the necessary evaluative feedback to learn
near-optimal intrusion response policies.
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...Kim Hammar
1) The document describes a framework for using reinforcement learning and simulation to automatically learn near-optimal intrusion responses for large-scale IT infrastructures.
2) A key challenge is the high sample and computational complexity of scaling reinforcement learning to large infrastructures.
3) The framework addresses this by decomposing the infrastructure into additive subgames and exploiting the optimal substructure property to learn intrusion responses through scalable decomposition methods.
Self-learning systems for cyber securityKim Hammar
The document discusses using reinforcement learning and game theory to develop self-learning systems for cybersecurity. It describes challenges like evolving attacks and complex infrastructure. The approach models network attacks and defense as games and uses reinforcement learning to learn effective security policies. The work focuses on intrusion prevention, using emulation to create models of real systems and identify dynamics models for simulations. This allows training policies through reinforcement learning to automate security tasks and adapt to changing threats.
Self-Learning Systems for Cyber SecurityKim Hammar
The document discusses using reinforcement learning to develop self-learning systems for cyber security. It proposes modeling network attack and defense as games and using reinforcement learning to learn effective security policies. The approach involves emulating computer infrastructures, creating models from the emulations, and using reinforcement learning and simulations to evaluate policies and estimate models. The goal is to automate security tasks and develop systems that can adapt to changing attack methods.
I will talk about innovation in the area of cyber security analytics - developing machine learning methods to detect and block cyber attacks (e.g. detecting ransomware within 4 seconds of execution and killing the underlying processes). Rather than just focusing on this as a 'black box', I'll pull it apart and talk about how we can use these methods to enable security practitioners (SOC/CIRT etc) to ask and answer questions about 'what' and 'why' these methods are flagging attacks. I'll also talk about resilience of machine learning methods to manipulation and adversarial attacks - how stable these approaches are to diversity and evolution of malware for example.
Research of Intrusion Preventio System based on SnortFrancis Yang
This document is a graduation thesis from Beijing University of Posts and Telecommunications titled "Research of Intrusion Prevention System based on Snort". It designs and implements an intrusion prevention module based on the open-source Snort intrusion detection system. The document introduces intrusion detection systems and intrusion prevention systems. It then analyzes Snort and describes building a Snort environment. Finally, it presents two designs for transforming Snort into an intrusion prevention system using Netfilter and Netlink sockets.
This document discusses using machine learning to detect ransomware through analyzing microbehaviors rather than static signatures. It introduces the concept of using machine learning for cybersecurity and labeling data to help algorithms learn. The document then discusses modeling ransomware behaviors like file system modifications and callbacks. It outlines a plan to take labeled exploit and benign traffic data, extract microbehaviors, use machine learning to detect anomalies, and generate indicators of compromise.
With the growth of computer networking, electronic commerce and web services, security networking systems have become very important to protect infomation and networks againts malicious usage or attacks. In this report, it is designed an Intrusion Detection System using two artificial neural networks: one for Intrusion Detection and the another for Attack Classification.
The document discusses security operation centers (SOCs) and their functions. It describes what a SOC is and its main purpose of monitoring, preventing, detecting, investigating and responding to cyber threats. It outlines the typical roles in a SOC including tier 1, 2 and 3 analysts and security engineers. It also discusses the common tools, skills needed for each role, and types of SOCs such as dedicated, distributed, multifunctional and virtual SOCs.
Intrusion Prevention through Optimal StoppingKim Hammar
1) The document discusses formulating intrusion prevention as an optimal stopping problem where a defender must monitor an infrastructure over discrete time steps to detect an intrusion by an attacker.
2) If an intrusion is detected, the defender can take defensive actions to stop the intrusion at various time steps, with the goal of stopping intrusions optimally based on observations.
3) The document proposes modeling this problem as a partially observable Markov decision process (POMDP) and using reinforcement learning techniques like simulation and policy evaluation to develop automated security prevention policies that can optimally stop intrusions.
The document discusses creating a digital twin of a target infrastructure for self-learning intrusion prevention systems. Key aspects include emulating hosts, network services, vulnerabilities, and network properties using software like Docker containers and NetEm. The digital twin allows modeling the target infrastructure and evaluating security strategies through simulation and reinforcement learning before deployment.
The document discusses self-learning systems for cyber defense. It outlines an approach using network emulation, digital twin modeling, and reinforcement learning to develop self-learning security systems that can automate tasks and adapt to changing attack methods. As an example use case, it examines how this approach could be applied to intrusion prevention by formulating it as an optimal multiple stopping problem and using techniques like stochastic game simulation to learn effective prevention strategies.
A SIMULATION APPROACH TO PREDICATE THE RELIABILITY OF A PERVASIVE SOFTWARE SY...Osama M. Khaled
This document discusses simulating a pervasive software system to predict its reliability. It begins by introducing pervasive computing and its importance. It then describes the reference architecture model, which includes a smart environment conceptual model, smart object model, and pervasive system architecture model. The document outlines the simulation approach, including scenarios, specifications of simulation entities, assumptions, and extreme assumptions. The simulation aims to predict reliability and availability under different scenarios and control variable values. Key metrics like MTBF and MTTR will be measured to calculate reliability and availability.
An efficient HIDS using System Call Traces.pptxSandeep Maurya
Cybercrime has been rising prohibitively in the last decades. Companies and organizations have stepped into different strategies to protect their IT systems from cyber-attacks. A new method for host-based intrusion detection has been explored in this paper using the ADFA-LD dataset. For this reason, the continuous skip-gram model has been used for word embedding. It is used to capture many precise syntactic and semantic word relationships. To this, we have exploited its characteristic for the search of relations between sub-sequences of system calls. This gave us very interesting results not only in detection but also in prevention too. A new algorithm has been proposed based on the skip-gram model for analyzing system call traces. In this study, results have been obtained from the prototype that has only three types of traces (Normal, Adduser, and Webshell attacks). The results obtained are very satisfying, as the algorithm classifies correctly a system call trace based on its sub-sequences.
The document discusses penetration testing using Metasploit. It begins by defining penetration testing and why it is important for security. It then provides an overview of Metasploit, explaining what it is and some key terminology. The document demonstrates a sample penetration test against a virtual network, using Metasploit to exploit a Windows vulnerability. It evaluates the impact and recommends countermeasures like patching, code reviews, and periodic testing. The goal is to show how Metasploit can be used to test network security by simulating real-world attacks.
Situational awareness for computer network securitymmubashirkhan
This document discusses situational awareness, both traditional and cyber, and presents a scenario of a cyber attack. It introduces the Instance Based Learning Theory (IBLT) model for evaluating a security analyst's cyber situational awareness when responding to network events during an island hopping attack. Key factors that influence the analyst's decisions are described. The IBLT model represents situations, decisions, and utilities to model how analysts make judgments based on prior experiences stored in memory.
Automated Vulnerability Testing Using Machine Learning and Metaheuristic SearchLionel Briand
The document discusses automated vulnerability testing of web applications using machine learning and metaheuristic search. It discusses three key areas:
1. Testing front-end web applications for XML injection vulnerabilities by automatically generating malicious XML messages and using search-based techniques to find input strings that allow the generation of these messages.
2. Testing web application firewalls (WAFs) using machine learning to learn attack patterns from successful and blocked attacks to generate new attacks, and a multi-objective genetic algorithm to automatically repair vulnerable WAF rule sets.
3. Using machine learning to detect malicious SQL statements at the database level by training on legitimate and attack SQL logs and classifying new statements as legitimate or attacks.
Understanding Cyber Kill Chain and OODA loopDavid Sweigert
The document discusses using an attacker's tactics and techniques to design effective cybersecurity defenses. It provides examples of mapping security controls and tools to different stages of common attack models like the Lockheed Martin Kill Chain. This allows an organization to see where in the attack cycle they have visibility and can disrupt threats. The document advocates taking a strategic, intelligence-driven approach to cyber defense by understanding adversaries' full operations in order to implement controls earlier in the attack cycle.
The module explains that a Security Operations Center (SOC) uses people, processes, and technologies to defend against cyber threats. SOCs assign roles across multiple tiers, with tier 1 analysts monitoring alerts and tier 3 experts conducting in-depth investigations. A SOC relies on security information and event management (SIEM) systems to collect and analyze data, while security orchestration, automation and response (SOAR) helps automate workflows. Key performance indicators like mean time to detect threats are used to measure a SOC's effectiveness. The module also discusses qualifications and experience needed for a career in cybersecurity operations.
This document discusses using machine learning and big data technologies to improve security workflows. It describes the challenges of analyzing large amounts of security data from many sources to detect threats. Machine learning can help by analyzing patterns in the data at scale. The document introduces the Lambda Defense approach, which applies a lambda architecture to build a "central nervous system" for security. This combines batch and real-time machine learning models to detect threats based on both sequential and unordered behaviors.
4 technologies including AI, IoT, blockchain, and virtual reality will revolutionize the maritime sector. AI and blockchain will have many applications including supply chain visibility, data science, voyage optimization, and diagnosis systems. Training seafarers for an AI-based future will require changes to curriculum, including teaching mathematics, programming, cybersecurity, and manual operation skills. Ensuring safe return to port capabilities with redundancy will also be important as autonomy increases.
4 technologies including AI, IoT, blockchain, and virtual reality will revolutionize the maritime sector. AI and blockchain will have many applications including supply chain visibility, data science, voyage optimization, and diagnosis systems. Training seafarers for an AI-based future will require changes to curriculum, including teaching mathematics, programming, cybersecurity, and manual operation skills. Ensuring safe return to port capabilities with redundancy will also be important as autonomy increases.
4 key technologies - AI, IoT, blockchain, and virtual reality - will revolutionize the maritime sector. AI and blockchain will have many applications, including supply chain visibility, data-driven insights, voyage optimization, and diagnostic systems. Training seafarers for an AI-based future will require changes to curriculum, including teaching fundamentals of AI systems, cybersecurity awareness, and skills for manual operation and troubleshooting of automated systems. Ensuring safe operation will also require redundancy measures for full autonomous control.
Automated Security Response through Online Learning with Adaptive Con jecturesKim Hammar
The document describes an approach called conjectural online learning for automated security response in complex IT systems. It uses Bayesian learning to continuously update conjectures about the true but unknown system model, and rollout-based strategy adaptation to update the defender's strategy based on the current conjecture. The conjectures are proven to converge asymptotically to consistent conjectures that minimize divergence from the true model. Evaluation on a target infrastructure models the evolving number of clients and correlations between observations and the true unknown model parameter.
More Related Content
Similar to Automated Intrusion Response - CDIS Spring Conference 2024
Self-Learning Systems for Cyber SecurityKim Hammar
The document discusses using reinforcement learning to develop self-learning systems for cyber security. It proposes modeling network attack and defense as games and using reinforcement learning to learn effective security policies. The approach involves emulating computer infrastructures, creating models from the emulations, and using reinforcement learning and simulations to evaluate policies and estimate models. The goal is to automate security tasks and develop systems that can adapt to changing attack methods.
I will talk about innovation in the area of cyber security analytics - developing machine learning methods to detect and block cyber attacks (e.g. detecting ransomware within 4 seconds of execution and killing the underlying processes). Rather than just focusing on this as a 'black box', I'll pull it apart and talk about how we can use these methods to enable security practitioners (SOC/CIRT etc) to ask and answer questions about 'what' and 'why' these methods are flagging attacks. I'll also talk about resilience of machine learning methods to manipulation and adversarial attacks - how stable these approaches are to diversity and evolution of malware for example.
Research of Intrusion Preventio System based on SnortFrancis Yang
This document is a graduation thesis from Beijing University of Posts and Telecommunications titled "Research of Intrusion Prevention System based on Snort". It designs and implements an intrusion prevention module based on the open-source Snort intrusion detection system. The document introduces intrusion detection systems and intrusion prevention systems. It then analyzes Snort and describes building a Snort environment. Finally, it presents two designs for transforming Snort into an intrusion prevention system using Netfilter and Netlink sockets.
This document discusses using machine learning to detect ransomware through analyzing microbehaviors rather than static signatures. It introduces the concept of using machine learning for cybersecurity and labeling data to help algorithms learn. The document then discusses modeling ransomware behaviors like file system modifications and callbacks. It outlines a plan to take labeled exploit and benign traffic data, extract microbehaviors, use machine learning to detect anomalies, and generate indicators of compromise.
With the growth of computer networking, electronic commerce and web services, security networking systems have become very important to protect infomation and networks againts malicious usage or attacks. In this report, it is designed an Intrusion Detection System using two artificial neural networks: one for Intrusion Detection and the another for Attack Classification.
The document discusses security operation centers (SOCs) and their functions. It describes what a SOC is and its main purpose of monitoring, preventing, detecting, investigating and responding to cyber threats. It outlines the typical roles in a SOC including tier 1, 2 and 3 analysts and security engineers. It also discusses the common tools, skills needed for each role, and types of SOCs such as dedicated, distributed, multifunctional and virtual SOCs.
Intrusion Prevention through Optimal StoppingKim Hammar
1) The document discusses formulating intrusion prevention as an optimal stopping problem where a defender must monitor an infrastructure over discrete time steps to detect an intrusion by an attacker.
2) If an intrusion is detected, the defender can take defensive actions to stop the intrusion at various time steps, with the goal of stopping intrusions optimally based on observations.
3) The document proposes modeling this problem as a partially observable Markov decision process (POMDP) and using reinforcement learning techniques like simulation and policy evaluation to develop automated security prevention policies that can optimally stop intrusions.
The document discusses creating a digital twin of a target infrastructure for self-learning intrusion prevention systems. Key aspects include emulating hosts, network services, vulnerabilities, and network properties using software like Docker containers and NetEm. The digital twin allows modeling the target infrastructure and evaluating security strategies through simulation and reinforcement learning before deployment.
The document discusses self-learning systems for cyber defense. It outlines an approach using network emulation, digital twin modeling, and reinforcement learning to develop self-learning security systems that can automate tasks and adapt to changing attack methods. As an example use case, it examines how this approach could be applied to intrusion prevention by formulating it as an optimal multiple stopping problem and using techniques like stochastic game simulation to learn effective prevention strategies.
A SIMULATION APPROACH TO PREDICATE THE RELIABILITY OF A PERVASIVE SOFTWARE SY...Osama M. Khaled
This document discusses simulating a pervasive software system to predict its reliability. It begins by introducing pervasive computing and its importance. It then describes the reference architecture model, which includes a smart environment conceptual model, smart object model, and pervasive system architecture model. The document outlines the simulation approach, including scenarios, specifications of simulation entities, assumptions, and extreme assumptions. The simulation aims to predict reliability and availability under different scenarios and control variable values. Key metrics like MTBF and MTTR will be measured to calculate reliability and availability.
An efficient HIDS using System Call Traces.pptxSandeep Maurya
Cybercrime has been rising prohibitively in the last decades. Companies and organizations have stepped into different strategies to protect their IT systems from cyber-attacks. A new method for host-based intrusion detection has been explored in this paper using the ADFA-LD dataset. For this reason, the continuous skip-gram model has been used for word embedding. It is used to capture many precise syntactic and semantic word relationships. To this, we have exploited its characteristic for the search of relations between sub-sequences of system calls. This gave us very interesting results not only in detection but also in prevention too. A new algorithm has been proposed based on the skip-gram model for analyzing system call traces. In this study, results have been obtained from the prototype that has only three types of traces (Normal, Adduser, and Webshell attacks). The results obtained are very satisfying, as the algorithm classifies correctly a system call trace based on its sub-sequences.
The document discusses penetration testing using Metasploit. It begins by defining penetration testing and why it is important for security. It then provides an overview of Metasploit, explaining what it is and some key terminology. The document demonstrates a sample penetration test against a virtual network, using Metasploit to exploit a Windows vulnerability. It evaluates the impact and recommends countermeasures like patching, code reviews, and periodic testing. The goal is to show how Metasploit can be used to test network security by simulating real-world attacks.
Situational awareness for computer network securitymmubashirkhan
This document discusses situational awareness, both traditional and cyber, and presents a scenario of a cyber attack. It introduces the Instance Based Learning Theory (IBLT) model for evaluating a security analyst's cyber situational awareness when responding to network events during an island hopping attack. Key factors that influence the analyst's decisions are described. The IBLT model represents situations, decisions, and utilities to model how analysts make judgments based on prior experiences stored in memory.
Automated Vulnerability Testing Using Machine Learning and Metaheuristic SearchLionel Briand
The document discusses automated vulnerability testing of web applications using machine learning and metaheuristic search. It discusses three key areas:
1. Testing front-end web applications for XML injection vulnerabilities by automatically generating malicious XML messages and using search-based techniques to find input strings that allow the generation of these messages.
2. Testing web application firewalls (WAFs) using machine learning to learn attack patterns from successful and blocked attacks to generate new attacks, and a multi-objective genetic algorithm to automatically repair vulnerable WAF rule sets.
3. Using machine learning to detect malicious SQL statements at the database level by training on legitimate and attack SQL logs and classifying new statements as legitimate or attacks.
Understanding Cyber Kill Chain and OODA loopDavid Sweigert
The document discusses using an attacker's tactics and techniques to design effective cybersecurity defenses. It provides examples of mapping security controls and tools to different stages of common attack models like the Lockheed Martin Kill Chain. This allows an organization to see where in the attack cycle they have visibility and can disrupt threats. The document advocates taking a strategic, intelligence-driven approach to cyber defense by understanding adversaries' full operations in order to implement controls earlier in the attack cycle.
The module explains that a Security Operations Center (SOC) uses people, processes, and technologies to defend against cyber threats. SOCs assign roles across multiple tiers, with tier 1 analysts monitoring alerts and tier 3 experts conducting in-depth investigations. A SOC relies on security information and event management (SIEM) systems to collect and analyze data, while security orchestration, automation and response (SOAR) helps automate workflows. Key performance indicators like mean time to detect threats are used to measure a SOC's effectiveness. The module also discusses qualifications and experience needed for a career in cybersecurity operations.
This document discusses using machine learning and big data technologies to improve security workflows. It describes the challenges of analyzing large amounts of security data from many sources to detect threats. Machine learning can help by analyzing patterns in the data at scale. The document introduces the Lambda Defense approach, which applies a lambda architecture to build a "central nervous system" for security. This combines batch and real-time machine learning models to detect threats based on both sequential and unordered behaviors.
4 technologies including AI, IoT, blockchain, and virtual reality will revolutionize the maritime sector. AI and blockchain will have many applications including supply chain visibility, data science, voyage optimization, and diagnosis systems. Training seafarers for an AI-based future will require changes to curriculum, including teaching mathematics, programming, cybersecurity, and manual operation skills. Ensuring safe return to port capabilities with redundancy will also be important as autonomy increases.
4 technologies including AI, IoT, blockchain, and virtual reality will revolutionize the maritime sector. AI and blockchain will have many applications including supply chain visibility, data science, voyage optimization, and diagnosis systems. Training seafarers for an AI-based future will require changes to curriculum, including teaching mathematics, programming, cybersecurity, and manual operation skills. Ensuring safe return to port capabilities with redundancy will also be important as autonomy increases.
4 key technologies - AI, IoT, blockchain, and virtual reality - will revolutionize the maritime sector. AI and blockchain will have many applications, including supply chain visibility, data-driven insights, voyage optimization, and diagnostic systems. Training seafarers for an AI-based future will require changes to curriculum, including teaching fundamentals of AI systems, cybersecurity awareness, and skills for manual operation and troubleshooting of automated systems. Ensuring safe operation will also require redundancy measures for full autonomous control.
Similar to Automated Intrusion Response - CDIS Spring Conference 2024 (20)
Automated Security Response through Online Learning with Adaptive Con jecturesKim Hammar
The document describes an approach called conjectural online learning for automated security response in complex IT systems. It uses Bayesian learning to continuously update conjectures about the true but unknown system model, and rollout-based strategy adaptation to update the defender's strategy based on the current conjecture. The conjectures are proven to converge asymptotically to consistent conjectures that minimize divergence from the true model. Evaluation on a target infrastructure models the evolving number of clients and correlations between observations and the true unknown model parameter.
Intrusion Tolerance for Networked Systems through Two-level Feedback ControlKim Hammar
This document outlines a two-level feedback control architecture for intrusion tolerance in networked systems. At the local level, node controllers monitor individual replicas and make recovery decisions based on belief states about their health. At the global level, a system controller scales the replication factor based on belief state information from the node controllers. The architecture provides correct service if certain conditions are met, including maintaining a sufficient number of replicas. The two-level approach models intrusion tolerance as classical machine replacement and inventory replenishment control problems.
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...Kim Hammar
1) The document describes a model for scalable learning of intrusion response through recursive decomposition. It involves a defender protecting an infrastructure of connected components from an attacker seeking to intrude.
2) The system is modeled as a directed tree where each component has states related to defense, attack, and risk. The defender takes actions to maintain workflows and stop intrusions, while the attacker aims to disrupt workflows and compromise components.
3) Components are organized into workflows and the defender and attacker choose actions based on partial observations from intrusion detection systems. The problem is formulated as a Stackelberg game to find strategies that maximize the defender's objectives while minimizing the attacker's objectives.
Intrusion Response through Optimal StoppingKim Hammar
This document discusses formulating intrusion response as an optimal stopping problem known as a Dynkin game. The system evolves in discrete time steps with an attacker and defender. The defender observes the infrastructure and aims to determine the optimal time to stop and take defensive action based on observations. The approach involves modeling the interaction as a zero-sum partially observed stochastic game and using reinforcement learning to determine optimal strategies for both players.
Learning Security Strategies through Game Play and Optimal StoppingKim Hammar
1) The document describes a game-theoretic model of an intrusion prevention problem between an attacker and defender.
2) The defender owns an infrastructure and monitors it for intrusions, having a list of defensive actions to take. The attacker seeks to compromise components through reconnaissance and exploits.
3) Both players' strategies determine when to take actions, with the defender deciding when to deploy defenses and the attacker deciding when to start/stop intrusions. This interaction is modeled as an optimal stopping game.
Intrusion Prevention through Optimal StoppingKim Hammar
This document provides an overview of research on optimal intrusion prevention through optimal stopping. It discusses using reinforcement learning and a formal model to learn effective security strategies. The research aims to model intrusion prevention as an optimal stopping problem and partially observed Markov decision process. The goal is to learn multi-threshold policies for determining when to take defensive actions and which actions to take by emulating target infrastructures and applying reinforcement learning techniques.
Intrusion Prevention through Optimal Stopping and Self-PlayKim Hammar
1) The document discusses using optimal stopping and reinforcement learning for intrusion prevention. It presents a formal model of intrusion prevention as a partially observed stochastic game between a defender and attacker.
2) The approach uses self-play reinforcement learning within an emulated infrastructure to find optimal defender strategies. The defender's strategies involve optimally choosing when to take defensive actions like reconfiguring an intrusion prevention system.
3) Results from applying this approach to a modeled infrastructure are presented, showing its effectiveness at preventing intrusions through adaptive defensive actions.
Intrusion Prevention through Optimal Stopping.Kim Hammar
The document discusses intrusion prevention through optimal stopping. It presents a method for finding effective security strategies using reinforcement learning. The method involves modeling the target infrastructure, implementing security policies through simulation, and using reinforcement learning to evaluate policies and estimate models. This allows generating automated and self-learning security systems.
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...Kim Hammar
This document presents a game theoretic analysis of intrusion detection in access control systems. It first describes a use case involving an attacker, defender (intrusion detection system), and IT infrastructure with components and sensors. It then models this scenario as a finite extensive-form game and computes a mixed Nash equilibrium. The document notes limitations of this approach and proposes an alternative continuous-kernel game model that is more scalable and allows for probabilistic sensor detections. Key elements of this alternative model are described, including action spaces for the attacker, defender, and sensors/nature, as well as cost parameters.
Learning Intrusion Prevention Policies through Optimal Stopping - CNSM2021Kim Hammar
We study automated intrusion prevention using reinforcement learning. In a novel approach, we formulate the problem of intrusion prevention as an optimal stopping problem. This formulation allows us insight into the structure of the optimal policies, which turn out to be threshold based. Since the computation of the optimal defender policy using dynamic programming is not feasible for practical cases, we approximate the optimal policy through reinforcement learning in a simulation environment. To define the dynamics of the simulation, we emulate the target infrastructure and collect measurements. Our evaluations show that the learned policies are close to optimal and that they indeed can be expressed using thresholds.
Learning Intrusion Prevention Policies Through Optimal StoppingKim Hammar
CDIS Research Workshop 2021 Balingsholm.
We study automated intrusion prevention using reinforcement learning. In a novel approach, we formulate the problem of intrusion prevention as an optimal multiple stopping problem. This formulation allows us insight into the structure of the optimal policies, which we show to be threshold based. Since the computation of the optimal defender policy using dynamic programming is not feasible for practical cases, we develop a reinforcement learning approach to approximate the optimal policy in a target infrastructure. The approach uses an emulation of the infrastructure to evaluate policies and to instantiate a simulation model which then is used to train policies through reinforcement learning. Our results show that the learned policies are close to optimal and that they indeed can be expressed using thresholds.
Learning Intrusion Prevention Policies Through Optimal StoppingKim Hammar
The document discusses formulating intrusion prevention as an optimal stopping problem. It describes a use case where a defender monitors an infrastructure for signs of intrusion by an attacker. The defender can take defensive actions or stops at different time steps, with the goal of stopping an intrusion. This problem is modeled as a partially observable Markov decision process (POMDP) where the optimal strategy is to determine the optimal times to stop and take defensive actions based on observations over time.
Self-Learning Systems for Cyber SecurityKim Hammar
The document discusses challenges in cybersecurity from evolving and automated attacks on complex infrastructures. The goal is to automate security tasks and develop self-learning systems that can adapt to changing attack methods. The proposed approach is to model network attacks and defenses as games and use reinforcement learning to learn policies that can be incorporated into self-learning systems. Key aspects of the approach include emulating computer infrastructures, using the emulations to create system models, and applying reinforcement learning and policy mapping to the models to develop effective security strategies that can be implemented and help systems automate tasks and continuously self-improve over time.
ScyllaDB is making a major architecture shift. We’re moving from vNode replication to tablets – fragments of tables that are distributed independently, enabling dynamic data distribution and extreme elasticity. In this keynote, ScyllaDB co-founder and CTO Avi Kivity explains the reason for this shift, provides a look at the implementation and roadmap, and shares how this shift benefits ScyllaDB users.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
"What does it really mean for your system to be available, or how to define w...Fwdays
We will talk about system monitoring from a few different angles. We will start by covering the basics, then discuss SLOs, how to define them, and why understanding the business well is crucial for success in this exercise.
Essentials of Automations: Exploring Attributes & Automation ParametersSafe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
QA or the Highway - Component Testing: Bridging the gap between frontend appl...zjhamm304
These are the slides for the presentation, "Component Testing: Bridging the gap between frontend applications" that was presented at QA or the Highway 2024 in Columbus, OH by Zachary Hamm.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from downtime in 1 minute = $5-$10 thousand dollars. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for the development of highly loaded fintech solutions. We will focus on using queues and streaming to efficiently work and manage large amounts of data in real-time and to minimize latency.
We will focus special attention on the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsDianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations, for
seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
"NATO Hackathon Winner: AI-Powered Drug Search", Taras KlobaFwdays
This is a session that details how PostgreSQL's features and Azure AI Services can be effectively used to significantly enhance the search functionality in any application.
In this session, we'll share insights on how we used PostgreSQL to facilitate precise searches across multiple fields in our mobile application. The techniques include using LIKE and ILIKE operators and integrating a trigram-based search to handle potential misspellings, thereby increasing the search accuracy.
We'll also discuss how the azure_ai extension on PostgreSQL databases in Azure and Azure AI Services were utilized to create vectors from user input, a feature beneficial when users wish to find specific items based on text prompts. While our application's case study involves a drug search, the techniques and principles shared in this session can be adapted to improve search functionality in a wide range of applications. Join us to learn how PostgreSQL and Azure AI can be harnessed to enhance your application's search capability.
High performance Serverless Java on AWS- GoTo Amsterdam 2024Vadym Kazulkin
Java is for many years one of the most popular programming languages, but it used to have hard times in the Serverless community. Java is known for its high cold start times and high memory footprint, comparing to other programming languages like Node.js and Python. In this talk I'll look at the general best practices and techniques we can use to decrease memory consumption, cold start times for Java Serverless development on AWS including GraalVM (Native Image) and AWS own offering SnapStart based on Firecracker microVM snapshot and restore and CRaC (Coordinated Restore at Checkpoint) runtime hooks. I'll also provide a lot of benchmarking on Lambda functions trying out various deployment package sizes, Lambda memory settings, Java compilation options and HTTP (a)synchronous clients and measure their impact on cold and warm start times.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfChart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving
What began over 115 years ago as a supplier of precision gauges to the automotive industry has evolved into being an industry leader in the manufacture of product branding, automotive cockpit trim and decorative appliance trim. Value-added services include in-house Design, Engineering, Program Management, Test Lab and Tool Shops.
Automated Intrusion Response - CDIS Spring Conference 2024
1. 0/40
Automated Intrusion Response
CDIS Spring Conference
Kim Hammar
kimham@kth.se
Division of Network and Systems Engineering
KTH Royal Institute of Technology
May 22, 2024
2. 1/40
Use Case: Intrusion Response
I A defender owns an infrastructure
I Consists of connected components
I Components run network services
I Defender defends the infrastructure
by monitoring and active defense
I Has partial observability
I An attacker seeks to intrude on the
infrastructure
I Has a partial view of the
infrastructure
I Wants to compromise specific
components
I Attacks by reconnaissance,
exploitation and pivoting
Attacker Clients
. . .
Defender
1 IPS
1
alerts
Gateway
7 8 9 10 11
6
5
4
3
2
12
13 14 15 16
17
18
19
21
23
20
22
24
25 26
27 28 29 30 31
3. 2/40
Automated Intrusion Response
Levels of security automation
No automation.
Manual detection.
Manual prevention.
Lack of tools.
1980s 1990s 2000s-Now Research
Operator assistance.
Audit logs
Manual detection.
Manual prevention.
Partial automation.
Manual configuration.
Intrusion detection systems.
Intrusion prevention systems.
High automation.
System automatically
updates itself.
4. 2/40
Automated Intrusion Response
Levels of security automation
No automation.
Manual detection.
Manual prevention.
Lack of tools.
1980s 1990s 2000s-Now Research
Operator assistance.
Audit logs
Manual detection.
Manual prevention.
Partial automation.
Manual configuration.
Intrusion detection systems.
Intrusion prevention systems.
High automation.
System automatically
updates itself.
Can we find effective security strategies through decision-theoretic methods?
5. 3/40
Our Framework for Automated Intrusion Response
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Optimization
Strategy evaluation &
Model estimation
Automation &
Self-learning systems
6. 3/40
Our Framework for Automated Intrusion Response
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Optimization
Strategy evaluation &
Model estimation
Automation &
Self-learning systems
7. 3/40
Our Framework for Automated Intrusion Response
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Optimization
Model estimation
Automation &
Self-learning systems
8. 3/40
Our Framework for Automated Intrusion Response
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Optimization
Model estimation
Automation &
Self-learning systems
9. 3/40
Our Framework for Automated Intrusion Response
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Optimization
Strategy evaluation &
Model estimation
Automation &
Self-learning systems
11. 3/40
Our Framework for Automated Intrusion Response
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Optimization
Strategy evaluation &
Model estimation
Automation &
Self-learning systems
12. 3/40
Our Framework for Automated Intrusion Response
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Optimization
Strategy evaluation &
Model estimation
Automation &
Self-learning systems
13. 4/40
Creating a Digital Twin of the Target Infrastructure
Σ Configuration Space
σi
*
* *
172.18.4.0/24
172.18.19.0/24
172.18.61.0/24
Digital Twins
R1 R1 R1
I Given an infrastructure configuration, our framework
automates the creation of a digital twin.
I The configuration space defines the class of infrastructures
that we can emulate.
14. 5/40
Example Infrastructure Configuration
I 64 nodes
I 24 ovs switches
I 3 gateways
I 6 honeypots
I 8 application servers
I 4 administration servers
I 15 compute servers
I 11 vulnerabilities
I cve-2010-0426
I cve-2015-3306
I etc.
I Management
I 1 sdn controller
I 1 Kafka server
I 1 elastic server
r&d zone
App servers Honeynet
dmz
admin
zone
workflow
Gateway idps
quarantine
zone
alerts
Defender
. . .
Attacker Clients
2
1
3 12
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22 23
24
25
26
27
28
29
30 31
32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
15. 6/40
Emulating Physical Components
Containers
Physical server
Operating system
Docker engine
Our framework
I We emulate physical components with Docker containers
I Focus on linux-based systems
I Our framework provides the orchestration layer
16. 7/40
Emulating Network Connectivity
Management node 1
Emulated IT infrastructure
Management node 2
Emulated IT infrastructure
Management node n
Emulated IT infrastructure
VXLAN VXLAN . . . VXLAN
IP network
I We emulate network connectivity on the same host using
network namespaces
I Connectivity across physical hosts is achieved using VXLAN
tunnels with Docker swarm
17. 8/40
Emulating Network Conditions
I Traffic shaping using NetEm
I Allows to configure:
I Delay
I Capacity
I Packet Loss
I Jitter
I Queueing delays
I etc.
User space
. . .
Application processes
Kernel
TCP/UDP
IP/Ethernet/802.11
OS
TCP/IP
stack
Queueing
discipline
Device driver
queue (FIFO)
NIC
Netem config:
latency,
jitter, etc.
Sockets
18. 9/40
Emulating Clients Client population
. . .
Arrival rate λ Departure
Service time µ
. . .
.
.
.
.
.
.
.
.
.
w1 w2 w|W|
Workflows (Markov processes)
I Homogeneous client population
I Clients arrive according to Po(λ)
I Client service times Exp(µ)
I Service dependencies (St)t=1,2,... ∼ mc
19. 10/40
Emulating The Attacker and The Defender
I API for automated
defender and attacker
actions
I Attacker actions:
I Exploits
I Reconnaissance
I Pivoting
I etc.
I Defender actions:
I Shut downs
I Redirect
I Isolate
I Recover
I Migrate
I etc.
Markov Decision Process
s1,1 s1,2 s1,3 . . . s1,4
s2,1 s2,2 s2,3 . . . s2,4
Digital Twin
. . .
Virtual
network
Virtual
devices
Emulated
services
Emulated
actors
IT Infrastructure
Configuration
& change events
System traces
Verified security policy
Optimized security policy
20. 11/40
Software framework
Metastore
Python libraries
Management api
rest api cli
grpc
Management api
Digital twins
I More details about the software framework
I Source code: https://github.com/Limmen/csle
I Documentation: http://limmen.dev/csle/
I Demo: https://www.youtube.com/watch?v=iE2KPmtIs2A
I Installation:
https://www.youtube.com/watch?v=l_g3sRJwwhc
21. 12/40
System Identification
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Digital Twin
Target
Infrastructure
Model Creation &
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation System
Reinforcement Learning &
Generalization
Strategy evaluation &
Model estimation
Automation &
Self-learning systems
22. 13/40
System Model
H C
∅
Crashed
Healthy Compromised
Model
complexity
Static attacker
Small set of responses
Dynamic attacker
Small set of responses
Dynamic attacker
Large set of responses
I Intrusion response can be modeled in many ways
I As a parametric optimization problem
I As an optimal stopping problem
I As a dynamic program
I As a game
I etc.
23. 14/40
Related Work on Learning Automated Intrusion Response
External validity
model
complexity
Goal
Georgia et al. 2000.
(Next generation
intrusion detection:
reinforcement learning)
Xu et al. 2005.
(An RL approach to
host-based
intrusion detection)
Servin et al. 2008.
(Multi-agent RL for
intrusion detection)
Malialis et al. 2013.
(Decentralized
RL response to
DDoS attacks)
Zhu et al. 2019.
(Adaptive
Honeypot engagement)
Apruzzese et al. 2020.
(Deep RL to
evade botnets)
Xiao et al. 2021.
(RL approach to APT)
etc. 2022-2023
Our work 2020-2023
24. 15/40
Intrusion Response through Optimal Stopping
I Suppose
I The attacker follows a fixed strategy (no adaptation)
I We only have one response action, e.g., block the gateway
I Formulate intrusion response as optimal stopping
Intrusion event
time-step t = 1 Intrusion ongoing
t
t = T
Early stopping times Stopping times that
affect the intrusion
Episode
25. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
26. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
27. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
28. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
29. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
30. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
31. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
32. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
33. 16/40
Intrusion Response from the Defender’s Perspective
20 40 60 80 100 120 140 160 180 200
0.5
1
t
b
20 40 60 80 100 120 140 160 180 200
50
100
150
200
t
Alerts
20 40 60 80 100 120 140 160 180 200
5
10
15
t
Logins
When to take a defensive action?
34. 17/40
The Defender’s Optimal Stopping Problem (1/3)
I Infrastructure is a discrete-time dynamical system (st)T
t=1
I Defender observes a noisy observation process (ot)T
t=1
I Two options at each time t: (C)ontinue and (S)stop
I Find the optimal stopping time τ?:
τ?
∈ arg max
τ
Eτ
" τ−1
X
t=1
γt−1
RC
st st+1
+ γτ−1
RS
sτ sτ
#
where RS
ss0 & RC
ss0 are the stop/continue rewards and τ is
τ = inf{t : t > 0, at = S}
t
ot
τ∗
35. 18/40
The Defender’s Optimal Stopping Problem (2/3)
I Objective: stop the attack as soon as possible
I Let the state space be S = {H, C, ∅}
H C
∅
Stopped
Healthy Compromised
36. 19/40
The Defender’s Optimal Stopping Problem (3/3)
I Let the observation process (ot)T
t=1 represent ids alerts
0 2000 4000 6000 8000
probability
cve-2010-0426
0 2000 4000 6000 8000
cve-2015-3306
0 2000 4000 6000 8000
cve-2015-5602
0 2000 4000 6000 8000
cve-2016-10033
0 2000 4000 6000 8000
cwe-89
0 2000 4000 6000 8000
O
probability
cve-2017-7494
0 2000 4000 6000 8000
O
cve-2014-6271
0 5000 10000 15000 20000
O
ftp brute force
0 5000 10000 15000 20000
O
ssh brute force
0 5000 10000 15000 20000
O
telnet brute force
intrusion no intrusion intrusion model normal operation model
I Estimate the observation distribution based on M samples
from the twin
I E.g., compute empirical distribution b
Z as estimate of Z
I b
Z →a.s Z as M → ∞ (Glivenko-Cantelli theorem)
37. 20/40
Optimal Stopping Strategy
I The defender can compute the belief
bt , P[St = C | b1, o1, o2, . . . ot]
I Stopping strategy:
π(b) : [0, 1] → {S, C}
Defender
Belief
38. 20/40
Optimal Threshold Strategy
Theorem
There exists an optimal defender strategy of the form:
π?
(b) = S ⇐⇒ b ≥ α?
α?
∈ [0, 1]
i.e., the stopping set is S = [α?, 1]
b
0 1
belief space B = [0, 1]
S1
α?
1
39. 21/40
Optimal Multiple Stopping
I Suppose the defender can take L ≥ 1 response actions
I Find the optimal stopping times τ?
L , τ?
L−1, . . . , τ?
1 :
(τ?
l )l=1,...,L ∈ arg max
τ1,...,τL
Eτ1,...,τL
" τL−1
X
t=1
γt−1
RC
st st+1
+ γτL−1
RS
sτL
sτL
+
τL−1−1
X
t=τL+1
γt−1
RC
st st+1
+ γτL−1−1
RS
sτL−1
sτL−2
+ . . . +
τ1−1
X
t=τ2+1
γt−1
RC
st st+1
+ γτ1−1
RS
sτ1 sτ1
#
where τl denotes the stopping time with l stops remaining.
t
ot
τ?
L τ?
L−1 τ?
L−2 . . .
40. 22/40
Optimal Multi-Threshold Strategy
Theorem
I Stopping sets are nested Sl−1 ⊆ Sl for l = 2, . . . L.
I If (ot)t≥1 is totally positive of order 2 (TP2), there exists an
optimal defender strategy of the form:
π?
l (b) = S ⇐⇒ b ≥ α?
l , l = 1, . . . , L
where α?
l ∈ [0, 1] is decreasing in l.
b
0 1
belief space B = [0, 1]
S1
S2
.
.
.
SL
α?
1
α?
2
α?
L
. . .
41. 23/40
Optimal Stopping Game
I Suppose the attacker is dynamic and decides when to start
and abort its intrusion.
Attacker
Defender
t = 1
t = T
τ1,1 τ1,2 τ1,3
τ2,1
t
Stopped
Game episode
Intrusion
I Find the optimal stopping times
maximize
τD,1,...,τD,L
minimize
τA,1,τA,2
E[J]
where J is the defender’s objective.
42. 24/40
Best-Response Multi-Threshold Strategies (1/2)
Theorem
I The defender’s best response is of the form:
π̃D,l (b) = S ⇐⇒ b ≥ α̃l , l = 1, . . . , L
I The attacker’s best response is of the form:
π̃A,l (b) = C ⇐⇒ π̃D,l (S | b) ≥ β̃H,l , l = 1, . . . , L, s = H
π̃A,l (b) = S ⇐⇒ π̃D,l (S | b) ≥ β̃C,l , l = 1, . . . , L, s = C
43. 25/40
Best-Response Multi-Threshold Strategies (2/2)
b
Defender
0 1
S
(D)
1,π2,l
S
(D)
2,π2,l
.
.
.
S
(D)
L,π2,l
α̃1
α̃2
α̃L . . .
b
Attacker
0 1
S
(A)
C,1,πD,l
S
(A)
C,L,πD,l
β̃C,1
β̃C,L
β̃H,1
. . . β̃H,L
. . .
S
(A)
H,1,πD,l
S
(A)
H,L,πD,l
44. 26/40
Efficient Computation of Best Responses
Algorithm 1: Threshold Optimization
1 Input: Objective function J, number of thresholds L,
parametric optimizer PO
2 Output: A approximate best response strategy π̂θ
3 Algorithm
4 Θ ← [0, 1]L
5 For each θ ∈ Θ, define πθ(bt) as
6 πθ(bt) ,
(
S if bt ≥ θi
C otherwise
7 Jθ ← Eπθ
[J]
8 π̂θ ← PO(Θ, Jθ)
9 return π̂θ
I Examples of parameteric optimization algorithmns: CEM, BO,
CMA-ES, DE, SPSA, etc.
45. 27/40
Threshold-Fictitious Play to Approximate an Equilibrium
π̃A ∈ BA(πD)
πA
πD
π̃D ∈ BD(πA)
π̃0
A ∈ BA(π0
D)
π0
A
π0
D
π̃0
D ∈ BD(π0
A)
. . .
π?
A ∈ BA(π?
D)
π?
D ∈ BD(π?
A)
Fictitious play: iterative averaging of best responses.
I Learn best response strategies iteratively
I Average best responses to approximate the equilibrium
46. 28/40
Comparison against State-of-the-art Algorithms
0 10 20 30 40 50 60
training time (min)
0
50
100
Reward per episode against Novice
0 10 20 30 40 50 60
training time (min)
Reward per episode against Experienced
0 10 20 30 40 50 60
training time (min)
Reward per episode against Expert
PPO ThresholdSPSA Shiryaev’s Algorithm (α = 0.75) HSVI upper bound
47. 29/40
Learning Curves in Simulation and Digital Twin
0
50
100
Novice
Reward per episode
0
50
100
Episode length (steps)
0.0
0.5
1.0
P[intrusion interrupted]
0.0
0.5
1.0
1.5
P[early stopping]
5
10
Duration of intrusion
−50
0
50
100
experienced
0
50
100
150
0.0
0.5
1.0
0.0
0.5
1.0
1.5
0
5
10
0 20 40 60
training time (min)
−50
0
50
100
expert
0 20 40 60
training time (min)
0
50
100
150
0 20 40 60
training time (min)
0.0
0.5
1.0
0 20 40 60
training time (min)
0.0
0.5
1.0
1.5
2.0
0 20 40 60
training time (min)
0
5
10
15
20
πθ,l simulation πθ,l emulation (∆x + ∆y) ≥ 1 baseline Snort IPS upper bound
0 20 40 60 80 100
# training iterations
0
1
2
3
Exploitability
0 20 40 60 80 100
# training iterations
−5.0
−2.5
0.0
2.5
5.0
Defender reward per episode
0 20 40 60 80 100
# training iterations
0
1
2
3
Intrusion length
(π1,l, π2,l) emulation (π1,l, π2,l) simulation Snort IPS ot ≥ 1 upper bound
48. 29/40
Learning Curves in Simulation and Digital Twin
0
50
100
Novice
Reward per episode
0
50
100
Episode length (steps)
0.0
0.5
1.0
P[intrusion interrupted]
0.0
0.5
1.0
1.5
P[early stopping]
5
10
Duration of intrusion
−50
0
50
100
experienced
0
50
100
150
0.0
0.5
1.0
0.0
0.5
1.0
1.5
0
5
10
0 20 40 60
training time (min)
−50
0
50
100
expert
0 20 40 60
training time (min)
0
50
100
150
0 20 40 60
training time (min)
0.0
0.5
1.0
0 20 40 60
training time (min)
0.0
0.5
1.0
1.5
2.0
0 20 40 60
training time (min)
0
5
10
15
20
πθ,l simulation πθ,l emulation (∆x + ∆y) ≥ 1 baseline Snort IPS upper bound
0 20 40 60 80 100
# training iterations
0
1
2
3
Exploitability
0 20 40 60 80 100
# training iterations
−5.0
−2.5
0.0
2.5
5.0
Defender reward per episode
0 20 40 60 80 100
# training iterations
0
1
2
3
Intrusion length
(π1,l, π2,l) emulation (π1,l, π2,l) simulation Snort IPS ot ≥ 1 upper bound
Stopping is about timing; now we consider timing + action selection
49. 30/40
General Intrusion Response Game
I Suppose the defender and the attacker
can take L actions per node
I G = h{gw} ∪ V, Ei: directed tree
representing the virtual infrastructure
I V: set of virtual nodes
I E: set of node dependencies
I Z: set of zones
r&d zone
App servers Honeynet
dmz
admin
zone
workflow
Gateway idps
quarantine
zone
alerts
Defender
. . .
Attacker Clients
2
1
3 12
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22 23
24
25
26
27
28
29
30 31
32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
50. 30/40
General Intrusion Response Game
I Suppose the defender and the attacker
can take L actions per node
I G = h{gw} ∪ V, Ei: directed tree
representing the virtual infrastructure
I V: set of virtual nodes
I E: set of node dependencies
I Z: set of zones
r&d zone
App servers Honeynet
dmz
admin
zone
workflow
Gateway idps
quarantine
zone
alerts
Defender
. . .
Attacker Clients
2
1
3 12
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22 23
24
25
26
27
28
29
30 31
32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
51. 31/40
State Space
I Each i ∈ V has a state
vi,t = (v
(Z)
t,i
|{z}
D
, v
(I)
t,i , v
(R)
t,i
| {z }
A
)
I System state st = (vt,i )i∈V ∼ St
I Markovian time-homogeneous
dynamics:
st+1 ∼ f (· | St, At)
At = (A
(A)
t , A
(D)
t ) are the actions.
s1 s2 s3
s4 s5 s4
.
.
.
.
.
.
.
.
.
52. 31/40
State Space
I Each i ∈ V has a state
vi,t = (v
(Z)
t,i
|{z}
D
, v
(I)
t,i , v
(R)
t,i
| {z }
A
)
I System state st = (vt,i )i∈V ∼ St
I Markovian time-homogeneous
dynamics:
st+1 ∼ f (· | St, At)
At = (A
(A)
t , A
(D)
t ) are the actions.
s1 s2 s3
s4 s5 s4
.
.
.
.
.
.
.
.
.
53. 31/40
State Space
I Each i ∈ V has a state
vi,t = (v
(Z)
t,i
|{z}
D
, v
(I)
t,i , v
(R)
t,i
| {z }
A
)
I System state st = (vt,i )i∈V ∼ St
I Markovian time-homogeneous
dynamics:
st+1 ∼ f (· | St, At)
At = (A
(A)
t , A
(D)
t ) are the actions.
s1 s2 s3
s4 s5 s4
.
.
.
.
.
.
.
.
.
54. 32/40
Observations
I idpss inspect network traffic and
generate alert vectors:
ot ,
ot,1, . . . , ot,|V|
∈ N
|V|
0
ot,i is the number of alerts related to
node i ∈ V at time-step t.
I ot = (ot,1, . . . , ot,|V|) is a realization
of the random vector Ot with joint
distribution Z
idps
idps
idps
idps
alerts
Defender
. . .
Attacker Clients
2
1
3 12
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22 23
24
25
26
27
28
29
30 31
32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
55. 32/40
Observations
I idpss inspect network traffic and
generate alert vectors:
ot ,
ot,1, . . . , ot,|V|
∈ N
|V|
0
ot,i is the number of alerts related to
node i ∈ V at time-step t.
I ot = (ot,1, . . . , ot,|V|) is a realization
of the random vector Ot with joint
distribution Z
idps
idps
idps
idps
alerts
Defender
. . .
Attacker Clients
2
1
3 12
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22 23
24
25
26
27
28
29
30 31
32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
57. 34/40
The (General) Intrusion Response Problem
maximizeπD∈ΠD
minimizeπA∈ΠA
E(πD,πA) [J]
E(πD,πA) denotes the expectation of the random vectors
(St, Ot, At)t∈{1,...,T} when following the strategy profile (πD, πA)
58. 35/40
The Curse of Dimensionality
I Solving the game is computationally intractable. The state,
action, and observation spaces of the game grow
exponentially with |V|.
1 2 3 4 5
104
105
2
105
|S|
|O|
|Ai |
|V|
Growth of |S|, |O|, and |Ai | in function of the number of nodes |V|
59. 35/40
The Curse of Dimensionality
I While (1) has a solution (i.e the game Γ has a value (Thm
1)), computing it is intractable since the state, action, and
observation spaces of the game grow exponentially with |V|.
1 2 3 4 5
104
105
2
105
|S|
|O|
|Ai |
|V|
Growth of |S|, |O|, and |Ai | in function of the number of nodes |V|
We tackle the scability challenge with decomposition
60. 36/40
Intuitively..
rd zone
App servers Honeynet
dmz
admin
zone
workflow
Gateway idps
quarantine
zone
alerts
Defender
. . .
Attacker Clients
2
1
3 12
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22 23
24
25
26
27
28
29
30 31
32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 65
The optimal
action here...
Does not directly
depend on the state or
action of a node
down here
61. 36/40
Intuitively..
rd zone
App servers Honeynet
dmz
admin
zone
workflow
Gateway idps
quarantine
zone
alerts
Defender
. . .
Attacker Clients
2
1
3 12
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22 23
24
25
26
27
28
29
30 31
32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 65
The optimal
action here...
But they are
not completely
independent either.
How can we
exploit this
structure?
Does not directly
depend on the state
or action of a node
down here
62. 37/40
Scalable Learning through Decomposition
1 2 3 4 5 6 7 8 9 10
2
4
6
8
10
linear
measured
# parallel processes n
|V| = 10
Speedup
S
n
Speedup of best response computation for the decomposed game; Tn
denotes the completion time with n processes; the speedup is calculated
as Sn = T1
Tn
; the error bars indicate standard deviations from 3
measurements.
63. 38/40
Learning Equilibrium Strategies
0 20 40 60 80 100
running time (h)
0
5
b
δ = 0.4
Approximate exploitability b
δ
0 20 40 60 80 100
running time (h)
0.0
0.5
1.0
Defender utility per episode
dfsp simulation dfsp digital twin upper bound oi,t 0 random defense
Learning curves obtained during training of dfsp to find optimal
(equilibrium) strategies in the intrusion response game; red and blue
curves relate to dfsp; black, orange and green curves relate to baselines.
64. 39/40
Comparison with NFSP
0 10 20 30 40 50 60 70 80
running time (h)
0.0
2.5
5.0
7.5
Approximate exploitability
dfsp nfsp
Learning curves obtained during training of dfsp and nfsp to find
optimal (equilibrium) strategies in the intrusion response game; the red
curve relate to dfsp and the purple curve relate to nfsp; all curves
show simulation results.
65. 40/40
Conclusions
I We develop a framework to
automatically learn security strategies.
I We apply the framework to an
intrusion response use case.
I We derive properties of optimal
security strategies.
I We evaluate strategies on a digital
twin.
I Questions → demonstration
s1,1 s1,2 s1,3 . . . s1,n
s2,1 s2,2 s2,3 . . . s2,n
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Emulation
Target
System
Model Creation
System Identification
Strategy Mapping
π
Selective
Replication
Strategy
Implementation π
Simulation
Learning