Self-learning Systems for Cyber Security [1]

Kim Hammar & Rolf Stadler
kimham@kth.se & stadler@kth.se
Division of Network and Systems Engineering
KTH Royal Institute of Technology

October 15, 2020

[1] Kim Hammar and Rolf Stadler. "Finding Effective Security Strategies through Reinforcement Learning and Self-Play". In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, Nov. 2020.
Game Learning Programs

[Figure: examples of game-learning programs]
Challenges: Evolving and Automated Attacks

Challenges:
- Evolving & automated attacks
- Complex infrastructures

[Figure: example network infrastructure (172.18.1.0/24)]
Goal: Automation and Learning

Challenges:
- Evolving & automated attacks
- Complex infrastructures

Our Goal:
- Automate security tasks
- Adapt to changing attack methods

[Figure: a security agent/controller observes the infrastructure (172.18.1.0/24) and takes actions on it]
Approach: Game Model & Reinforcement Learning

Challenges:
- Evolving & automated attacks
- Complex infrastructures

Our Goal:
- Automate security tasks
- Adapt to changing attack methods

Our Approach:
- Model network as Markov game MG = ⟨S, A1, ..., AN, T, R1, ..., RN⟩ (a code sketch follows below)
- Compute policies π for MG
- Incorporate π in self-learning systems

[Figure: control loop in which the security agent receives s_{t+1}, r_{t+1} and selects a_t ∼ π(a|s; θ) against the infrastructure (172.18.1.0/24)]
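To make the tuple concrete, here is a minimal sketch of how MG could be represented in code. This is our own illustration with assumed field names and finite state/action sets; the paper does not prescribe this structure:

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class MarkovGame:
    """Sketch of MG = ⟨S, A1, ..., AN, T, R1, ..., RN⟩ for N players."""
    states: List[int]                 # S, enumerated as integers
    actions: List[List[int]]          # A1..AN: one action set per player
    # T(s, (a1, ..., aN)) -> probability distribution over next states
    transition: Callable[[int, Sequence[int]], List[float]]
    # R1..RN: per-player reward functions r_i(s, (a1, ..., aN))
    rewards: List[Callable[[int, Sequence[int]], float]]
```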
Related Work

Game-Learning Programs:
- TD-Gammon [2], AlphaGo Zero [3], OpenAI Five, etc.
⇒ Impressive empirical results of RL and self-play

Network Security:
- Automated threat modeling [4], automated intrusion detection, etc.
⇒ Need for automation and better security tooling

Game Theory:
- Network Security: A Decision and Game-Theoretic Approach [5].
⇒ Many security operations involve strategic decision making

[2] Gerald Tesauro. "TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play". In: Neural Computation 6.2 (Mar. 1994), pp. 215–219. DOI: 10.1162/neco.1994.6.2.215.
[3] David Silver et al. "Mastering the game of Go without human knowledge". In: Nature 550 (Oct. 2017), pp. 354–359. URL: http://dx.doi.org/10.1038/nature24270.
[4] Pontus Johnson, Robert Lagerström, and Mathias Ekstedt. "A Meta Language for Threat Modeling and Attack Simulations". In: Proceedings of the 13th International Conference on Availability, Reliability and Security (ARES 2018). Hamburg, Germany: Association for Computing Machinery, 2018. DOI: 10.1145/3230833.3232799.
[5] Tansu Alpcan and Tamer Basar. Network Security: A Decision and Game-Theoretic Approach. 1st ed. Cambridge University Press, 2010.
Outline
Use Case
Markov Game Model for Intrusion Prevention
Reinforcement Learning Problem
Method
Results
Conclusions
Use Case: Intrusion Prevention

A Defender owns a network infrastructure:
- Consists of connected components
- Components run network services
- Defends by monitoring and patching

An Attacker seeks to intrude on the infrastructure:
- Has a partial view of the infrastructure
- Wants to compromise a specific component
- Attacks by reconnaissance and exploitation
Markov Game Model for Intrusion Prevention

(1) Network Infrastructure
[Figure: the network infrastructure being modeled]

(2) Graph G = ⟨N, E⟩, with designated nodes N_start and N_data

(3) State space: |S| = (w + 1)^(|N| · m · (m+1)) (size illustrated in the sketch below)
[Figure: example state; each node i holds an attacker vector S_i^A and a defender vector S_i^D, e.g. S_0^A = [0, 0, 0, 0], S_0^D = [0, 0, 0, 0, 0]; S_1^A = [0, 0, 0, 0], S_1^D = [1, 9, 9, 8, 1]; S_2^A = [0, 0, 0, 0], S_2^D = [9, 9, 9, 8, 1]; S_3^A = [0, 0, 0, 0], S_3^D = [9, 1, 5, 8, 1]]

(4) Action space: |A| = |N| · (m + 1)
[Figure: per-node attack/defense values, hidden values rendered as ?/0]

(5) Game Dynamics: T, R1, R2, ρ0
[Figure: partially revealed attack/defense values after some play, e.g. 4/0, 0/1, 8/0]

∴ The model is:
- a Markov game
- zero-sum
- 2 players
- partially observed
- with stochastic elements
- round-based

MG = ⟨S, A1, A2, T, R1, R2, γ, ρ0⟩
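To give a feel for these sizes, the following sketch evaluates both formulas. The parameter values (w = 9, |N| = 4, m = 4, chosen to match the example state vectors above) are our own illustrative assumptions, not fixed by the model:

```python
# Illustrative state/action space sizes for the Markov game model.
# Assumed example parameters: attack/defense values range over 0..w,
# |N| nodes in the graph, m attack types per node.
w = 9
num_nodes = 4   # |N|
m = 4

state_space_size = (w + 1) ** (num_nodes * m * (m + 1))   # |S|
action_space_size = num_nodes * (m + 1)                   # |A|

print(f"|S| ≈ {float(state_space_size):.1e}")  # 1.0e+80: far too large to enumerate
print(f"|A| = {action_space_size}")            # 20
```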
Automatic Learning of Security Strategies

[Figure: two-stage game tree; player 1 picks q1 ∈ {3, 4, 6}, then player 2 picks q2 ∈ {3, 4, 6}, with payoffs (u1, u2): q1 = 3: (18,18), (15,20), (9,18); q1 = 4: (20,15), (15,16), (8,12); q1 = 6: (18,9), (12,8), (0,0)]

Finding strategies for the Markov game model:
- Evolutionary methods
- Computational game theory (illustrated in the sketch below)
- Self-play reinforcement learning:
  - Attacker vs Defender
  - Strategies evolve without human intervention

Motivation for Reinforcement Learning:
- Strong empirical results in related work
- Can adapt to new attack methods and threats
- Can be used for complex domains that are hard to model exactly
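As a toy illustration of the computational-game-theory option, the two-stage game in the figure above can be solved by backward induction. The payoff numbers are those shown in the figure; the code is our own sketch, not the paper's method:

```python
# Backward induction on the two-stage game from the figure:
# player 1 picks q1, then player 2 picks q2; payoffs are (u1, u2).
payoffs = {
    3: {3: (18, 18), 4: (15, 20), 6: (9, 18)},
    4: {3: (20, 15), 4: (15, 16), 6: (8, 12)},
    6: {3: (18, 9), 4: (12, 8), 6: (0, 0)},
}

def best_reply(q1):
    # Player 2 maximizes its own payoff u2 given player 1's choice q1.
    return max(payoffs[q1], key=lambda q2: payoffs[q1][q2][1])

# Player 1 anticipates player 2's best reply and maximizes u1.
q1_star = max(payoffs, key=lambda q1: payoffs[q1][best_reply(q1)][0])
q2_star = best_reply(q1_star)
print(q1_star, q2_star, payoffs[q1_star][q2_star])
# -> subgame-perfect outcome: q1 = 6, q2 = 3, payoffs (18, 9)
```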
The Reinforcement Learning Problem

Goal:
- Approximate π* = argmax_π E[ ∑_{t=0}^{T} γ^t · r_{t+1} ]

Learning Algorithm:
- Represent π by πθ
- Define objective J(θ) = E_{o∼ρ^{πθ}, a∼πθ}[R]
- Maximize J(θ) by stochastic gradient ascent with gradient (estimated in the sketch below)
  ∇θ J(θ) = E_{o∼ρ^{πθ}, a∼πθ}[ ∇θ log πθ(a|o) · A^{πθ}(o, a) ]

Domain-Specific Challenges:
- Partial observability: captured in the model
- Large state space |S| = (w + 1)^(|N| · m · (m+1))
- Large action space |A| = |N| · (m + 1)
- Non-stationary environment due to the presence of an adversary

[Figure: the agent-environment loop; the agent takes action a_t, the environment returns s_{t+1} and r_{t+1}]
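The gradient above is what score-function methods such as REINFORCE estimate from sampled episodes. Below is a minimal PyTorch-style sketch; it is our own illustration, assuming a categorical policy, where `policy_net` is a hypothetical module mapping observations to action logits and the sampled return stands in for the advantage A^{πθ}(o, a):

```python
import torch
import torch.nn.functional as F

def reinforce_loss(policy_net, observations, actions, returns):
    """Negative Monte-Carlo estimate used for gradient ascent on J(θ):
    ∇θ J(θ) ≈ mean over t of ∇θ log πθ(a_t|o_t) · (G_t - baseline)."""
    log_probs = F.log_softmax(policy_net(observations), dim=-1)   # (T, |A|)
    # log πθ(a_t|o_t) for the actions actually taken
    taken = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    baseline = returns.mean()   # crude baseline for variance reduction
    return -((returns - baseline) * taken).mean()
```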
Our Reinforcement Learning Method

Policy Gradient & Function Approximation:
- To deal with the large state space S
- πθ parameterized by weights θ ∈ R^d of a neural network
- PPO & REINFORCE (stochastic π)

Auto-Regressive Policy Representation (sketched below):
- To deal with the large action space A
- To minimize interference
- π(a, n|o) = π(a|n, o) · π(n|o)

Opponent Pool (sketched below):
- To avoid overfitting
- Want agent to learn a general strategy

[Figure: self-play training loop; attacker π(a|s; θ2) and defender π(a|s; θ1) submit a joint action ⟨a_t^1, a_t^2⟩ to the Markov game (states s0, s1, s2, ...), receive s_{t+1}, r_{t+1}, and draw adversaries from an opponent pool]
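A minimal sketch of the auto-regressive factorization π(a, n|o) = π(a|n, o) · π(n|o): first sample the node n, then sample the action a conditioned on it. Here `node_net` and `action_net` are hypothetical network modules (our own names, not the paper's):

```python
import torch

def sample_joint_action(node_net, action_net, obs):
    # π(n|o): distribution over which node to act on
    node_dist = torch.distributions.Categorical(logits=node_net(obs))
    n = node_dist.sample()
    # π(a|n, o): distribution over actions, conditioned on the chosen node
    action_dist = torch.distributions.Categorical(logits=action_net(obs, n))
    a = action_dist.sample()
    # log π(a, n|o) = log π(a|n, o) + log π(n|o), used in the policy gradient
    log_prob = action_dist.log_prob(a) + node_dist.log_prob(n)
    return n, a, log_prob
```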
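And a minimal sketch of the opponent pool, assuming the opponent's policy is periodically snapshotted and a past version is sampled each episode (our own illustration of the idea):

```python
import copy
import random

class OpponentPool:
    """Stores snapshots of past opponent policies to train against,
    so the agent does not overfit to the opponent's latest strategy."""

    def __init__(self, max_size=50):
        self.pool = []
        self.max_size = max_size

    def add(self, policy):
        self.pool.append(copy.deepcopy(policy))   # freeze a snapshot
        if len(self.pool) > self.max_size:
            self.pool.pop(0)                      # evict the oldest snapshot

    def sample(self):
        return random.choice(self.pool)           # uniform over stored versions
```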
Experimentation: Learning from Zero Knowledge

[Figure: nine training curves of attacker win probability vs. training iteration (0-10000) for REINFORCE, PPO-AR, and PPO. Rows: AttackerAgent vs DefendMinimal (scenarios 1-3), DefenderAgent vs AttackMaximal (scenarios 1-3), and Attacker vs Defender self-play training (scenarios 1-3)]
Conclusion & Future Work

Conclusions:
- We have proposed a method to automatically find security strategies
  ⇒ Model the infrastructure as a Markov game & evolve strategies using self-play reinforcement learning
- Addressed domain-specific challenges with an auto-regressive policy, an opponent pool, and function approximation
- Challenges of applied reinforcement learning:
  - Stable convergence remains a challenge
  - Sample-efficiency is a problem
  - Generalization is a challenge

Current & Future Work:
- Study techniques for mitigating the identified RL challenges
- Learn security strategies by interaction with a cyber range
Thank you
All code for reproducing the results is open source:
https://github.com/Limmen/gym-idsgame

More Related Content

What's hot

Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep Learning
Sebastian Ruder
 
Stochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithmsStochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithms
Seonho Park
 
Information-theoretic clustering with applications
Information-theoretic clustering  with applicationsInformation-theoretic clustering  with applications
Information-theoretic clustering with applications
Frank Nielsen
 
Alpha Go: in few slides
Alpha Go: in few slidesAlpha Go: in few slides
Alpha Go: in few slides
Alessandro Cudazzo
 
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Mokhtar SELLAMI
 
Cari presentation maurice-tchoupe-joskelngoufo
Cari presentation maurice-tchoupe-joskelngoufoCari presentation maurice-tchoupe-joskelngoufo
Cari presentation maurice-tchoupe-joskelngoufo
Mokhtar SELLAMI
 
Linear Classifiers
Linear ClassifiersLinear Classifiers
Linear Classifiers
Alexander Jung
 
Data-Driven Recommender Systems
Data-Driven Recommender SystemsData-Driven Recommender Systems
Data-Driven Recommender Systems
recsysfr
 
Lecture4 xing
Lecture4 xingLecture4 xing
Lecture4 xing
Tianlu Wang
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
mooopan
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
recsysfr
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
Hidekazu Oiwa
 
Georgetown B-school Talk 2021
Georgetown B-school Talk  2021Georgetown B-school Talk  2021
Georgetown B-school Talk 2021
Charles Martin
 
從行動廣告大數據觀點談 Big data 20150916
從行動廣告大數據觀點談 Big data   20150916從行動廣告大數據觀點談 Big data   20150916
從行動廣告大數據觀點談 Big data 20150916
Craig Chao
 
Large scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using sparkLarge scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using spark
Mila, Université de Montréal
 
News from Mahout
News from MahoutNews from Mahout
News from Mahout
Ted Dunning
 
Sparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image AnnotationSparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image Annotation
Sean Moran
 
Predicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman networkPredicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman network
Kazuki Fujikawa
 
A Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningA Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement Learning
Giancarlo Frison
 
Support Vector Machine
Support Vector MachineSupport Vector Machine
Support Vector Machine
Lucas Xu
 

What's hot (20)

Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep Learning
 
Stochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithmsStochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithms
 
Information-theoretic clustering with applications
Information-theoretic clustering  with applicationsInformation-theoretic clustering  with applications
Information-theoretic clustering with applications
 
Alpha Go: in few slides
Alpha Go: in few slidesAlpha Go: in few slides
Alpha Go: in few slides
 
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
 
Cari presentation maurice-tchoupe-joskelngoufo
Cari presentation maurice-tchoupe-joskelngoufoCari presentation maurice-tchoupe-joskelngoufo
Cari presentation maurice-tchoupe-joskelngoufo
 
Linear Classifiers
Linear ClassifiersLinear Classifiers
Linear Classifiers
 
Data-Driven Recommender Systems
Data-Driven Recommender SystemsData-Driven Recommender Systems
Data-Driven Recommender Systems
 
Lecture4 xing
Lecture4 xingLecture4 xing
Lecture4 xing
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
 
Georgetown B-school Talk 2021
Georgetown B-school Talk  2021Georgetown B-school Talk  2021
Georgetown B-school Talk 2021
 
從行動廣告大數據觀點談 Big data 20150916
從行動廣告大數據觀點談 Big data   20150916從行動廣告大數據觀點談 Big data   20150916
從行動廣告大數據觀點談 Big data 20150916
 
Large scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using sparkLarge scale logistic regression and linear support vector machines using spark
Large scale logistic regression and linear support vector machines using spark
 
News from Mahout
News from MahoutNews from Mahout
News from Mahout
 
Sparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image AnnotationSparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image Annotation
 
Predicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman networkPredicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman network
 
A Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningA Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement Learning
 
Support Vector Machine
Support Vector MachineSupport Vector Machine
Support Vector Machine
 

Similar to Cdis hammar stadler_15_oct_2020

2019 GDRR: Blockchain Data Analytics - Dissecting Blockchain Price Analytics...
2019 GDRR: Blockchain Data Analytics  - Dissecting Blockchain Price Analytics...2019 GDRR: Blockchain Data Analytics  - Dissecting Blockchain Price Analytics...
2019 GDRR: Blockchain Data Analytics - Dissecting Blockchain Price Analytics...
The Statistical and Applied Mathematical Sciences Institute
 
Security of Artificial Intelligence
Security of Artificial IntelligenceSecurity of Artificial Intelligence
Security of Artificial Intelligence
Federico Cerutti
 
Second order traffic flow models on networks
Second order traffic flow models on networksSecond order traffic flow models on networks
Second order traffic flow models on networks
Guillaume Costeseque
 
Safety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdfSafety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdf
Polytechnique Montréal
 
Triggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphsTriggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphs
INSA Lyon - L'Institut National des Sciences Appliquées de Lyon
 
Second order traffic flow models on networks
Second order traffic flow models on networksSecond order traffic flow models on networks
Second order traffic flow models on networks
Guillaume Costeseque
 
DSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser Bootsma
DSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser BootsmaDSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser Bootsma
DSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser Bootsma
Deltares
 
MM framework for RL
MM framework for RLMM framework for RL
MM framework for RL
Sung Yub Kim
 
Traffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equationsTraffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equations
Guillaume Costeseque
 
Introduction to Algorithms
Introduction to AlgorithmsIntroduction to Algorithms
Introduction to Algorithms
Venkatesh Iyer
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
izahn
 
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesAccelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Matt Moores
 
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
The Statistical and Applied Mathematical Sciences Institute
 
Putting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReducePutting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReduce
Dmitrii Ignatov
 
Atomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cyclesAtomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cycles
IJERA Editor
 
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
The Statistical and Applied Mathematical Sciences Institute
 
Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities
Gael Varoquaux
 
Wireless Localization: Positioning
Wireless Localization: PositioningWireless Localization: Positioning
Wireless Localization: Positioning
Stefano Severi
 
Kk3517971799
Kk3517971799Kk3517971799
Kk3517971799
IJERA Editor
 
Second order traffic flow models on networks
Second order traffic flow models on networksSecond order traffic flow models on networks
Second order traffic flow models on networks
Guillaume Costeseque
 

Similar to Cdis hammar stadler_15_oct_2020 (20)

2019 GDRR: Blockchain Data Analytics - Dissecting Blockchain Price Analytics...
2019 GDRR: Blockchain Data Analytics  - Dissecting Blockchain Price Analytics...2019 GDRR: Blockchain Data Analytics  - Dissecting Blockchain Price Analytics...
2019 GDRR: Blockchain Data Analytics - Dissecting Blockchain Price Analytics...
 
Security of Artificial Intelligence
Security of Artificial IntelligenceSecurity of Artificial Intelligence
Security of Artificial Intelligence
 
Second order traffic flow models on networks
Second order traffic flow models on networksSecond order traffic flow models on networks
Second order traffic flow models on networks
 
Safety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdfSafety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdf
 
Triggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphsTriggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphs
 
Second order traffic flow models on networks
Second order traffic flow models on networksSecond order traffic flow models on networks
Second order traffic flow models on networks
 
DSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser Bootsma
DSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser BootsmaDSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser Bootsma
DSD-INT 2018 Work with iMOD MODFLOW models in Python - Visser Bootsma
 
MM framework for RL
MM framework for RLMM framework for RL
MM framework for RL
 
Traffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equationsTraffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equations
 
Introduction to Algorithms
Introduction to AlgorithmsIntroduction to Algorithms
Introduction to Algorithms
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
 
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesAccelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
 
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
 
Putting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReducePutting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReduce
 
Atomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cyclesAtomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cycles
 
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
 
Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities
 
Wireless Localization: Positioning
Wireless Localization: PositioningWireless Localization: Positioning
Wireless Localization: Positioning
 
Kk3517971799
Kk3517971799Kk3517971799
Kk3517971799
 
Second order traffic flow models on networks
Second order traffic flow models on networksSecond order traffic flow models on networks
Second order traffic flow models on networks
 

More from Kim Hammar

Automated Intrusion Response - CDIS Spring Conference 2024
Automated Intrusion Response - CDIS Spring Conference 2024Automated Intrusion Response - CDIS Spring Conference 2024
Automated Intrusion Response - CDIS Spring Conference 2024
Kim Hammar
 
Automated Security Response through Online Learning with Adaptive Con jectures
Automated Security Response through Online Learning with Adaptive Con jecturesAutomated Security Response through Online Learning with Adaptive Con jectures
Automated Security Response through Online Learning with Adaptive Con jectures
Kim Hammar
 
Självlärande System för Cybersäkerhet. KTH
Självlärande System för Cybersäkerhet. KTHSjälvlärande System för Cybersäkerhet. KTH
Självlärande System för Cybersäkerhet. KTH
Kim Hammar
 
Learning Automated Intrusion Response
Learning Automated Intrusion ResponseLearning Automated Intrusion Response
Learning Automated Intrusion Response
Kim Hammar
 
Intrusion Tolerance for Networked Systems through Two-level Feedback Control
Intrusion Tolerance for Networked Systems through Two-level Feedback ControlIntrusion Tolerance for Networked Systems through Two-level Feedback Control
Intrusion Tolerance for Networked Systems through Two-level Feedback Control
Kim Hammar
 
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
Kim Hammar
 
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Kim Hammar
 
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Kim Hammar
 
Learning Optimal Intrusion Responses via Decomposition
Learning Optimal Intrusion Responses via DecompositionLearning Optimal Intrusion Responses via Decomposition
Learning Optimal Intrusion Responses via Decomposition
Kim Hammar
 
Digital Twins for Security Automation
Digital Twins for Security AutomationDigital Twins for Security Automation
Digital Twins for Security Automation
Kim Hammar
 
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
Kim Hammar
 
Självlärande system för cyberförsvar.
Självlärande system för cyberförsvar.Självlärande system för cyberförsvar.
Självlärande system för cyberförsvar.
Kim Hammar
 
Intrusion Response through Optimal Stopping
Intrusion Response through Optimal StoppingIntrusion Response through Optimal Stopping
Intrusion Response through Optimal Stopping
Kim Hammar
 
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
Kim Hammar
 
Self-Learning Systems for Cyber Defense
Self-Learning Systems for Cyber DefenseSelf-Learning Systems for Cyber Defense
Self-Learning Systems for Cyber Defense
Kim Hammar
 
Self-learning Intrusion Prevention Systems.
Self-learning Intrusion Prevention Systems.Self-learning Intrusion Prevention Systems.
Self-learning Intrusion Prevention Systems.
Kim Hammar
 
Learning Security Strategies through Game Play and Optimal Stopping
Learning Security Strategies through Game Play and Optimal StoppingLearning Security Strategies through Game Play and Optimal Stopping
Learning Security Strategies through Game Play and Optimal Stopping
Kim Hammar
 
Intrusion Prevention through Optimal Stopping
Intrusion Prevention through Optimal StoppingIntrusion Prevention through Optimal Stopping
Intrusion Prevention through Optimal Stopping
Kim Hammar
 
Intrusion Prevention through Optimal Stopping and Self-Play
Intrusion Prevention through Optimal Stopping and Self-PlayIntrusion Prevention through Optimal Stopping and Self-Play
Intrusion Prevention through Optimal Stopping and Self-Play
Kim Hammar
 
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
Kim Hammar
 

More from Kim Hammar (20)

Automated Intrusion Response - CDIS Spring Conference 2024
Automated Intrusion Response - CDIS Spring Conference 2024Automated Intrusion Response - CDIS Spring Conference 2024
Automated Intrusion Response - CDIS Spring Conference 2024
 
Automated Security Response through Online Learning with Adaptive Con jectures
Automated Security Response through Online Learning with Adaptive Con jecturesAutomated Security Response through Online Learning with Adaptive Con jectures
Automated Security Response through Online Learning with Adaptive Con jectures
 
Självlärande System för Cybersäkerhet. KTH
Självlärande System för Cybersäkerhet. KTHSjälvlärande System för Cybersäkerhet. KTH
Självlärande System för Cybersäkerhet. KTH
 
Learning Automated Intrusion Response
Learning Automated Intrusion ResponseLearning Automated Intrusion Response
Learning Automated Intrusion Response
 
Intrusion Tolerance for Networked Systems through Two-level Feedback Control
Intrusion Tolerance for Networked Systems through Two-level Feedback ControlIntrusion Tolerance for Networked Systems through Two-level Feedback Control
Intrusion Tolerance for Networked Systems through Two-level Feedback Control
 
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
 
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
 
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
 
Learning Optimal Intrusion Responses via Decomposition
Learning Optimal Intrusion Responses via DecompositionLearning Optimal Intrusion Responses via Decomposition
Learning Optimal Intrusion Responses via Decomposition
 
Digital Twins for Security Automation
Digital Twins for Security AutomationDigital Twins for Security Automation
Digital Twins for Security Automation
 
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
 
Självlärande system för cyberförsvar.
Självlärande system för cyberförsvar.Självlärande system för cyberförsvar.
Självlärande system för cyberförsvar.
 
Intrusion Response through Optimal Stopping
Intrusion Response through Optimal StoppingIntrusion Response through Optimal Stopping
Intrusion Response through Optimal Stopping
 
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
 
Self-Learning Systems for Cyber Defense
Self-Learning Systems for Cyber DefenseSelf-Learning Systems for Cyber Defense
Self-Learning Systems for Cyber Defense
 
Self-learning Intrusion Prevention Systems.
Self-learning Intrusion Prevention Systems.Self-learning Intrusion Prevention Systems.
Self-learning Intrusion Prevention Systems.
 
Learning Security Strategies through Game Play and Optimal Stopping
Learning Security Strategies through Game Play and Optimal StoppingLearning Security Strategies through Game Play and Optimal Stopping
Learning Security Strategies through Game Play and Optimal Stopping
 
Intrusion Prevention through Optimal Stopping
Intrusion Prevention through Optimal StoppingIntrusion Prevention through Optimal Stopping
Intrusion Prevention through Optimal Stopping
 
Intrusion Prevention through Optimal Stopping and Self-Play
Intrusion Prevention through Optimal Stopping and Self-PlayIntrusion Prevention through Optimal Stopping and Self-Play
Intrusion Prevention through Optimal Stopping and Self-Play
 
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
 

Recently uploaded

Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
FODUU
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
Techgropse Pvt.Ltd.
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 

Recently uploaded (20)

Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 

Cdis hammar stadler_15_oct_2020

  • 1. Self-learning Systems for Cyber Security1 Kim Hammar & Rolf Stadler kimham@kth.se & stadler@kth.se Division of Network and Systems Engineering KTH Royal Institute of Technology October 15, 2020 1 Kim Hammar and Rolf Stadler. “Finding Effective Security Strategies through Reinforcement Learning and Self-Play”. In: International Conference on Network and Service Management (CNSM 2020) (CNSM 2020). Izmir, Turkey, Nov. 2020. Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 1 / 12
  • 2. Game Learning Programs Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 2 / 12
  • 3. Challenges: Evolving and Automated Attacks Challenges: Evolving & automated attacks Complex infrastructures 172.18.1.0/24 Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 3 / 12
  • 4. Goal: Automation and Learning Challenges Evolving & automated attacks Complex infrastructures Our Goal: Automate security tasks Adapt to changing attack methods Security Agent Controller Observations Actions 172.18.1.0/24 Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 3 / 12
  • 5. Approach: Game Model & Reinforcement Learning Challenges: Evolving & automated attacks Complex infrastructures Our Goal: Automate security tasks Adapt to changing attack methods Our Approach: Model network as Markov Game MG = hS, A1, . . . , AN , T , R1, . . . RN i Compute policies π for MG Incorporate π in self-learning systems Security Agent Controller st+1, rt+1 at π(a|s; θ) 172.18.1.0/24 Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 3 / 12
  • 6. Related Work Game-Learning Programs: TD-Gammon2 , AlphaGo Zero3 , OpenAI Five etc. =⇒ Impressive empirical results of RL and self-play Network Security: Automated threat modeling4 , automated intrusion detection etc. =⇒ Need for automation and better security tooling Game Theory: Network Security: A Decision and Game-Theoretic Approach5 . =⇒ Many security operations involves strategic decision making 2 Gerald Tesauro. “TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play”. In: Neural Comput. 6.2 (Mar. 1994), 215–219. ISSN: 0899-7667. DOI: 10.1162/neco.1994.6.2.215. URL: https://doi.org/10.1162/neco.1994.6.2.215. 3 David Silver et al. “Mastering the game of Go without human knowledge”. In: Nature 550 (Oct. 2017), pp. 354–. URL: http://dx.doi.org/10.1038/nature24270. 4 Pontus Johnson, Robert Lagerström, and Mathias Ekstedt. “A Meta Language for Threat Modeling and Attack Simulations”. In: Proceedings of the 13th International Conference on Availability, Reliability and Security. ARES 2018. Hamburg, Germany: Association for Computing Machinery, 2018. ISBN: 9781450364485. DOI: 10.1145/3230833.3232799. URL: https://doi.org/10.1145/3230833.3232799. 5 Tansu Alpcan and Tamer Basar. Network Security: A Decision and Game-Theoretic Approach. 1st. USA: Cambridge University Press, 2010. ISBN: 0521119324. Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 4 / 12
  • 7. Outline Use Case Markov Game Model for Intrusion Prevention Reinforcement Learning Problem Method Results Conclusions Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 5 / 12
  • 8. Use Case: Intrusion Prevention A Defender owns a network infrastructure Consists of connected components Components run network services Defends by monitoring and patching An Attacker seeks to intrude on the infrastructure Has a partial view of the infrastructure Wants to compromise a specific component Attacks by reconnaissance and exploitation Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 6 / 12
  • 9. Markov Game Model for Intrusion Prevention (1) Network Infrastructure Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 7 / 12
  • 10. Markov Game Model for Intrusion Prevention (1) Network Infrastructure (2) Graph G = hN, Ei Ndata Nstart Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 7 / 12
  • 11. Markov Game Model for Intrusion Prevention (1) Network Infrastructure (2) Graph G = hN, Ei Ndata Nstart (3) State space |S| = (w + 1)|N|·m·(m+1) SA 3 = [0, 0, 0, 0] SD 3 = [9, 1, 5, 8, 1] SA 0 = [0, 0, 0, 0] SD 0 = [0, 0, 0, 0, 0] SA 1 = [0, 0, 0, 0] SD 1 = [1, 9, 9, 8, 1] SA 2 = [0, 0, 0, 0] SD 2 = [9, 9, 9, 8, 1] Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 7 / 12
  • 12. Markov Game Model for Intrusion Prevention (1) Network Infrastructure (2) Graph G = hN, Ei Ndata Nstart (3) State space |S| = (w + 1)|N|·m·(m+1) SA 3 = [0, 0, 0, 0] SD 3 = [9, 1, 5, 8, 1] SA 0 = [0, 0, 0, 0] SD 0 = [0, 0, 0, 0, 0] SA 1 = [0, 0, 0, 0] SD 1 = [1, 9, 9, 8, 1] SA 2 = [0, 0, 0, 0] SD 2 = [9, 9, 9, 8, 1] (4) Action space |A| = |N| · (m + 1) ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 7 / 12
  • 13. Markov Game Model for Intrusion Prevention (1) Network Infrastructure (2) Graph G = hN, Ei Ndata Nstart (3) State space |S| = (w + 1)|N|·m·(m+1) SA 3 = [0, 0, 0, 0] SD 3 = [9, 1, 5, 8, 1] SA 0 = [0, 0, 0, 0] SD 0 = [0, 0, 0, 0, 0] SA 1 = [0, 0, 0, 0] SD 1 = [1, 9, 9, 8, 1] SA 2 = [0, 0, 0, 0] SD 2 = [9, 9, 9, 8, 1] (4) Action space |A| = |N| · (m + 1) ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 (5) Game Dynamics T , R1, R2, ρ0 4/0 0/1 8/0 ?/0 ?/0 ?/0 0/0 9/0 8/0 Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 7 / 12
  • 14. Markov Game Model for Intrusion Prevention (1) Network Infrastructure (2) Graph G = hN, Ei Ndata Nstart (3) State space |S| = (w + 1)|N|·m·(m+1) SA 3 = [0, 0, 0, 0] SD 3 = [9, 1, 5, 8, 1] SA 0 = [0, 0, 0, 0] SD 0 = [0, 0, 0, 0, 0] SA 1 = [0, 0, 0, 0] SD 1 = [1, 9, 9, 8, 1] SA 2 = [0, 0, 0, 0] SD 2 = [9, 9, 9, 8, 1] (4) Action space |A| = |N| · (m + 1) ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 ?/0 (5) Game Dynamics T , R1, R2, ρ0 4/0 0/1 8/0 ?/0 ?/0 ?/0 0/0 9/0 8/0 ∴ - Markov game - Zero-sum - 2 players - Partially observed - Stochastic elements - Round-based MG = hS, A1, A2, T , R1, R2, γ, ρ0i Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 7 / 12
  • 15. Automatic Learning of Security Strategies 1 2 2 2 q1 = 3 q2 = 4 q3 = 6 (18,18) (15,20) (9,18) q2 = 3 q2 = 4 q2 = 6 (20,15) (15,16) (8,12) q2 = 3 q2 = 4 q2 = 6 (18,9) (12,8) (0,0) q2 = 3 q2 = 4 q2 = 6 Finding strategies for the Markov game model: Evolutionary methods Computational game theory Self-Play Reinforcement learning Attacker vs Defender Strategies evolve without human intervention Motivation for Reinforcement Learning: Strong empirical results in related work Can adapt to new attack methods and threats Can be used for complex domains that are hard to model exactly Kim Hammar & Rolf Stadler (KTH) CDIS Workshop October 15, 2020 8 / 12
  • 16. The Reinforcement Learning Problem
    Goal: approximate π∗ = arg max_π E[∑_{t=0}^T γ^t r_{t+1}]
    Learning algorithm:
      – Represent π by a parameterized policy π_θ
      – Define the objective J(θ) = E_{o∼ρ^{π_θ}, a∼π_θ}[R]
      – Maximize J(θ) by stochastic gradient ascent with gradient
        ∇_θ J(θ) = E_{o∼ρ^{π_θ}, a∼π_θ}[∇_θ log π_θ(a|o) A^{π_θ}(o, a)]
        (a REINFORCE-style update implementing this gradient is sketched below)
    Domain-specific challenges:
      – Partial observability: captured in the model
      – Large state space |S| = (w + 1)^(|N|·m·(m+1))
      – Large action space |A| = |N| · (m + 1)
      – Non-stationary environment due to the presence of an adversary
    [Figure: agent-environment loop: the agent takes action a_t, the environment returns s_{t+1} and r_{t+1}]
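As an illustration of the gradient above, here is a minimal REINFORCE-style update in PyTorch, assuming a small feed-forward policy and using the Monte Carlo return G_t as a stand-in for the advantage A^{π_θ}(o, a); obs_dim and num_actions are placeholder dimensions, not values from the paper.

```python
import torch
import torch.nn as nn

obs_dim, num_actions = 20, 20  # placeholder dimensions
policy = nn.Sequential(
    nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, num_actions))
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)


def reinforce_update(observations, actions, returns):
    """One gradient-ascent step on J(θ) from a batch of rollouts.

    observations: float tensor [batch, obs_dim]
    actions:      long tensor [batch], indices of the taken actions
    returns:      float tensor [batch], discounted returns G_t
    """
    # log πθ(a|o) for the visited (o, a) pairs
    log_probs = torch.log_softmax(policy(observations), dim=-1)
    taken = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    # Negate because optimizers minimize; G_t approximates A(o, a)
    loss = -(taken * returns).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```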
  • 19. Our Reinforcement Learning Method
    Policy gradient & function approximation:
      – To deal with the large state space S
      – π_θ is parameterized by the weights θ ∈ R^d of a neural network
      – PPO & REINFORCE (stochastic π)
    Auto-regressive policy representation (sketched after this slide):
      – To deal with the large action space A and to minimize interference
      – π(a, n|o) = π(a|n, o) · π(n|o)
    Opponent pool:
      – To avoid overfitting: we want the agent to learn a general strategy
    [Figure: attacker π(a|s; θ_2) and defender π(a|s; θ_1) play the Markov game (states s_0, s_1, s_2, …), receiving s_{t+1} and r_{t+1} for the joint action ⟨a_t^1, a_t^2⟩, with opponents drawn from an opponent pool]
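A sketch of the auto-regressive factorization in PyTorch follows: a node n is sampled from π(n|o), then an attribute a is sampled from π(a|n, o), conditioned on the chosen node via a one-hot encoding. The network sizes and the conditioning scheme are illustrative assumptions, not the exact architecture from the paper.

```python
import torch
import torch.nn as nn


class AutoRegressivePolicy(nn.Module):
    """Factorizes π(a, n|o) = π(n|o) · π(a|n, o) over nodes and attributes."""

    def __init__(self, obs_dim: int, num_nodes: int, num_attributes: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.node_head = nn.Linear(64, num_nodes)                   # π(n|o)
        self.attr_head = nn.Linear(64 + num_nodes, num_attributes)  # π(a|n, o)

    def forward(self, obs: torch.Tensor):
        h = self.encoder(obs)
        node_dist = torch.distributions.Categorical(logits=self.node_head(h))
        n = node_dist.sample()
        # Condition the attribute choice on the sampled node
        n_onehot = nn.functional.one_hot(n, self.node_head.out_features).float()
        attr_dist = torch.distributions.Categorical(
            logits=self.attr_head(torch.cat([h, n_onehot], dim=-1)))
        a = attr_dist.sample()
        # Joint log-probability: log π(n|o) + log π(a|n, o)
        return n, a, node_dist.log_prob(n) + attr_dist.log_prob(a)
```

Factorizing this way keeps each output head small (|N| and m + 1 logits, respectively) instead of one flat head over all |N|·(m + 1) node-attribute pairs, which is one way to read the slide's point about minimizing interference.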
  • 22. Experimentation: Learning from Zero Knowledge
    [Figure: nine training curves of attacker win probability over 10,000 iterations, comparing REINFORCE, PPO-AR, and PPO across scenarios 1–3 in three settings: AttackerAgent vs. DefendMinimal, DefenderAgent vs. AttackMaximal, and attacker-vs-defender self-play training]
  • 24. Conclusion & Future Work
    Conclusions:
      – We have proposed a method to automatically find security strategies: model the network as a Markov game and evolve strategies using self-play reinforcement learning
      – We addressed the domain-specific challenges with an auto-regressive policy, an opponent pool, and function approximation
      – Challenges of applied reinforcement learning remain: stable convergence, sample efficiency, and generalization
    Current & future work:
      – Study techniques for mitigating the identified RL challenges
      – Learn security strategies through interaction with a cyber range
  • 25. Thank you
    All code for reproducing the results is open source: https://github.com/Limmen/gym-idsgame