SlideShare a Scribd company logo
1 of 48
Download to read offline
Automated Cloud-Native Incident Response with
Kubernetes and Service Mesh
Matt Turner
@mt165
Francesco Beltramini
@d1gital_f
@controlplaneio
Who we are
Engineer @ Tetrate
● Software Engineer by background
● Long-time Istio user
● Author of LinkedIn’s advanced K8s
course
Tetrate (Booth S98)
The Enterprise Service Mesh Company
● Founded in 2018. 120 people across 14
timezones
● Helping folks manage Istio at scale
● Providing security and compliance in
high-assurance environments
● Enabling cloud migrations and hybrid-cloud
setups
@controlplaneio
Who we are
ControlPlane (Hall 5 - Booth SU57)
Cloud Native and Open Source security consultancy
● Established in 2017
52 people across the UK, Europe, APAC and North America
● Security specialists in cloud, Kubernetes, containers, and
Open Source (we train too!)
● Focused on deeply “Threat Model-ed”, Secure-by-Design and
Secure-by-Default Cloud Native architectures
● Accustomed to work in highly-regulated environments
● Help customers bridging the gap between infra and SecOps
Security Engineering Manager @ ControlPlane
● Lots of studying
● Lots of IT/OT Security
● Lots of Security Ops
@controlplaneio
Agenda
● Security Incident Response 101
● Intelligence-driven defence (Kill Chain) and SOAR
● Cloud Native tech and concepts through a incident response lens
● Cloud Native response walkthrough
@controlplaneio
Security Incident Response 101
Incident:
“An event that could lead to the loss of, or disruption to, an organization's operations,
data, services or functions”.
“A security incident is an event that may indicate that an organization's systems or data
have been compromised, or that measures put in place to protect them have failed.”
Reponse:
A set of People, Process, Technology to identify, contain, eliminate and recover from
such events.
@controlplaneio
PEOPLE PROCESS TECHNOLOGY
Security Incident Response 101
Security Analysts
Security Engineers
Forensics
Managers
Define runbooks
Threat Intel dissemination
Assets isolation
Evidence gathering
Stakeholders comms
Sensors (IPS, EDR, ...)
CN Sensors (Falco, CloudTrail,
VPC Flowlogs...)
Log collection and processing
(SIEM)
Automation tech
@controlplaneio
Matt’s Security Glossary
● IoC - Indicator of Compromise - anything that points to the attack; payload, pwnd workload syscall profile,
etc, ...
● SOC - Security Operations Center - where the security response team sits
● Signal - anything security-related that we monitor
● SIEM - Security Information and Event Management - security alerts dashboard
● SOAR - Security Orchestration, Automation, and Response - workflow engine containing playbooks for
scripted incident response
@controlplaneio
Security Incident Response 101
Existing frameworks
SP 800-61 Rev. 2
@controlplaneio
NIST Incident Response Framework
Glossary
● Detection - watch for signals in SEIM
● Analysis - look at alert, fetch IoCs, if real,
○ Identify attack payload
○ Produce IoC checksum
● Containment - prevent further attacks / limit blast radius
○ E.g. Contain the attack to where it is, by removing its
potency - limit blast radius
○ E.g. deploy new firewall rules / feed Headers to WAF
● Eradication - clean-up anything which was compromised
● Recovery - restore normal service
@controlplaneio
Intelligence-driven Defense
Reactive event-driven approach insufficient against motivated adversaries.
Incident Response must adopt a Kill (attack) Chain perspective:
● Step-by-step approach that identifies and stops enemy activity.
● It no longer needs to be a purely reactive process.
● Implements intent-based response, behavior-based detection to get a step
ahead of adversaries.
● Critical to have the right Intelligence [Indicators of Compromise (IoC)].
Security Incident Response 101
@controlplaneio
Intelligence-driven Defense
Cyber Kill Chain
@controlplaneio
Intelligence-driven Defense
Tactics
Techniques
@controlplaneio
“If you know the enemy and know yourself, you need not
fear the result of a hundred battles. [...]”
Intelligence-driven Defense
Security Incident Response 101
“Know your enemy!”
@controlplaneio
Intelligence-driven Defense
SOAR
Security Orchestration, Automation and Response
Tech stack that enables an organization to collect data about security threats and
respond to security events with little or no human assistance. ADOPT KILLCHAIN.
Threat and vulnerability management
Security incident response
Security operations automation
@controlplaneio
Challenges:
● Complex!
● Reaction time is critical
● Technology interoperability
● Limited automation at times
Recap: Security Incident Response
@controlplaneio
Container? What container?
More challenges:
● Relatively new!
● Skills gap (fast-paced)
● Get observability right
● Deal with volatility / scaling
● Integration with teams’ practices
○ Infra provisioning
○ DevOps pipelines
Forensics
CN Security Incident Response
@controlplaneio
Pro:
● Advanced platform capabilities
● Automation
● GitOps
○ Audit trails
○ Reproducibility & Determinism
○ High privileged infra ops without high privs given to users
CN Security Incident Response
@controlplaneio
1) Rescheduling / Application recovery
2) Support for Custom Operators
3) GitOps workflows
4) RunTime Class / hardened runtimes
5) Ephemeral containers
6) Checkpointing (alpha) and Checkpoint/Restore In Userspace (CRIU)
CN Security Incident Response
Kubernetes benefits
@controlplaneio
1) Layer 7 networking - protocols parsed and understood
2) All traffic intercepted by the sidecar proxy - policy enforced at every hop
3) Full metadata and body logging
4) Fine-grained traffic control, e.g. based on L7 attributes of src / dst
CN Security Incident Response
Istio benefits
@controlplaneio
Cloud Native Incident Response Framework
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication and [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
Coming soon: Kubernetes 4 SOC threat library
● Crafted by CP-friend Abdullah Garcia @ JPMC
(@abdullahgarcia)
● Fused with content from ControlPlane’s internal threat libraries
#1 Preparation
Security Observability
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
● Envoy sends traffic logs to SIEM
● Detect anomaly via SIEM and confirmed unknown and suspicious
● Events could be generated by Cloud Native but also traditional sensors like firewalls or EDR
agents
● Escalate to SOAR and activate Cloud Native runbook
#2 Detection, Constraint, and Analysis
Lookup: ldap://evil.com/evilpayload
IR Activation
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
#2 Detection, Constraint, and Analysis
Preventative containment: buying time
Suspect Pod
● Freeze orchestration
○ Won’t get scaled down, updated, etc
● Block east-west network traffic
○ Attack can’t move laterally
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
Suspect Pod
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Freeze orchestration => Remove from its Deployment
kubectl patch pod ${pod} --type json --patch '[{ "op": "replace",
"path": "/metadata/labels/app", "value": "'${app}-isolated'" }]'
Cloud-native win: not disruptive
@controlplaneio
Suspect Pod
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Block east-west network traffic => Istio AuthorizationPolicy
@controlplaneio
Suspect Pod
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Block east-west network traffic => Istio AuthorizationPolicy
@controlplaneio
#2 Detection, Constraint, and Analysis
Preventative containment: buying time
Remaining workloads
● Respawn in hardened container runtime
Cloud-native win: none of this is disruptive to overall service
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Remaining Workloads
Respawn in hardened container runtime => gVisor
kubectl patch deployment ${dep} --type json --patch '[{ "op":
"replace", "path": "/spec/template/spec/runtimeClassName",
"value": "'gvisor'" }]'
Cloud-native win: not disruptive (if tested properly)
@controlplaneio
Investigation stations
Suspect Pod
● Verbosely log north-south traffic
○ Lets the attack continue
○ Gather IoC - to detect the attack in future
○ Possibly gather C2 addresses, exfil data, etc
● Container checkpoint
● Forensic tools
#2 Detection, Constraint, and Analysis
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
Suspect Pod
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Verbosely log network traffic => Envoy WASM filter
@controlplaneio
Suspect Pod
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Verbosely log network traffic => Envoy WASM filter
@controlplaneio
Container Checkpointing
● Uses CRIU - Linux “Checkpoint/Restore In Userspace”
● KEP-2008
● Supported by CRI-O
● Under discussion by containerd (PR 6965)
● Very alpha
● Imperative
https://criu.org
https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/
https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/
https://github.com/containerd/containerd/pull/6965
https://github.com/kubernetes/enhancements/issues/2008
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Suspect Pod
@controlplaneio
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Suspect Pod
NET
PROC
Forensic tools => ephemeral debug container
kubectl debug -ti ${pod} --image=busybox --target=http-log
@controlplaneio
Investigation stations
Remaining workloads
● Verbose logging of network traffic
○ Same implementation again - might see IoC / attack IPs here
#2 Detection, Constraint, and Analysis
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
#2 Detection, Constraint, and Analysis - Cloud-Native Implementation
Remaining Workloads
Verbosely log network traffic => Envoy WASM filter
Same implementation as above
@controlplaneio
● Confirm true positive
● Keep harvesting IOCs (headers, body)
#2 Detection, Constraint, and Analysis
Confirm
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
● Re-configure firewall / WAF (PEPs - Policy Enforcement Points)
● Firewall is the Envoy sidecar
○ Now Layer 7
● WAF is Envoy plugin Coraza (https://github.com/corazawaf/coraza)
○ mod_security rules compatible
○ Needed for blocking bodies
● Present at every workload / hop
● Configure WAF to block identified payloads and malicious requests
● Maybe also block attacking / C2 IPs at the perimeter
Containment / Response strategy: KILL CHAIN
#3 Containment, Eradication, and Recovery
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
● Delete the definitely-compromised Pod(s) (now orphaned from Deployment)
● Now the attack is blocked and can’t reoccur…
● Restart the remaining workloads just in case
Clean up
#3 Containment, Eradication, and Recovery
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
Delete Pod => Imperative step
kubectl delete pod ${pod}
Compromised Pod
#3 Containment, Eradication, and Recovery - Cloud-Native Implementation
@controlplaneio
● Nothing altered service levels
Cloud-native win: probably not needed
Restore normal service
#3 Containment, Eradication, and [Recovery]
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint, Analysis
Step #3: Containment, Eradication, [Recovery]
Step #4: Post-Incident Activity
@controlplaneio
Restart all Pods =>
● Restore their runtimeClass, serviceAccount, etc
● Side-effect of doing a rolling update
Remaining Workloads
#3 Containment, Eradication, and Recovery - Cloud-Native Implementation
@controlplaneio
● Response actions assumed to be by human or SOAR runbooks
● SOAR isn’t ideal
○ Probably won’t have first-class support for k8s API
○ Shouldn’t have cluster-admin access
Automation
Sharpening the tools
@controlplaneio
“Response” - https://github.com/mt-inside/response
● Generator of YAMLs, issuer of Commands
● Would be nice if more things were declarative
○ DebugContainer resource
○ ContainerCheckpoint resource, like VolumeSnapshot*
Automation
Sharpening the tools
@controlplaneio
Automation
Sharpening the tools
@controlplaneio
“Response” - https://github.com/mt-inside/response
● Also an Operator
● GitOps - SOAR commits to git
○ Fits with existing CD pipeline
○ Audit log of responses from git history
○ SOAR doesn’t need any cluster access, let alone admin
● Can retry the imperative commands
Automation
Sharpening the tools
CRD
REPO
SOAR OP
@controlplaneio
Cloud-Native Incident Response Framework
Cloud Native Incident Response Steps
Step #1: Preparation
Step #2: Detection, Constraint and Analysis
Step #3: Containment, Eradication and [Recovery]
Step #4: Post-Incident Activity
Initial considerations:
● Fast!
● Can indeed reduce initial threats impact
● Requires careful considerations, very workload-dependent
● Probably not for all Organizations
@controlplaneio
Free Book!
Attacking clusters since 2017
@controlplaneio
Q&A
Thank you!
Feedback:

More Related Content

Similar to Automated Cloud-Native Incident Response with Kubernetes and Service Mesh

.conf Go Zurich 2022 - Security Session
.conf Go Zurich 2022 - Security Session.conf Go Zurich 2022 - Security Session
.conf Go Zurich 2022 - Security SessionSplunk
 
Cncf checkov and bridgecrew
Cncf checkov and bridgecrewCncf checkov and bridgecrew
Cncf checkov and bridgecrewLibbySchulze
 
Security and Advanced Automation in the Enterprise
Security and Advanced Automation in the EnterpriseSecurity and Advanced Automation in the Enterprise
Security and Advanced Automation in the EnterpriseAmazon Web Services
 
Shift Left Security
Shift Left SecurityShift Left Security
Shift Left SecurityBATbern
 
AWS Cloud Governance & Security through Automation - Atlanta AWS Builders
AWS Cloud Governance & Security through Automation - Atlanta AWS BuildersAWS Cloud Governance & Security through Automation - Atlanta AWS Builders
AWS Cloud Governance & Security through Automation - Atlanta AWS BuildersJames Strong
 
Security operation center (SOC)
Security operation center (SOC)Security operation center (SOC)
Security operation center (SOC)Ahmed Ayman
 
Avoiding the security brick
Avoiding the security brickAvoiding the security brick
Avoiding the security brickEqual Experts
 
Building world-class security response and secure development processes
Building world-class security response and secure development processesBuilding world-class security response and secure development processes
Building world-class security response and secure development processesDavid Jorm
 
Using Splunk or ELK for Auditing AWS/GCP/Azure Security posture
Using Splunk or ELK for Auditing AWS/GCP/Azure Security postureUsing Splunk or ELK for Auditing AWS/GCP/Azure Security posture
Using Splunk or ELK for Auditing AWS/GCP/Azure Security postureCloudVillage
 
Using Splunk/ELK for auditing AWS/GCP/Azure security posture
Using Splunk/ELK for auditing AWS/GCP/Azure security postureUsing Splunk/ELK for auditing AWS/GCP/Azure security posture
Using Splunk/ELK for auditing AWS/GCP/Azure security postureJose Hernandez
 
Securing Your Containers is Not Enough: How to Encrypt Container Data
Securing Your Containers is Not Enough: How to Encrypt Container DataSecuring Your Containers is Not Enough: How to Encrypt Container Data
Securing Your Containers is Not Enough: How to Encrypt Container DataMirantis
 
For Business's Sake, Let's focus on AppSec
For Business's Sake, Let's focus on AppSecFor Business's Sake, Let's focus on AppSec
For Business's Sake, Let's focus on AppSecLalit Kale
 
Pragmatic Pipeline Security
Pragmatic Pipeline SecurityPragmatic Pipeline Security
Pragmatic Pipeline SecurityJames Wickett
 
AWS live hack: Docker + Snyk Container on AWS
AWS live hack: Docker + Snyk Container on AWSAWS live hack: Docker + Snyk Container on AWS
AWS live hack: Docker + Snyk Container on AWSEric Smalling
 
Securing the Internet of Things - Hank Chavers
Securing the Internet of Things - Hank ChaversSecuring the Internet of Things - Hank Chavers
Securing the Internet of Things - Hank ChaversWithTheBest
 
An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)Ahmad Haghighi
 
ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...
ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...
ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...CloudVillage
 
IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)
IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)
IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)mike parks
 
Plataforma de Operação e Simulação Cibernética
Plataforma de Operação e Simulação CibernéticaPlataforma de Operação e Simulação Cibernética
Plataforma de Operação e Simulação CibernéticaHamilton Oliveira
 

Similar to Automated Cloud-Native Incident Response with Kubernetes and Service Mesh (20)

.conf Go Zurich 2022 - Security Session
.conf Go Zurich 2022 - Security Session.conf Go Zurich 2022 - Security Session
.conf Go Zurich 2022 - Security Session
 
Cncf checkov and bridgecrew
Cncf checkov and bridgecrewCncf checkov and bridgecrew
Cncf checkov and bridgecrew
 
Security and Advanced Automation in the Enterprise
Security and Advanced Automation in the EnterpriseSecurity and Advanced Automation in the Enterprise
Security and Advanced Automation in the Enterprise
 
Shift Left Security
Shift Left SecurityShift Left Security
Shift Left Security
 
AWS Cloud Governance & Security through Automation - Atlanta AWS Builders
AWS Cloud Governance & Security through Automation - Atlanta AWS BuildersAWS Cloud Governance & Security through Automation - Atlanta AWS Builders
AWS Cloud Governance & Security through Automation - Atlanta AWS Builders
 
Security operation center (SOC)
Security operation center (SOC)Security operation center (SOC)
Security operation center (SOC)
 
Avoiding the security brick
Avoiding the security brickAvoiding the security brick
Avoiding the security brick
 
Building world-class security response and secure development processes
Building world-class security response and secure development processesBuilding world-class security response and secure development processes
Building world-class security response and secure development processes
 
Using Splunk or ELK for Auditing AWS/GCP/Azure Security posture
Using Splunk or ELK for Auditing AWS/GCP/Azure Security postureUsing Splunk or ELK for Auditing AWS/GCP/Azure Security posture
Using Splunk or ELK for Auditing AWS/GCP/Azure Security posture
 
Using Splunk/ELK for auditing AWS/GCP/Azure security posture
Using Splunk/ELK for auditing AWS/GCP/Azure security postureUsing Splunk/ELK for auditing AWS/GCP/Azure security posture
Using Splunk/ELK for auditing AWS/GCP/Azure security posture
 
Securing Your Containers is Not Enough: How to Encrypt Container Data
Securing Your Containers is Not Enough: How to Encrypt Container DataSecuring Your Containers is Not Enough: How to Encrypt Container Data
Securing Your Containers is Not Enough: How to Encrypt Container Data
 
Logicalis Security Conference
Logicalis Security ConferenceLogicalis Security Conference
Logicalis Security Conference
 
For Business's Sake, Let's focus on AppSec
For Business's Sake, Let's focus on AppSecFor Business's Sake, Let's focus on AppSec
For Business's Sake, Let's focus on AppSec
 
Pragmatic Pipeline Security
Pragmatic Pipeline SecurityPragmatic Pipeline Security
Pragmatic Pipeline Security
 
AWS live hack: Docker + Snyk Container on AWS
AWS live hack: Docker + Snyk Container on AWSAWS live hack: Docker + Snyk Container on AWS
AWS live hack: Docker + Snyk Container on AWS
 
Securing the Internet of Things - Hank Chavers
Securing the Internet of Things - Hank ChaversSecuring the Internet of Things - Hank Chavers
Securing the Internet of Things - Hank Chavers
 
An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)
 
ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...
ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...
ATT&CKing the Sentinel – deploying a threat hunting capability on Azure Senti...
 
IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)
IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)
IoT Cyber+Physical+Social Engineering Attack Security (v0.1.6 / sep2020)
 
Plataforma de Operação e Simulação Cibernética
Plataforma de Operação e Simulação CibernéticaPlataforma de Operação e Simulação Cibernética
Plataforma de Operação e Simulação Cibernética
 

More from Matt Turner

The Life of a Packet through Istio III
The Life of a Packet through Istio IIIThe Life of a Packet through Istio III
The Life of a Packet through Istio IIIMatt Turner
 
apiserver-Only "Clusters" for fun and profit
apiserver-Only "Clusters" for fun and profitapiserver-Only "Clusters" for fun and profit
apiserver-Only "Clusters" for fun and profitMatt Turner
 
Istio + SPIRE for cross-domain traffic trust in hybrid-cloud scenarios
Istio + SPIRE for cross-domain traffic trust in hybrid-cloud scenariosIstio + SPIRE for cross-domain traffic trust in hybrid-cloud scenarios
Istio + SPIRE for cross-domain traffic trust in hybrid-cloud scenariosMatt Turner
 
Why Is Istio That Shape?
Why Is Istio That Shape?Why Is Istio That Shape?
Why Is Istio That Shape?Matt Turner
 
Dynamically Testing Individual Microservice Releases In Production
  Dynamically Testing Individual Microservice Releases In Production  Dynamically Testing Individual Microservice Releases In Production
Dynamically Testing Individual Microservice Releases In ProductionMatt Turner
 
Gateway APIs, Envoy Gateway, and API Gateways
Gateway APIs, Envoy Gateway, and API GatewaysGateway APIs, Envoy Gateway, and API Gateways
Gateway APIs, Envoy Gateway, and API GatewaysMatt Turner
 
The Life of a Packet III - Service Mesh London
The Life of a Packet III - Service Mesh LondonThe Life of a Packet III - Service Mesh London
The Life of a Packet III - Service Mesh LondonMatt Turner
 
Cloud-Native Progressive Delivery
Cloud-Native Progressive DeliveryCloud-Native Progressive Delivery
Cloud-Native Progressive DeliveryMatt Turner
 
An Introduction to Bazel
An Introduction to BazelAn Introduction to Bazel
An Introduction to BazelMatt Turner
 
Networks, Linux, Containers, Pods
Networks, Linux, Containers, PodsNetworks, Linux, Containers, Pods
Networks, Linux, Containers, PodsMatt Turner
 
What is a Service Mesh and what can it do for your Microservices
What is a Service Mesh and what can it do for your MicroservicesWhat is a Service Mesh and what can it do for your Microservices
What is a Service Mesh and what can it do for your MicroservicesMatt Turner
 
Debugging an RBAC Problem in Istio
Debugging an RBAC Problem in IstioDebugging an RBAC Problem in Istio
Debugging an RBAC Problem in IstioMatt Turner
 
Running Resillient Workloads with Istio - KubeCon China 2019
Running Resillient Workloads with Istio - KubeCon China 2019Running Resillient Workloads with Istio - KubeCon China 2019
Running Resillient Workloads with Istio - KubeCon China 2019Matt Turner
 
Software Networking and Interfaces on Linux
Software Networking and Interfaces on LinuxSoftware Networking and Interfaces on Linux
Software Networking and Interfaces on LinuxMatt Turner
 
Running Resillient Workloads with Istio - OpenInfra Days 2019
Running Resillient Workloads with Istio - OpenInfra Days 2019Running Resillient Workloads with Istio - OpenInfra Days 2019
Running Resillient Workloads with Istio - OpenInfra Days 2019Matt Turner
 
The Life of a Packet through Istio - DevExperience Romania, April 2019
The Life of a Packet through Istio - DevExperience Romania, April 2019The Life of a Packet through Istio - DevExperience Romania, April 2019
The Life of a Packet through Istio - DevExperience Romania, April 2019Matt Turner
 
The life of a packet through Istio - QCon London 2019
The life of a packet through Istio - QCon London 2019The life of a packet through Istio - QCon London 2019
The life of a packet through Istio - QCon London 2019Matt Turner
 
Do You Need a Service Mesh? @ London Devops, January 2019
Do You Need a Service Mesh? @ London Devops, January 2019Do You Need a Service Mesh? @ London Devops, January 2019
Do You Need a Service Mesh? @ London Devops, January 2019Matt Turner
 
Istio, The Packet's-Eye View - KubeCon NA 2018
Istio, The Packet's-Eye View - KubeCon NA 2018Istio, The Packet's-Eye View - KubeCon NA 2018
Istio, The Packet's-Eye View - KubeCon NA 2018Matt Turner
 
The life of a packet through Istio
The life of a packet through IstioThe life of a packet through Istio
The life of a packet through IstioMatt Turner
 

More from Matt Turner (20)

The Life of a Packet through Istio III
The Life of a Packet through Istio IIIThe Life of a Packet through Istio III
The Life of a Packet through Istio III
 
apiserver-Only "Clusters" for fun and profit
apiserver-Only "Clusters" for fun and profitapiserver-Only "Clusters" for fun and profit
apiserver-Only "Clusters" for fun and profit
 
Istio + SPIRE for cross-domain traffic trust in hybrid-cloud scenarios
Istio + SPIRE for cross-domain traffic trust in hybrid-cloud scenariosIstio + SPIRE for cross-domain traffic trust in hybrid-cloud scenarios
Istio + SPIRE for cross-domain traffic trust in hybrid-cloud scenarios
 
Why Is Istio That Shape?
Why Is Istio That Shape?Why Is Istio That Shape?
Why Is Istio That Shape?
 
Dynamically Testing Individual Microservice Releases In Production
  Dynamically Testing Individual Microservice Releases In Production  Dynamically Testing Individual Microservice Releases In Production
Dynamically Testing Individual Microservice Releases In Production
 
Gateway APIs, Envoy Gateway, and API Gateways
Gateway APIs, Envoy Gateway, and API GatewaysGateway APIs, Envoy Gateway, and API Gateways
Gateway APIs, Envoy Gateway, and API Gateways
 
The Life of a Packet III - Service Mesh London
The Life of a Packet III - Service Mesh LondonThe Life of a Packet III - Service Mesh London
The Life of a Packet III - Service Mesh London
 
Cloud-Native Progressive Delivery
Cloud-Native Progressive DeliveryCloud-Native Progressive Delivery
Cloud-Native Progressive Delivery
 
An Introduction to Bazel
An Introduction to BazelAn Introduction to Bazel
An Introduction to Bazel
 
Networks, Linux, Containers, Pods
Networks, Linux, Containers, PodsNetworks, Linux, Containers, Pods
Networks, Linux, Containers, Pods
 
What is a Service Mesh and what can it do for your Microservices
What is a Service Mesh and what can it do for your MicroservicesWhat is a Service Mesh and what can it do for your Microservices
What is a Service Mesh and what can it do for your Microservices
 
Debugging an RBAC Problem in Istio
Debugging an RBAC Problem in IstioDebugging an RBAC Problem in Istio
Debugging an RBAC Problem in Istio
 
Running Resillient Workloads with Istio - KubeCon China 2019
Running Resillient Workloads with Istio - KubeCon China 2019Running Resillient Workloads with Istio - KubeCon China 2019
Running Resillient Workloads with Istio - KubeCon China 2019
 
Software Networking and Interfaces on Linux
Software Networking and Interfaces on LinuxSoftware Networking and Interfaces on Linux
Software Networking and Interfaces on Linux
 
Running Resillient Workloads with Istio - OpenInfra Days 2019
Running Resillient Workloads with Istio - OpenInfra Days 2019Running Resillient Workloads with Istio - OpenInfra Days 2019
Running Resillient Workloads with Istio - OpenInfra Days 2019
 
The Life of a Packet through Istio - DevExperience Romania, April 2019
The Life of a Packet through Istio - DevExperience Romania, April 2019The Life of a Packet through Istio - DevExperience Romania, April 2019
The Life of a Packet through Istio - DevExperience Romania, April 2019
 
The life of a packet through Istio - QCon London 2019
The life of a packet through Istio - QCon London 2019The life of a packet through Istio - QCon London 2019
The life of a packet through Istio - QCon London 2019
 
Do You Need a Service Mesh? @ London Devops, January 2019
Do You Need a Service Mesh? @ London Devops, January 2019Do You Need a Service Mesh? @ London Devops, January 2019
Do You Need a Service Mesh? @ London Devops, January 2019
 
Istio, The Packet's-Eye View - KubeCon NA 2018
Istio, The Packet's-Eye View - KubeCon NA 2018Istio, The Packet's-Eye View - KubeCon NA 2018
Istio, The Packet's-Eye View - KubeCon NA 2018
 
The life of a packet through Istio
The life of a packet through IstioThe life of a packet through Istio
The life of a packet through Istio
 

Recently uploaded

Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 

Recently uploaded (20)

Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 

Automated Cloud-Native Incident Response with Kubernetes and Service Mesh

  • 1. Automated Cloud-Native Incident Response with Kubernetes and Service Mesh Matt Turner @mt165 Francesco Beltramini @d1gital_f
  • 2. @controlplaneio Who we are Engineer @ Tetrate ● Software Engineer by background ● Long-time Istio user ● Author of LinkedIn’s advanced K8s course Tetrate (Booth S98) The Enterprise Service Mesh Company ● Founded in 2018. 120 people across 14 timezones ● Helping folks manage Istio at scale ● Providing security and compliance in high-assurance environments ● Enabling cloud migrations and hybrid-cloud setups
  • 3. @controlplaneio Who we are ControlPlane (Hall 5 - Booth SU57) Cloud Native and Open Source security consultancy ● Established in 2017 52 people across the UK, Europe, APAC and North America ● Security specialists in cloud, Kubernetes, containers, and Open Source (we train too!) ● Focused on deeply “Threat Model-ed”, Secure-by-Design and Secure-by-Default Cloud Native architectures ● Accustomed to work in highly-regulated environments ● Help customers bridging the gap between infra and SecOps Security Engineering Manager @ ControlPlane ● Lots of studying ● Lots of IT/OT Security ● Lots of Security Ops
  • 4. @controlplaneio Agenda ● Security Incident Response 101 ● Intelligence-driven defence (Kill Chain) and SOAR ● Cloud Native tech and concepts through a incident response lens ● Cloud Native response walkthrough
  • 5. @controlplaneio Security Incident Response 101 Incident: “An event that could lead to the loss of, or disruption to, an organization's operations, data, services or functions”. “A security incident is an event that may indicate that an organization's systems or data have been compromised, or that measures put in place to protect them have failed.” Reponse: A set of People, Process, Technology to identify, contain, eliminate and recover from such events.
  • 6. @controlplaneio PEOPLE PROCESS TECHNOLOGY Security Incident Response 101 Security Analysts Security Engineers Forensics Managers Define runbooks Threat Intel dissemination Assets isolation Evidence gathering Stakeholders comms Sensors (IPS, EDR, ...) CN Sensors (Falco, CloudTrail, VPC Flowlogs...) Log collection and processing (SIEM) Automation tech
  • 7. @controlplaneio Matt’s Security Glossary ● IoC - Indicator of Compromise - anything that points to the attack; payload, pwnd workload syscall profile, etc, ... ● SOC - Security Operations Center - where the security response team sits ● Signal - anything security-related that we monitor ● SIEM - Security Information and Event Management - security alerts dashboard ● SOAR - Security Orchestration, Automation, and Response - workflow engine containing playbooks for scripted incident response
  • 8. @controlplaneio Security Incident Response 101 Existing frameworks SP 800-61 Rev. 2
  • 9. @controlplaneio NIST Incident Response Framework Glossary ● Detection - watch for signals in SEIM ● Analysis - look at alert, fetch IoCs, if real, ○ Identify attack payload ○ Produce IoC checksum ● Containment - prevent further attacks / limit blast radius ○ E.g. Contain the attack to where it is, by removing its potency - limit blast radius ○ E.g. deploy new firewall rules / feed Headers to WAF ● Eradication - clean-up anything which was compromised ● Recovery - restore normal service
  • 10. @controlplaneio Intelligence-driven Defense Reactive event-driven approach insufficient against motivated adversaries. Incident Response must adopt a Kill (attack) Chain perspective: ● Step-by-step approach that identifies and stops enemy activity. ● It no longer needs to be a purely reactive process. ● Implements intent-based response, behavior-based detection to get a step ahead of adversaries. ● Critical to have the right Intelligence [Indicators of Compromise (IoC)]. Security Incident Response 101
  • 13. @controlplaneio “If you know the enemy and know yourself, you need not fear the result of a hundred battles. [...]” Intelligence-driven Defense Security Incident Response 101 “Know your enemy!”
  • 14. @controlplaneio Intelligence-driven Defense SOAR Security Orchestration, Automation and Response Tech stack that enables an organization to collect data about security threats and respond to security events with little or no human assistance. ADOPT KILLCHAIN. Threat and vulnerability management Security incident response Security operations automation
  • 15. @controlplaneio Challenges: ● Complex! ● Reaction time is critical ● Technology interoperability ● Limited automation at times Recap: Security Incident Response
  • 16. @controlplaneio Container? What container? More challenges: ● Relatively new! ● Skills gap (fast-paced) ● Get observability right ● Deal with volatility / scaling ● Integration with teams’ practices ○ Infra provisioning ○ DevOps pipelines Forensics CN Security Incident Response
  • 17. @controlplaneio Pro: ● Advanced platform capabilities ● Automation ● GitOps ○ Audit trails ○ Reproducibility & Determinism ○ High privileged infra ops without high privs given to users CN Security Incident Response
  • 18. @controlplaneio 1) Rescheduling / Application recovery 2) Support for Custom Operators 3) GitOps workflows 4) RunTime Class / hardened runtimes 5) Ephemeral containers 6) Checkpointing (alpha) and Checkpoint/Restore In Userspace (CRIU) CN Security Incident Response Kubernetes benefits
  • 19. @controlplaneio 1) Layer 7 networking - protocols parsed and understood 2) All traffic intercepted by the sidecar proxy - policy enforced at every hop 3) Full metadata and body logging 4) Fine-grained traffic control, e.g. based on L7 attributes of src / dst CN Security Incident Response Istio benefits
  • 20. @controlplaneio Cloud Native Incident Response Framework Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication and [Recovery] Step #4: Post-Incident Activity
  • 21. @controlplaneio Coming soon: Kubernetes 4 SOC threat library ● Crafted by CP-friend Abdullah Garcia @ JPMC (@abdullahgarcia) ● Fused with content from ControlPlane’s internal threat libraries #1 Preparation Security Observability Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 22. @controlplaneio ● Envoy sends traffic logs to SIEM ● Detect anomaly via SIEM and confirmed unknown and suspicious ● Events could be generated by Cloud Native but also traditional sensors like firewalls or EDR agents ● Escalate to SOAR and activate Cloud Native runbook #2 Detection, Constraint, and Analysis Lookup: ldap://evil.com/evilpayload IR Activation Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 23. @controlplaneio #2 Detection, Constraint, and Analysis Preventative containment: buying time Suspect Pod ● Freeze orchestration ○ Won’t get scaled down, updated, etc ● Block east-west network traffic ○ Attack can’t move laterally Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 24. @controlplaneio Suspect Pod #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Freeze orchestration => Remove from its Deployment kubectl patch pod ${pod} --type json --patch '[{ "op": "replace", "path": "/metadata/labels/app", "value": "'${app}-isolated'" }]' Cloud-native win: not disruptive
  • 25. @controlplaneio Suspect Pod #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Block east-west network traffic => Istio AuthorizationPolicy
  • 26. @controlplaneio Suspect Pod #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Block east-west network traffic => Istio AuthorizationPolicy
  • 27. @controlplaneio #2 Detection, Constraint, and Analysis Preventative containment: buying time Remaining workloads ● Respawn in hardened container runtime Cloud-native win: none of this is disruptive to overall service Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 28. @controlplaneio #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Remaining Workloads Respawn in hardened container runtime => gVisor kubectl patch deployment ${dep} --type json --patch '[{ "op": "replace", "path": "/spec/template/spec/runtimeClassName", "value": "'gvisor'" }]' Cloud-native win: not disruptive (if tested properly)
  • 29. @controlplaneio Investigation stations Suspect Pod ● Verbosely log north-south traffic ○ Lets the attack continue ○ Gather IoC - to detect the attack in future ○ Possibly gather C2 addresses, exfil data, etc ● Container checkpoint ● Forensic tools #2 Detection, Constraint, and Analysis Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 30. @controlplaneio Suspect Pod #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Verbosely log network traffic => Envoy WASM filter
  • 31. @controlplaneio Suspect Pod #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Verbosely log network traffic => Envoy WASM filter
  • 32. @controlplaneio Container Checkpointing ● Uses CRIU - Linux “Checkpoint/Restore In Userspace” ● KEP-2008 ● Supported by CRI-O ● Under discussion by containerd (PR 6965) ● Very alpha ● Imperative https://criu.org https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/ https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/ https://github.com/containerd/containerd/pull/6965 https://github.com/kubernetes/enhancements/issues/2008 #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Suspect Pod
  • 33. @controlplaneio #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Suspect Pod NET PROC Forensic tools => ephemeral debug container kubectl debug -ti ${pod} --image=busybox --target=http-log
  • 34. @controlplaneio Investigation stations Remaining workloads ● Verbose logging of network traffic ○ Same implementation again - might see IoC / attack IPs here #2 Detection, Constraint, and Analysis Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 35. @controlplaneio #2 Detection, Constraint, and Analysis - Cloud-Native Implementation Remaining Workloads Verbosely log network traffic => Envoy WASM filter Same implementation as above
  • 36. @controlplaneio ● Confirm true positive ● Keep harvesting IOCs (headers, body) #2 Detection, Constraint, and Analysis Confirm Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 37. @controlplaneio ● Re-configure firewall / WAF (PEPs - Policy Enforcement Points) ● Firewall is the Envoy sidecar ○ Now Layer 7 ● WAF is Envoy plugin Coraza (https://github.com/corazawaf/coraza) ○ mod_security rules compatible ○ Needed for blocking bodies ● Present at every workload / hop ● Configure WAF to block identified payloads and malicious requests ● Maybe also block attacking / C2 IPs at the perimeter Containment / Response strategy: KILL CHAIN #3 Containment, Eradication, and Recovery Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 38. @controlplaneio ● Delete the definitely-compromised Pod(s) (now orphaned from Deployment) ● Now the attack is blocked and can’t reoccur… ● Restart the remaining workloads just in case Clean up #3 Containment, Eradication, and Recovery Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 39. @controlplaneio Delete Pod => Imperative step kubectl delete pod ${pod} Compromised Pod #3 Containment, Eradication, and Recovery - Cloud-Native Implementation
  • 40. @controlplaneio ● Nothing altered service levels Cloud-native win: probably not needed Restore normal service #3 Containment, Eradication, and [Recovery] Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint, Analysis Step #3: Containment, Eradication, [Recovery] Step #4: Post-Incident Activity
  • 41. @controlplaneio Restart all Pods => ● Restore their runtimeClass, serviceAccount, etc ● Side-effect of doing a rolling update Remaining Workloads #3 Containment, Eradication, and Recovery - Cloud-Native Implementation
  • 42. @controlplaneio ● Response actions assumed to be by human or SOAR runbooks ● SOAR isn’t ideal ○ Probably won’t have first-class support for k8s API ○ Shouldn’t have cluster-admin access Automation Sharpening the tools
  • 43. @controlplaneio “Response” - https://github.com/mt-inside/response ● Generator of YAMLs, issuer of Commands ● Would be nice if more things were declarative ○ DebugContainer resource ○ ContainerCheckpoint resource, like VolumeSnapshot* Automation Sharpening the tools
  • 45. @controlplaneio “Response” - https://github.com/mt-inside/response ● Also an Operator ● GitOps - SOAR commits to git ○ Fits with existing CD pipeline ○ Audit log of responses from git history ○ SOAR doesn’t need any cluster access, let alone admin ● Can retry the imperative commands Automation Sharpening the tools CRD REPO SOAR OP
  • 46. @controlplaneio Cloud-Native Incident Response Framework Cloud Native Incident Response Steps Step #1: Preparation Step #2: Detection, Constraint and Analysis Step #3: Containment, Eradication and [Recovery] Step #4: Post-Incident Activity Initial considerations: ● Fast! ● Can indeed reduce initial threats impact ● Requires careful considerations, very workload-dependent ● Probably not for all Organizations