Andy has made mistakes. He's seen even more. And in this talk he details the best and the worst of the container and Kubernetes security problems he's experienced, exploited, and remediated.
This talk details low-level exploitable issues with container and Kubernetes deployments. We focus on lessons learned, and show attendees how to ensure that they do not fall victim to avoidable attacks.
See how to bypass security controls and exploit insecure defaults in this technical appraisal of the container and cluster security landscape.
6. Everything is Software
● Applications, infrastructure,
security, and policy are all defined
as code
● Everything is built as a type of
software
● Similar controls can be applied to
entire classes of software,
containers, and systems (static
analysis, composition scanning,
etc.)
7. DoD Enterprise DevSecOps Reference Design
Securing the entire application and
infrastructure lifecycle
● Collaboration
● Automation
● Containerisation
● Testing
SecOps
8. Third Party Code Risk and Supply Chain Security
● Build stages and their artefacts
can be cryptographically signed to
provide a chain of trust
● Software dependencies pulled into
a secure build environment should
be scanned for CVEs and their
signatures verified where available
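A minimal sketch of the verification step, assuming a digest pinned from a trusted source. Digest pinning is a simpler stand-in for full signature verification, and the artefact bytes and the `verify_artifact` helper are hypothetical names used only for illustration.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Compare the artefact's digest to a pinned value in constant time."""
    return hmac.compare_digest(sha256_hex(data), expected_sha256.lower())


# Hypothetical artefact and pinned digest, for illustration only.
# In a real build the pinned value comes from a lockfile or signed metadata,
# never from the artefact that is being checked.
artifact = b"example dependency contents"
pinned = sha256_hex(artifact)

if not verify_artifact(artifact, pinned):
    raise SystemExit("digest mismatch: refusing to use artefact")
```

A build step would call `verify_artifact` on every downloaded dependency and stop the build on the first mismatch.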
9. Defending
● Prevent network egress
● Isolate from the host's kernel
● Execute RUN commands as a
non-root user in container
filesystem
● Run build process as a non-root
user
○ or in a user namespace
● Share nothing non-essential
11. kubesim.io - K8S Hacking
and Hardening Simulator
● infrastructure deployment
● cluster provisioning and
workload configuration
● scenario runner with
challenges, hints, and scoring
● raw command line
experience
● open source core at
https://github.com/kubernetes-simulator/simulator
12. What are we doing today?
● Burning some things
○ And trying to extinguish the flames
● For each attack
○ Intro
○ Demo
○ Remediate
○ DevSecOps-ify
13. Docker and Kube
● Kube doesn't love us
● Y so difficult?
● What's the problem?
24. kubesec.io - example insecure pod
[
  {
    "object": "Pod/kubesec-demo.default",
    "valid": true,
    "message": "Passed with a score of 1 points",
    "score": 1,
    "scoring": {
      "advise": [
        {
          "selector": "containers[] .securityContext .capabilities .drop",
          "reason": "Reducing kernel capabilities available to a container limits its attack surface"
        },
        {
          "selector": ".spec .serviceAccountName",
          "reason": "Service accounts restrict Kubernetes API access and should be configured with least privilege"
        },
        {
          "selector": "containers[] .resources .requests .cpu",
          "reason": "Enforcing CPU requests aids a fair balancing of resources across the cluster"
        },
        ...
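A report like the one above can gate a pipeline. The sketch below assumes the report shape shown on this slide (`valid`, `score`, `object`); the `gate` function and the minimum score are hypothetical choices, not part of kubesec.

```python
import json
from typing import List

# Trimmed copy of the report from the slide, kept to a single advisory.
REPORT = """
[
  {
    "object": "Pod/kubesec-demo.default",
    "valid": true,
    "message": "Passed with a score of 1 points",
    "score": 1,
    "scoring": {
      "advise": [
        {
          "selector": "containers[] .securityContext .capabilities .drop",
          "reason": "Reducing kernel capabilities available to a container limits its attack surface"
        }
      ]
    }
  }
]
"""


def gate(report_json: str, min_score: int) -> List[str]:
    """Return one failure message per object that is invalid or scores too low."""
    failures = []
    for item in json.loads(report_json):
        name = item.get("object", "<unknown>")
        if not item.get("valid", False):
            failures.append(f"{name}: manifest is not valid")
        elif item.get("score", 0) < min_score:
            failures.append(f"{name}: score {item.get('score', 0)} is below {min_score}")
    return failures


# A hypothetical threshold of 5 rejects the demo pod, which scored 1.
for failure in gate(REPORT, min_score=5):
    print(failure)
```

The advisory entries under `scoring.advise` are the natural next thing to print, so a developer sees what to fix and not only that the gate failed.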
25. Pod YAML - Isolation-Breaking Configs
● SecurityContexts: use for pods and containers
● Dangerous pod configurations;
○ Running as root (no user namespace support in Kubernetes)
○ Privileged (can perform root operations on the host)
○ No seccomp/AppArmor profile (unlike Docker, no default profile in Kubernetes)
○ Full RBAC (administrative access to cluster)
○ Mounting host volumes (resource contention, side-channel communication)
○ Sharing host pid, network, or IPC namespaces (can lead to escalation)
○ AllowPrivilegeEscalation (permits escalating to root inside the container)
○ Excess capabilities (violates least privilege)
○ No cgroups (unbounded resource consumption)
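Several of these settings can be spotted mechanically. The sketch below walks a pod manifest that has already been loaded into a dictionary and flags a subset of the list above. The field names are the standard pod spec fields; the `risky_settings` function and its messages are hypothetical, and the check is far from exhaustive (RBAC, seccomp, and AppArmor are not covered).

```python
from typing import Any, Dict, List


def risky_settings(pod: Dict[str, Any]) -> List[str]:
    """Return human-readable findings for isolation-breaking pod settings."""
    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext", {})
    findings = []

    # Sharing host namespaces
    for flag in ("hostPID", "hostNetwork", "hostIPC"):
        if spec.get(flag):
            findings.append(f"spec.{flag} is true")

    # Mounting host volumes
    for volume in spec.get("volumes", []):
        if "hostPath" in volume:
            findings.append(f"volume {volume.get('name')} mounts a hostPath")

    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext", {})
        if sc.get("privileged"):
            findings.append(f"{name}: privileged")
        if sc.get("allowPrivilegeEscalation", True) is not False:
            findings.append(f"{name}: allowPrivilegeEscalation not set to false")
        # Container-level runAsNonRoot overrides the pod-level value
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)):
            findings.append(f"{name}: may run as root")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            findings.append(f"{name}: does not drop ALL capabilities")
        if not container.get("resources", {}).get("limits"):
            findings.append(f"{name}: no resource limits")
    return findings
```

An empty list means none of the checked settings were found, which is not the same as the pod being safe.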
29. What is this?
● DirtyCOW. A copy-on-write
vulnerability in the kernel from
2016
● Allows a malicious user to gain
root on the host from inside a
container
● Was being exploited in the wild
30. "One of the sites I manage was compromised, and an exploit of this
issue was uploaded and executed. A few years ago I started
packet capturing all inbound HTTP traffic and was able to
extract the exploit and test it out in a sandbox"
http://www.v3.co.uk/v3-uk/news/2474845/linux-users-urged-to-protect-against-dirty-cow-security-flaw
Dirty COW (2016)
31. Why is it bad?
● It pretty much hoses down your system and flushes the corpse
● Linux kernel since 2.6.22 (July 2007)
○ Fixed in 4.8.3, 4.7.9, 4.4.26 or newer (Oct 2016)
● All Docker versions
○ it’s a kernel bug: container syscalls hit host
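The version ranges above translate into a simple check. This is a sketch that only knows the upstream versions listed on this slide; distribution kernels often backport fixes without changing the upstream version number, so any other branch is reported as `unknown` and needs the vendor advisory. The `dirty_cow_status` function is a hypothetical helper.

```python
import re
from typing import Tuple

# Upstream fix versions from the slide: branch -> first fixed patch level
FIXED = {(4, 4): 26, (4, 7): 9, (4, 8): 3}
FIRST_AFFECTED = (2, 6, 22)


def parse_kernel(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a release string such as '4.4.0-21-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2)), int(match.group(3) or 0)


def dirty_cow_status(release: str) -> str:
    """Classify a kernel release against the upstream Dirty COW fix versions."""
    major, minor, patch = parse_kernel(release)
    if (major, minor, patch) < FIRST_AFFECTED:
        return "not affected"
    if (major, minor) in FIXED:
        return "fixed" if patch >= FIXED[(major, minor)] else "vulnerable"
    if (major, minor) > (4, 8):
        return "fixed"
    return "unknown"  # other branches: consult the distribution's advisory
```

On a node, `dirty_cow_status(platform.release())` (after `import platform`) classifies the running kernel.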
34. Security Contexts
● DAC (discretionary access control)
○ File system permissions, the basis of linux security
● Capabilities
○ Subdivides the full set of root capabilities into smaller buckets
○ Not perfect, but can limit the shape of the privilege that the user has
○ Beware CAP_SYS_ADMIN - the capability bucket
● Sandboxing
○ seccomp-bpf (user-supplied code running in kernel)
○ Filtering syscalls to reduce attack surface. Disallowed system calls fail with an error or get the process SIGKILLed, depending on the profile's action
○ Seccomp is enabled by default in Docker, but NOT IN KUBERNETES
○ It should be mandated via a PodSecurityPolicy
● Mandatory Access Control
○ SELinux (RHEL/CentOS only)
○ AppArmor (Ubuntu, Debian and derivatives) - defaults on in Docker, off in Kubernetes
35. Docker’s Default seccomp Profile
● Docker uses a JSON DSL for seccomp profiles that compile down to classic BPF filters (i.e.
seccomp-bpf, not eBPF), and are run in the kernel
● Only whitelisted system calls are permitted
● Docker’s default seccomp whitelist blocks some dangerous system calls:
○ add_key, keyctl, request_key: Prevent containers from using the kernel
keyring, which is not namespaced
○ clone, unshare: Deny cloning new namespaces. Also gated by
CAP_SYS_ADMIN for CLONE_* flags, except CLONE_NEWUSER.
● “I specifically wanted to block cloning new user namespaces inside containers because
they are notorious for being points of entry for kernel bugs”
● Seccomp security profiles for Docker
https://blog.jessfraz.com/post/a-rant-on-usable-security/
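To check what a given profile does with the syscalls named above, the sketch below reads a profile in Docker's JSON layout (`defaultAction` plus `syscalls` rules with `names` and `action`). The tiny profile is a hypothetical example, not Docker's default, and the lookup ignores argument filters and capability-conditional rules that real profiles may contain.

```python
import json
from typing import Any, Dict, List

# Hypothetical allowlist profile in Docker's JSON layout
PROFILE_JSON = """
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {"names": ["read", "write", "exit_group"], "action": "SCMP_ACT_ALLOW"}
  ]
}
"""

DANGEROUS = ["add_key", "keyctl", "request_key", "unshare"]


def action_for(profile: Dict[str, Any], syscall: str) -> str:
    """Return the action of the first rule naming the syscall, else the default."""
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            return rule["action"]
    return profile["defaultAction"]


def allowed(profile: Dict[str, Any], syscalls: List[str]) -> List[str]:
    """Return the subset of syscalls that the profile allows."""
    return [s for s in syscalls if action_for(profile, s) == "SCMP_ACT_ALLOW"]


profile = json.loads(PROFILE_JSON)
print(allowed(profile, DANGEROUS))
```

An empty result means that none of the listed syscalls is explicitly allowed and all fall through to the profile's default action.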
36. Writing Effective Seccomp Profiles
● Dynamic
○ "Observational" or "learning" security systems will watch your application's behaviour at
runtime in a pre-prod environment and generate a profile
○ This has limited value, as it requires a pre-prod system to demonstrate the complete set of
behaviours it exhibits in production - this is a sadly unrealistic goal
○ Some distance can be made with this approach and a comprehensive test suite, but logging,
crashing, and stress-related behaviours are unlikely to be comprehensively covered, resulting
in potential production downtime
○ This approach can be used to inform the final policy
● Static
○ Laborious
○ Requires in-depth knowledge of the linux syscall interface
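The dynamic approach can be sketched as: record the syscalls seen during a test run, then emit an allowlist from them. The example assumes strace-style trace lines as input; the sample trace and the `build_profile` helper are hypothetical, and the resulting profile only covers behaviour that the run actually exercised, which is the limitation described above.

```python
import json
import re
from typing import Any, Dict, List

# Matches 'openat(...' and '[pid  1234] read(...' at the start of a trace line
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?([a-z_][a-z0-9_]*)\(")

# Hypothetical strace-style output captured from a pre-prod run
TRACE = """\
execve("/bin/true", ["true"], 0x7ffd0 /* 20 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[pid  1234] read(3, "data", 832) = 832
--- SIGCHLD {si_signo=SIGCHLD} ---
exit_group(0) = ?
+++ exited with 0 +++
"""


def observed_syscalls(trace: str) -> List[str]:
    """Return the sorted, de-duplicated syscall names found in the trace."""
    names = set()
    for line in trace.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match:
            names.add(match.group(1))
    return sorted(names)


def build_profile(names: List[str]) -> Dict[str, Any]:
    """Build a deny-by-default profile that allows only the observed syscalls."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [{"names": names, "action": "SCMP_ACT_ALLOW"}],
    }


print(json.dumps(build_profile(observed_syscalls(TRACE)), indent=2))
```

A generated profile like this is best treated as a first draft that informs the hand-written policy.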
42. lesson: patch your hosts
● the kernel is the basis of container
security
● containers don't really exist!
● as our reputable track hosts from
Canonical's LXD team put it --
containers are a userspace fiction
43. containers are a userspace fiction
● there is no kernel representation of a container, it is an emergent property of a
collection of stimuli and restrictions born from unintelligent design and years
of evolution
● a lot like consciousness
45. Security Test Suite!
● Testing is a Dark Art
● Anything can be a security test
● Arrange (set env)
● Act (perform test and capture result)
● Assert (does this match expectation?)
● Prove test fails as expected
● Beware acceptance testing and push that test as low as it can go
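A small sketch of the Arrange / Act / Assert shape, using a file-permission check as the security property. The `is_world_writable` helper and the test names are hypothetical. The second test exists only to prove that the check fails when it should.

```python
import os
import stat
import tempfile
import unittest


def is_world_writable(path: str) -> bool:
    """Return True if the 'other' write bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


class WorldWritableTest(unittest.TestCase):
    def setUp(self):
        # Arrange: create a scratch file that each test controls
        fd, self.path = tempfile.mkstemp()
        os.close(fd)

    def tearDown(self):
        os.remove(self.path)

    def test_locked_down_file_passes(self):
        os.chmod(self.path, 0o600)              # Arrange
        result = is_world_writable(self.path)   # Act
        self.assertFalse(result)                # Assert

    def test_check_detects_insecure_file(self):
        # Prove the check fails as expected on a known-bad input
        os.chmod(self.path, 0o666)
        self.assertTrue(is_world_writable(self.path))
```

Saved as a module, this runs with `python -m unittest`. It is a unit-level test, which is as low as this particular check can be pushed.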
47. testing: out of date versions
● well, this test is simple. What kernel version am I running? And what docker
version?
● fancy security tooling will do this for you, OR you can use some basic tests for
your node
48. goss
● goss is "go serverspec"
● it's not the ultimate test tool, but it's pretty good
50. TESTING IS COOL
● It’s how we “prove” we’re secure
● Against known quantities
● Test enough to instil confidence
● “Goldilocks Test Suites” -- not too many, not too few
55. binaryedge
● this platform has already
portscanned the IPv4 address
space for us
● indexed all the content
● and preemptively attacked it
● THIS MAY NOT BE LEGAL IN
YOUR JURISDICTION
● you may not care
● but sadly, on stage and on
camera, I do
60. The Banner Banhammer
● let's test the API server and see if it's leaking
○ Use my nice nmap script https://gist.github.com/sublimino/c357379369808d0f77d3e2fe86fd4611
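For readers without nmap to hand, the same question can be asked with a plain HTTPS request. This sketch is not the script linked above. It assumes the API server's `/version` endpoint, which returns JSON containing a `gitVersion` field, and the function names are hypothetical. Only probe servers you are authorised to test.

```python
import json
import ssl
import urllib.error
import urllib.request
from typing import Optional, Tuple


def fetch_version(base_url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Request /version without credentials; return (status code, body)."""
    context = ssl.create_default_context()
    # Certificate checks are disabled because only the banner is of interest here
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(
            f"{base_url}/version", timeout=timeout, context=context
        ) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


def leaked_version(status: int, body: str) -> Optional[str]:
    """Return the disclosed gitVersion, or None if nothing was leaked."""
    if status != 200:
        return None
    try:
        data = json.loads(body)
    except ValueError:
        return None
    if not isinstance(data, dict):
        return None
    return data.get("gitVersion")
```

Usage would look like `leaked_version(*fetch_version("https://HOST:6443"))`, where the host and port are placeholders for your own cluster's endpoint.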
62. Watch It Burn
● unauthenticated client
● malicious websockets upgrade
● potential escalation of privilege
63. Here cometh the Lesson
● Tin foil hats are cool
○ DEFCON says so
○ Defence in depth isn’t going anywhere
● Don't run a Kubernetes API server
endpoint on the public internet!!!1one
68. # CVE-2019-11253
# https://github.com/kubernetes/kubernetes/issues/83253
# Shout out: @raesene for poc collab, @iancoldwater + @mauilion for
# HONKing inspiration and other guidance.
# Description: In Kubernetes 1.13 and below, the default configuration
# is that system:anonymous can request a selfsubjectaccessreview
# via mechanisms such as "kubectl auth can-i". This request can
# include POSTed YAML, and just the act of trying to parse it causes
# excessive memory usage by the API server. Anywhere from about 10
# to 100 concurrent requests of this nature can overwhelm the API
# server's resources and cause it to become unresponsive to the point
# that the worker nodes and user's running kubectl will believe the
# control plane is offline. Since requests can last up to 60s by
# default before the timeout kicks in, sustaining the attack only
# requires between 10 and ~100 requests per minute.
69. # CVE-2019-11253
# Recommendation: Update Kubernetes to a release that includes YAML
# parsing resource limits and limit direct, public access to API
# servers. See the above GH issue for details.
# https://github.com/kubernetes/kubernetes/issues/83253
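The closing arithmetic in the description follows from how long each request is held open. As a worked sketch using only the figures quoted above: to keep N requests in flight when each one lasts up to the timeout, an attacker needs N new requests per timeout period. The function name is hypothetical.

```python
def sustaining_rate_per_minute(concurrent_requests: int, timeout_seconds: float) -> float:
    """Requests per minute needed to keep the given number of requests in flight."""
    return concurrent_requests * 60.0 / timeout_seconds


# Figures from the description: 10 to ~100 concurrent requests, 60s default timeout
low = sustaining_rate_per_minute(10, 60)
high = sustaining_rate_per_minute(100, 60)
print(f"{low:.0f} to {high:.0f} requests per minute")
```

The rate is low enough to pass unnoticed by coarse rate limiting, which is one reason the recommendation pairs patching with restricting public access to the API server.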
72. Network Policy Testing: nmap with Netassert.io
k8s: # used for Kubernetes pods
  deployment: # only deployments currently supported
    test-frontend: # pod name, defaults to `default` namespace
      test-microservice: 80 # `test-microservice` is the DNS name of the target service
      test-database: -80 # test-frontend should not be able to access test-database port 80
    new-namespace:test-microservice: # `new-namespace` is the namespace name
      test-database.new-namespace: 80 # longer DNS names can be used for other namespaces
      test-frontend.default: 80
    default:test-database:
      test-frontend.default.svc.cluster.local: 80 # full DNS names can be used
      test-microservice.default.svc.cluster.local: -80
https://github.com/controlplaneio/netassert#configuration
Network security testing with highly parallelised nmap
https://github.com/controlplaneio/netassert
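The port notation in the config above reads as: a bare port must be reachable, a negated port must not be. The sketch below mirrors that convention with a plain TCP connect. It is not how netassert itself works (netassert drives nmap), and `parse_ports`, `is_open`, and `meets_expectation` are hypothetical helpers.

```python
import socket
from typing import List, Tuple, Union


def parse_ports(spec: Union[int, str]) -> List[Tuple[int, bool]]:
    """Parse 80, -80 or '-80, -443' into (port, should_be_open) pairs."""
    expectations = []
    for part in str(spec).split(","):
        value = int(part.strip())
        expectations.append((abs(value), value > 0))
    return expectations


def is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def meets_expectation(host: str, spec: Union[int, str]) -> bool:
    """Return True if every port on host matches its expected reachability."""
    return all(is_open(host, port) == expected for port, expected in parse_ports(spec))
```

To be meaningful, a check like this has to run from inside the source pod, since network policy is evaluated per source workload.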
73. bats-core
● Test library for Bash
● TEST EVERYTHING
EVERYWHERE
● https://github.com/bats-core/bats-core/
82. De-auditing K8S
● Reconfigure API server
● Blackhole traffic to remote
logging endpoint
● DOS remote logging endpoint
https://monzo.com/blog/we-built-network-isolation-for-1-500-services
87. How to Train your Red Team (for Cloud Native)
● Hosted: kubesim.io
● Open source: https://github.com/kubernetes-simulator/simulator
● Attack Trees: https://github.com/cncf/financial-user-group/tree/master/projects/k8s-threat-model
● Training: https://control-plane.io/
89. Recap: Applications in Kubernetes
● Just applications, linux processes, memory, and filesystems
● More granular security profile
○ Easier to harden
○ Security boundary is the container process or pod, not the whole instance
○ Controlled networking environment
90. Layers of Security Testing
● Infrastructure: server hardening/conformance
● Supply chain: image validation, Kubernetes deployment YAML validation
● Runtime: application behaviour, telemetry, session handling, and networking
91. What's the Problem with Security Testing?
● What's the Problem with Security Testing?
○ False Positives
● Why?
○ Defensive security measures and nondeterministic test environment
● Mitigation
○ Improve signal-to-noise ratio with targeted testing and retry budgets
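One way to read "retry budget": each test gets its first attempt for free, and retries draw from a shared, finite pool, so a handful of flaky checks can be absorbed while a systemic failure still surfaces quickly. The `RetryBudget` class below is a hypothetical sketch of that idea.

```python
from typing import Callable


class RetryBudget:
    """Shared pool of retries for a whole test run."""

    def __init__(self, total_retries: int) -> None:
        self.remaining = total_retries

    def run(self, test: Callable[[], bool], max_attempts: int = 3) -> bool:
        """Run test until it passes, attempts run out, or the budget is spent."""
        for attempt in range(max_attempts):
            if test():
                return True
            if attempt + 1 < max_attempts:
                if self.remaining == 0:
                    return False  # budget exhausted: report the failure now
                self.remaining -= 1
        return False


# Hypothetical flaky check: fails on the first call, passes afterwards
calls = {"count": 0}


def flaky_check() -> bool:
    calls["count"] += 1
    return calls["count"] >= 2


budget = RetryBudget(total_retries=1)
print(budget.run(flaky_check), budget.remaining)
```

Tracking how fast the budget drains gives a rough measure of how nondeterministic the test environment is.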
92. Getting the House in Order
● Ensure that applications have a local testing story for developers
○ Build server verifies tests that developers are able to run locally
○ This proves out mocking and stubbing harnesses
● Once developer local testing is in place, testing can be as simple as standing
up docker-compose with mocks and mock data
○ https://github.com/kubernetes/kompose can help migration
● Continuous Security (...ish)
○ A lot of tooling isn’t container-native
○ But continuous scanning leaves deeper exploration for humans
94. SEC584: Defending Cloud Native
Infrastructure
BETA Pricing: 50% Discount ($2100 USD)
Live Online | 3-day beta October 12 - 14
with Andrew Martin & Eric Johnson
https://www.sans.org/event/sec584-beta-one-2020