In a world where applications are now containerized and distributed across homogeneous hosts with technologies like Kubernetes, the traditional hardware firewall is no longer able to enforce the access restrictions needed to prevent intrusion and attack. That doesn't mean network security is dead, though - far from it: it just requires a different approach. This talk will cover how we can secure modern microservices applications, in particular looking at Project Calico in Kubernetes. We'll take a look at the Kubernetes NetworkPolicy API, go through how it addresses this problem, and then dig into how it's implemented by Project Calico. There'll be a demonstration of how to set up Calico on CoreOS Container Linux and add network policies for a simple microservices application, and finally we'll wrap up by looking at the performance impact and perhaps some future extensions.
(From the April 2017 Cloud Native Computing Berlin meetup. Video may become available at https://www.meetup.com/Cloud-Native-Computing-Berlin/events/238925663/.)
role: frontend
role: user-auth
role: main-logic
role: database
“allow from web to TCP 80”
“allow from role: frontend”
“allow from role: user-auth”
“allow from role: user-auth”
“allow from role: main-logic”
“allow from role: frontend”
Hi everyone
Thank you all for coming, and also thanks to our hosts Kinvolk. Please yell at me if I’m not speaking loudly enough!
I’m a software engineer at Tigera. I’ve always been in networking, and for the past few years I’ve been a developer on Project Calico. For anyone who hasn’t encountered it yet, it’s an open source project focused on providing simple, scalable and secure communications for cloud native workloads.
Today I’m going to speak a bit about network policy for Kubernetes. I’ll start with a quick spin through why you should care, and then get into a demo and some details of how it all works. I’ll give some time for questions at the end.
So, what’s the problem?
Once upon a time, applications were simple. (Well, some of them…?)
In previous generations of web application architectures, the network was the thing that connected users to your application.
Now the application you want to build doesn’t run on a single machine. Computers have gotten faster, but our ambitions have outstripped Moore’s Law.
Now the network is part of the composition of the application itself.
If any of these component applications is compromised, then all data stored could be vulnerable, potentially very broadly - across the cluster, the organisation, and beyond. My colleague, Mike Stowe, spoke about that here a few weeks ago at the CoreOS meetup.
But how can we do that? We’ve got physical firewalls, so we can just put them between our applications.
(This isn’t intended to illustrate a sensible network architecture, just the effect.)
But we want to use Kubernetes.
For the sake of my curiosity, how many people here are already using Kubernetes? Show of hands? In production? Plans to?
Cool, so
We’ve got multiple nodes in our cluster running pods. So can we use physical firewalls similarly?
But really a k8s cluster doesn’t look like that.
They’re slow.
And then they’ll give up anyway because there are too many rules: one set per pod!
Now. There is one thing we could do - categorize the nodes by what application they run. I mention this because it is possible, not because I endorse it.
There are some things we could do:
Affinity
Multiple clusters
Taints and tolerations to force pods onto certain nodes
Doing this in some places might be a pragmatic solution for compliance, say, but it’s awful.
It’s slow and inflexible - you throw out many of the benefits of Kubernetes, or indeed any cloud native orchestrator. The orchestrator should be doing the scheduling, not a network security engineer. Speed, flexibility, utilization.
It’s still in beta, strictly, but it’s been in for about a year and mostly seems pretty stable now. The one thing to watch out for is named ports, which as far as I know nobody has implemented and which might be removed.
Key points.
What does this apply to? What does that mean for traffic?
NetworkPolicies always define allowed traffic. That’s why we need to disable all traffic to use them, by annotating the namespaces.
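As a sketch of that annotation (the namespace name here is invented; this assumes the beta isolation annotation as it stood at the time of the talk):

```yaml
# Hypothetical namespace for the demo app. The annotation switches the
# namespace to default-deny for ingress: from then on, only traffic
# explicitly allowed by a NetworkPolicy can reach pods in "myapp".
kind: Namespace
apiVersion: v1
metadata:
  name: myapp
  annotations:
    net.beta.kubernetes.io/network-policy: |
      {"ingress": {"isolation": "DefaultDeny"}}
```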
Ingress only
podSelector(s) for “from”
Here’s a slightly different one.
This one uses a namespaceSelector instead.
Also an example of specifying a port, which in practice you’d usually want to do.
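A sketch of such a policy (all names and labels here are invented for illustration, and this assumes the beta extensions/v1beta1 API of the time):

```yaml
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: db-allow-monitoring
spec:
  # The pods this policy applies to.
  podSelector:
    matchLabels:
      role: database
  ingress:
  # Allow traffic from any pod in a namespace labelled
  # purpose=monitoring, and only to this one TCP port.
  - from:
    - namespaceSelector:
        matchLabels:
          purpose: monitoring
    ports:
    - protocol: TCP
      port: 9090
```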
Kubernetes uses labels to select which pods in your application to replicate, and which pods should belong to a given micro-service. Your application’s pods are already labeled according to what their role is in your application, and this lines up really well with how the network should secure your application.
This diagram shows four micro-services, made up of replicated Kubernetes pods. In this example, the developer has chosen to use “role” as the label used to group their services, but there isn’t anything special about this - any label could be used.
And looking at this diagram, it is easy to see what the necessary network policy should be.
Let’s take a closer look at an example of how we might actually define some of this policy using the Kubernetes API.
The “user-auth” service in this example app takes incoming requests from the frontend, and validates them against the database’s list of valid users. As such, it needs to accept incoming connections from any of our replicated frontend pods.
The Kubernetes object on the left describes this relationship. This “NetworkPolicy” selects all the “user-auth” pods, and allows incoming traffic from “frontend” pods to TCP port 8001. This is really powerful, because we’re describing network security in an easy to understand way, and we’re using Kubernetes labels.
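A sketch of that object (the policy name is mine, but the labels and port follow the example; this assumes the beta extensions/v1beta1 API of the time):

```yaml
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: user-auth-allow-frontend
spec:
  # Select the user-auth pods this policy protects.
  podSelector:
    matchLabels:
      role: user-auth
  ingress:
  # Allow the frontend pods in, on the service port only.
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 8001
```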
Because policy is defined in terms of labels, if I scale my frontend to handle a sudden spike in traffic, my network policy will automatically be applied to all of the new pods with no extra effort required. It just works.
As I mentioned at the start, I work on Project Calico, so obviously I’m going to use it here. There are now a number of implementations: Weave and Romana are some other examples.
I’ll do a quick demo, and we’ll look briefly into how things work
Calico uses the Linux kernel to enforce network policy on each and every pod in your cluster.
The Calico distributed firewall is dynamic, automatic, and enforces security rules in front of each workload in your cluster.
The network fabric here is swappable. For example, you could be using one of the backends provided by canal, or your own custom fabric.
Sorry I’m not actually showing you this bit - this laptop is new to me, and it turns out airplane mode is mapped to print screen, so I couldn’t grab a screenshot!
Anyway, you install this yaml file, which sets up the various components and configurations.
I’ve had to go for something simple here, since I didn’t have access to the cluster I wanted to use, or the time to set it up.
- don’t worry about it
- connection based, and overlay / fabric has much more impact anyway
- how do we get it efficient?
- selectors very efficient
- iptables performance mostly depends on the number and type of rules traversed for a packet
- minimal representation on nodes (unused policies not written)
- you probably don't need this, but order of policies (can be explicit for Calico ones, or alphabetical) will give the processing order
- honestly, if you're really pushing it, kube-proxy is probably more concerning. We have a big k8s user who is finding it consumes tens of thousands of rules, when we're only in the thousands in a large setup.
The NetworkPolicy API as it stands is relatively restrictive. The underlying Calico engine supports a few more things (we do policy across multiple orchestrators, so we implement a superset of the functionality).
You can actually use these with K8s, by using the very similar Calico policy API to configure them.
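For instance, a Calico policy of that era might look something like this (a sketch in the calicoctl v1 resource format; the name and rules are invented, and details vary by version) - note the explicit order field and the egress rules, neither of which the Kubernetes NetworkPolicy API offered at the time:

```yaml
apiVersion: v1
kind: policy
metadata:
  name: frontend-extras
spec:
  # Lower order values are evaluated first - this is the explicit
  # policy ordering mentioned earlier.
  order: 100
  # Calico selectors are expressions over labels.
  selector: role == 'frontend'
  ingress:
  - action: allow
    protocol: tcp
    destination:
      ports: [80]
  # Egress rules - not expressible in the k8s NetworkPolicy API
  # at the time of this talk.
  egress:
  - action: allow
    protocol: udp
    destination:
      ports: [53]
```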