2. Agenda
• Some Items about Kubernetes
• Lift and Shift
• Evolving Applications for Kubernetes
• Changing People/Processes for Kubernetes
3. tl;dr
• Be ready for change
• As you evolve your processes/support, you have to provide backwards compatibility for both your infrastructure/applications and your processes
• Really focus on having a stable deployment mechanism
• Sort out your interaction contracts
• Start small and constrained - Realize that you won't do it "right"
• You will have to make application changes
• You will have to change your expectations
• Be ready for cutting edges
• Kubernetes has a very simple set of core primitives, and a lot of options to build on top
• CICD
• Authentication and Access Control
• Resource management - forcing the use of limits/requests
• Avoiding mixing goals
4. Perspective (Me)
• I'm a platform administrator. I make sure the clusters...
• are available,
• have resources,
• connect to the rest of the infrastructure.
• I run applications, but for the most part help app teams use the platform
5. Survey (You)
• Who's familiar with cfengine / puppet / chef / ansible / etc?
• Who's familiar with current container technology?
• Who's familiar with Kubernetes? The object / resource model?
• Who's running Kubernetes in Production (even for the smallest workload)?
7. Declarative State
• Tell me what you want
• Not how to do it
• How to do it can change in different contexts
• "LoadBalancer" is slightly different in AWS, GCP, Azure
• All state stored in the API Server
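As a sketch of declarative state (names are illustrative): a Service manifest declares *what* is wanted — an externally reachable load balancer for a set of pods — and the cloud's controller decides *how*, which is why "LoadBalancer" differs slightly across AWS, GCP, and Azure.

```yaml
# Declares *what* we want: an external load balancer for pods
# labeled app: myapp. *How* it's provisioned (ELB, GCLB, Azure LB)
# is decided by the environment's controller, not by this file.
apiVersion: v1
kind: Service
metadata:
  name: myapp            # illustrative name
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8080
```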
8. Reconciliation Loop Driven
• Many specific independent actors
• Controllers
• Operators
• Actors reconcile declared state with current state
• An actor can change declarative state for another actor and trigger its actions
• Main ones
• scheduler
• controller-manager
• kubelet
10. Networking
• Everything in the cluster is reachable from everything else
• (Policy might restrict)
• Magic Mappings.
• L4 Load Balancer
• DNS Mapping
• Map from outside cluster to inside
11. Resources
• What's used to define declarative state
• Stored in cluster (API Server)
• aka Manifests
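Every manifest shares the same skeleton; a minimal sketch (names are illustrative), stored in the API Server once applied with `kubectl apply -f <file>`:

```yaml
# Common skeleton: apiVersion, kind, metadata, plus a
# type-specific body (spec for most types; ConfigMaps use data).
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config   # illustrative name
  namespace: default
data:
  greeting: hello
```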
13. Pod
• Collection of containers working tightly together
• Unit of Scheduling
• Share network stack
• Can share disk mounts
• Sidecar: a support container running with the application container
20. Goals
• Run an application inside of Kubernetes
• Change the code as little as possible
• Hook into the existing infrastructure as much as possible
• Keep it simple - avoid state, storage, etc
• Kick the tires
26. App Design for Kubernetes
• Application container with a Logstash sidecar container (one Pod)
• ConfigMap holding prerendered output from Chef. Mounted under conf dir
• Shared mount (emptydir) for log output
• Written by app
• Read by logstash
• Service definition to map from outside to inside
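The design above as a Pod sketch (names, images, and paths are illustrative): the app writes logs into a shared emptyDir, the Logstash sidecar reads them, and the Chef-prerendered config is mounted from a ConfigMap.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  volumes:
    - name: conf
      configMap:
        name: myapp-conf   # holds prerendered output from Chef
    - name: logs
      emptyDir: {}         # shared mount for log output
  containers:
    - name: app
      image: myapp:1.0     # illustrative image
      volumeMounts:
        - name: conf
          mountPath: /app/conf   # mounted under conf dir
        - name: logs
          mountPath: /app/logs   # written by app
    - name: logstash
      image: logstash:7          # illustrative image
      volumeMounts:
        - name: logs
          mountPath: /logs       # read by logstash
```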
34. Lessons Learned
• Startup takes its time (Tomcat startup)
• Debugging (kubectl exec)
• ConfigMap/Deployments only worked for one environment
• Healthchecks didn't fit well in the model and worked counter to debugging steps
38. 1. Single Concern Principle
• Do one thing (and do it well)
• Separation of Concerns
• Target updates
• Minimize (vertical) image sprawl
39. 2. Image Immutability Principle
• An image is a delivery artifact, with all of the properties that implies
• "Build once, deploy everywhere"
• Don't layer configuration on as part of the image (unless you're putting *all* foreseeable configuration possibilities in there)
40. 3. Self-Containment Principle
• Extension of Image Immutability
• On deployment, layer in instance-unique items (config, data)
• This uniqueness layer should be specific to this instance
41. 4. Runtime Confinement Principle
• Get an understanding of your resource requirements
• And use them! (helps with scheduling)
• Without them ==>
• Unintentional, uninformed oversubscription
• Roving micro-oversubscription hotspots
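Stating requirements in the manifest is what lets the scheduler place pods sensibly. A container fragment (values are illustrative starting points):

```yaml
containers:
  - name: app
    image: myapp:1.0     # illustrative image
    resources:
      requests:          # what the scheduler reserves on a node
        cpu: 250m
        memory: 256Mi
      limits:            # hard ceiling enforced at runtime
        cpu: "1"
        memory: 512Mi
```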
42. 5. Process Disposability Principle
• Processes are ephemeral
• Be ready for them to not be there
• This will happen often (every change)
43. Containers (by themselves) are half suited for Kubernetes
• Kubernetes builds on containers
• If you have been following container modeling, that translates directly
46. 7. High Observability Principle
• Change in behaviors
• Biggest change in thinking
• Forced thinking of items like health checks, monit, et al.
• Add to Disposability Principle - have to be able to debug quickly, over the network, and with remaining forensics
48. Be Ready for Change
• Changed Deployment Strategy
• Single manifests -> Helm Charts
• Changed Helm Chart Structure 5 Times
• Changed Logging Infrastructure 3 Times
• BE VERY CAREFUL IN WHAT YOU STOP SUPPORTING
49. Contracts
• Describe what each side/component of processes will provide and accept
• Helps to define
• What can be changed without impacting others
• What needs to be talked about before changing
50. App Team Onboarding to Cluster (contract)
• App Team / User: defines the App Definition (name, resource count, users); Kubernetes Team: receives the App Definition
• Kubernetes Team: defines namespace, RBAC; App Team / User: receives namespace, RBAC
• App Team / User: logs in via central auth; Kubernetes Cluster: trusts central auth
• App Team / User: accesses namespace; Kubernetes Cluster: allows access to granted resources
51. Logging Contract
• App (Pod): logs to STDOUT; Kubernetes Cluster: receives from STDOUT
• Kubernetes Cluster: transmits to the Logging Bus; Monitoring: receives on the Logging Bus
• App (Pod): emits JSON structured log format; Monitoring: handles the JSON format
• Monitoring: indexes in the Search Tool; App Team / User: searches in the Search Tool
• Monitoring: adds infrastructure enrichments (pod, cluster, container host, environment, datacenter); App Team / User: searches by infrastructure information
52. Be Comfortable with Being Uncomfortable
• A lot of this technology is new/recent
• A lot of simple implementations (first pass)
• A lot of undiscovered bugs
• "Best practices" are highly localized
53. Simplified Primitives
• Deployments
• All at once (destroy + build)
• Rolling
• Load Balancing
• Only equal weight round robin (be it via L4 forwarding, or DNS)
• What's Layer7?
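The two built-in Deployment options above map directly onto the manifest's strategy field; a fragment as a sketch:

```yaml
spec:
  strategy:
    type: RollingUpdate     # gradual replacement of pods
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  # type: Recreate is the "all at once (destroy + build)" option
```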
54. Common App Team Concerns
• How do I get onboarded?
• How does my application have to interact with the system?
• How do I run my application?
• How do I troubleshoot my application?
• How does it all work?!!?!?!
55. How do I run my application?
• Build an Application Template
• Dockerfile
• Helm Chart
• Jenkinsfile
• Extend with organization-specific functions
• Partial Deploy functions
• Incorporate environment values
60. New Application Model
• Jar App (faster start up)
• zmetrics port (separate from client interface)
• Prometheus scrapes metrics
• Readiness/liveness probes
• Logs to STDOUT
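A container fragment sketching the model above (ports and paths are illustrative): metrics exposed on a zmetrics port separate from the client interface, plus the probes.

```yaml
containers:
  - name: app
    image: myapp:2.0               # illustrative jar app image
    ports:
      - name: http
        containerPort: 8080        # client interface
      - name: zmetrics
        containerPort: 9090        # Prometheus scrapes this port
    readinessProbe:
      httpGet: { path: /ready, port: 8080 }
    livenessProbe:
      httpGet: { path: /alive, port: 8080 }
```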
61. New Deployment Model
• CICD Driven
• Standard format for repository
• Dockerfile, Chart --> artifacts
• Environment specific values
• Multiproject pipeline pushes to multiple environments with approval gates
• Automatic canary deploy, sanity check, then full deploy
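One way to sketch the automatic canary step (all names are illustrative, not the deck's actual pipeline): a second, small Deployment whose pods carry the Service's selector labels, so a slice of traffic hits the candidate build before the full deploy.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-canary         # illustrative name
spec:
  replicas: 1                # small slice next to the main replicas
  selector:
    matchLabels: { app: myapp, track: canary }
  template:
    metadata:
      # 'app: myapp' matches the Service selector; 'track' separates
      # canary pods from the stable Deployment's pods
      labels: { app: myapp, track: canary }
    spec:
      containers:
        - name: app
          image: myapp:2.1-rc    # the candidate build
```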
63. • If you change too quickly, you will be in for a world of hurt
• Different ways to deploy
• Different Kubernetes API versions (v1alpha1, v1beta1, v1)
• You can't please everyone
• Tradeoffs
• Training - bootcamps and walking people through...
• Examples examples examples - easy to copy (cargo culting)
64. Health Checks
• Liveness probe: If this fails, Kubernetes will restart the container.
• Readiness probe: If this fails, Kubernetes will take the pod out of the service pool.
• If an app is bad, I should stop sending traffic to it and recover it, right? ==> Ok to set these to the same thing.
https://cloud.google.com/kubernetes-engine/kubernetes-comic/
65. "We DDOSed Ourselves!!!"
• On startup, the application reports ready before it's warmed up
• Gets flooded with traffic
• Kube restarts because liveness failed as well
• Quick fix: Removed liveness
• Real fix:
• Run liveness and readiness on a different port/connection threadpool/etc
• Know they mean different things
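The "real fix" as a probe fragment (ports and paths are illustrative): liveness served from a separate management port with its own threadpool, so a traffic flood that makes the app unready cannot also fail liveness and trigger a restart loop.

```yaml
readinessProbe:
  httpGet: { path: /ready, port: 8080 }   # client port: flooded => unready
livenessProbe:
  httpGet: { path: /alive, port: 8081 }   # separate management port
  initialDelaySeconds: 30                 # allow for slow startup
```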
66. • Prometheusbeat has limited support
• Security scanning checkbox
• Type:LoadBalancer Services (and anything built off them) get a permit *
• ICMP Destination Unreachable (Type 3) runs afoul of security policies
• Provide helper tools to set up configuration
• Login ==> can also gather cluster information like certificates and endpoints