Kubernetes best practices

Presented at AI NEXTCon Seattle 1/17-20, 2018
http://aisea18.xnextcon.com
Join our free online AI group of 50,000+ tech engineers to learn and practice AI technology: the latest AI news, tech articles and blogs, tech talks, tutorial videos, and hands-on workshops and codelabs on machine learning, deep learning, data science, and more.

Kubernetes best practices

  1. Kubernetes Best Practices. Sandeep Dinesh, Developer Advocate, @sandeepdinesh, github.com/thesandlord
  2. [Google Cloud Platform] Kubernetes is really flexible
  3. But you might shoot yourself in the foot
  4. Building Containers
  5. Don’t trust arbitrary base images!
  6. Static Analysis of Containers: https://github.com/banyanops/collector and https://github.com/coreos/clair
  7. Use small base images
  8. Overhead: Node.js App
       Your App                 → 5MB
       Your App’s Dependencies  → 95MB
       Total App Size           → 100MB
     Docker Base Images:
       node:8         → 667MB
       node:8-wheezy  → 521MB
       node:8-slim    → 225MB
       node:8-alpine  → 63.7MB
       scratch        → ~50MB
  9. Overhead, continued: node:8 (667MB) ← 6.6x App Size!!
  10. Overhead, continued: node:8 (667MB) ← 13.3x “min” overhead (scratch, ~50MB)!!
  11. Overhead, continued: trade-offs of small base images
      Pros: builds are faster; need less storage; cold starts (image pull) are faster; potentially less attack surface
      Cons: less tooling inside container; “non-standard” environment
  12. Use the “builder pattern”
  13. Code → Build Container (compiler, dev deps, unit tests, etc.) → Build Artifact(s) (binaries, static files, bundles, transpiled code) → Runtime Container (runtime env, debug/monitor tooling)
  14. Docker CE 17.05 brings native support for multi-stage builds
  15. Container Internals
  16. Use a non-root user inside the container
  17. Example Dockerfile:
        FROM node:alpine
        RUN apk add --no-cache imagemagick
        # Alpine ships BusyBox addgroup/adduser rather than groupadd/useradd
        RUN addgroup -S nodejs && adduser -S -G nodejs nodejs
        USER nodejs
        # Work in the user's home directory so npm install can write node_modules
        WORKDIR /home/nodejs
        COPY package.json package.json
        RUN npm install
        COPY index.js index.js
        CMD ["npm", "start"]
  18. Enforce it!
        apiVersion: v1
        kind: Pod
        metadata:
          name: hello-world
        spec:
          containers:
          # specification of the pod’s containers
          # ...
          securityContext:
            runAsNonRoot: true
  19. Make the filesystem read-only
  20. Enforce it!
        apiVersion: v1
        kind: Pod
        metadata:
          name: hello-world
        spec:
          containers:
          # specification of the pod’s containers
          # ...
            # container-level securityContext: readOnlyRootFilesystem is a per-container field
            securityContext:
              runAsNonRoot: true
              readOnlyRootFilesystem: true
  21. One process per container
  22. Don’t restart on failure. Crash cleanly instead.
  23. Log to stdout and stderr
  24. Add “dumb-init” to prevent zombie processes
  25. Example Dockerfile:
        FROM node:alpine
        RUN apk add --no-cache imagemagick
        RUN addgroup -S nodejs && adduser -S -G nodejs nodejs
        # Install dumb-init as root, before switching to the non-root user
        ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64 /usr/local/bin/dumb-init
        RUN chmod +x /usr/local/bin/dumb-init
        USER nodejs
        WORKDIR /home/nodejs
        # The ENTRYPOINT path must match the install path above
        ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]
        COPY package.json package.json
        RUN npm install
        COPY index.js index.js
        CMD ["npm", "start"]
  26. Good News: No need to do this in K8s 1.7
  27. Deployments
  28. Use the “record” option for easier rollbacks
  29. $ kubectl apply -f deployment.yaml --record
      …
      $ kubectl rollout history deployments my-deployment
      deployments "ghost-recorded"
      REVISION  CHANGE-CAUSE
      1         kubectl apply -f deployment.yaml --record
      2         kubectl edit deployments my-deployment
      3         kubectl set image deployment/my-deployment my-container=app:2.0
  30. Use plenty of descriptive labels
  31. Labels:
        apiVersion: extensions/v1beta1
        kind: Deployment
        metadata:
          name: web
        spec:
          replicas: 12
          template:
            metadata:
              labels:
                name: web
                color: blue
                experimental: 'true'
  32. Labels:
        App: Nifty, Phase: Dev, Role: FE
        App: Nifty, Phase: Test, Role: FE
        App: Nifty, Phase: Dev, Role: BE
        App: Nifty, Phase: Test, Role: BE
  33. Labels: App = Nifty (matches all four)
  34. Labels: Role = BE (matches the two BE entries)
  35. Labels: Phase = Dev (matches the two Dev entries)
  36. Labels: Phase = Dev, Role = FE (matches only the Dev FE entry)
  37. Use sidecar containers for proxies, watchers, etc.
  38. Examples: each App talks over localhost to its Proxy sidecar, and the Proxy holds the secure connection to the Database
  39. Examples: Incoming Requests reach a Proxy sidecar (auth, rate limiting, etc.), which passes them to the App over localhost
  40. Don’t use sidecars for bootstrapping!
  41. Use init containers instead!
  42. apiVersion: v1
      kind: Pod
      metadata:
        name: awesomeapp-pod
        labels:
          app: awesomeapp
      spec:
        # spec.initContainers replaces the pod.beta.kubernetes.io/init-containers
        # annotation, which is no longer supported as of Kubernetes 1.8
        initContainers:
        - name: init-myapp
          image: busybox
          command: ['sh', '-c', 'until nslookup myapp; do echo waiting for myapp; sleep 2; done;']
        - name: init-mydb
          image: busybox
          command: ['sh', '-c', 'until nslookup mydb; do echo waiting for mydb; sleep 2; done;']
        containers:
        - name: awesomeapp-container
          image: busybox
          command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  43. Don’t use :latest or no tag
  44. Readiness and Liveness probes are your friends
  45. Health Checks
      Readiness → Is the app ready to start serving traffic?
        ● Won’t be added to a service endpoint until it passes
        ● Required for a “production app” in my opinion
      Liveness → Is the app still running?
        ● Default is “process is running”
        ● Possible that the process can be running but not working correctly
        ● Good to define, might not be 100% necessary
      These can sometimes be the same endpoint, but not always
  46. Services
  47. Don’t always use type: LoadBalancer
  48. Ingress is great
  49. Ingress → Service 1, Service 2, Service 3; hosts websocket.mydomain.com and mydomain.com; paths /foo and /bar
  50. type: NodePort can be “good enough”
  51. Use Static IPs. They are free*!
  52. $ gcloud compute addresses create ingress --global
      …
      $ gcloud compute addresses create myservice --region=us-west1
      Created …
      address: QQQ.ZZZ.YYY.XXX
      …
      $

      apiVersion: v1
      kind: Service
      metadata:
        name: myservice
      spec:
        type: LoadBalancer
        loadBalancerIP: QQQ.ZZZ.YYY.XXX
        ports:
        - port: 80
          targetPort: 3000
          protocol: TCP
        selector:
          name: myapp

      apiVersion: extensions/v1beta1
      kind: Ingress
      metadata:
        name: myingress
        annotations:
          kubernetes.io/ingress.global-static-ip-name: "ingress"
      spec:
        backend:
          serviceName: myservice
          servicePort: 80
  53. Map external services to internal ones
  54. External Services
      Hosted Database:
        kind: Service
        apiVersion: v1
        metadata:
          name: mydatabase
          namespace: prod
        spec:
          type: ExternalName
          externalName: my.database.example.com
          ports:
          - port: 12345
      Database outside cluster but inside network:
        kind: Service
        apiVersion: v1
        metadata:
          name: mydatabase
        spec:
          ports:
          - protocol: TCP
            port: 80
            targetPort: 12345
        ---
        kind: Endpoints
        apiVersion: v1
        metadata:
          name: mydatabase
        subsets:
        - addresses:
          - ip: 10.128.0.2
          ports:
          - port: 12345
  55. Application Architecture
  56. Use Helm Charts
  57. ALL downstream dependencies are unreliable
  58. Make sure your microservices aren’t too micro
  59. Use a “Service Mesh”
  60. https://github.com/istio/istio https://github.com/linkerd/linkerd
  61. Use a PaaS?
  62. (no text)
  63. Cluster Management
  64. Use Google Container Engine
  65. Resources, Anti-Affinity, and Scheduling
  66. Node Affinity: hostname, zone, region, instance-type, os, arch, custom!
  67. Node Taints / Tolerations: special hardware, dedicated hosts, etc.
  68. Pod Affinity / Anti-Affinity: hostname, zone, region
  69. Use Namespaces to split up your cluster
  70. Role Based Access Control
  71. Unleash the Chaos Monkey
  72. More Resources
      ● http://blog.kubernetes.io/2016/08/security-best-practices-kubernetes-deployment.html
      ● https://github.com/gravitational/workshop/blob/master/k8sprod.md
      ● https://nodesource.com/blog/8-protips-to-start-killing-it-when-dockerizing-node-js/
      ● https://www.ianlewis.org/en/using-kubernetes-health-checks
      ● https://www.linux.com/learn/rolling-updates-and-rollbacks-using-kubernetes-deployments
      ● https://kubernetes.io/docs/api-reference/v1.6/
  73. Questions? What best practices do you have?
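
The health checks described on slide 45 could be declared roughly as follows. This is a minimal sketch and not taken from the slides: the image name, the /ready and /healthz paths, port 3000, and all timing values are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  containers:
  - name: app
    image: example/app:1.0        # hypothetical image, pinned to a tag rather than :latest
    ports:
    - containerPort: 3000
    # Readiness: the pod is only added to Service endpoints once this passes
    readinessProbe:
      httpGet:
        path: /ready              # assumed endpoint
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 10
    # Liveness: the container is restarted if this keeps failing
    livenessProbe:
      httpGet:
        path: /healthz            # assumed endpoint
        port: 3000
      initialDelaySeconds: 15
      periodSeconds: 20
```

Using separate paths reflects the slide's point that readiness and liveness can share an endpoint but do not have to.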
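
The scheduling topics on slides 65 to 68 (resources, anti-affinity, tolerations) might be combined in one Deployment along these lines. This is a sketch under stated assumptions: the image name, the `dedicated=web` taint, and the CPU and memory figures are hypothetical, and `apps/v1` is used in place of the `extensions/v1beta1` API shown on the slides.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      name: web
  template:
    metadata:
      labels:
        name: web
    spec:
      affinity:
        # Prefer spreading replicas across nodes (anti-affinity on hostname)
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  name: web
              topologyKey: kubernetes.io/hostname
      # Allow scheduling onto nodes tainted dedicated=web:NoSchedule (assumed taint)
      tolerations:
      - key: dedicated
        operator: Equal
        value: web
        effect: NoSchedule
      containers:
      - name: web
        image: example/web:1.0    # hypothetical image
        resources:
          requests:               # what the scheduler reserves for the container
            cpu: 100m
            memory: 128Mi
          limits:                 # hard caps enforced at runtime
            cpu: 500m
            memory: 256Mi
```

The preferred (soft) form of anti-affinity is used so that pods still schedule when there are fewer nodes than replicas; the required form would leave the extra pods pending.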
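
Slides 69 and 70 recommend namespaces and role-based access control. One possible shape, assuming a hypothetical `dev` namespace and a hypothetical user `jane` who should only be able to read pods there:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev                       # hypothetical namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
- apiGroups: [""]                 # "" is the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev
  name: read-pods
subjects:
- kind: User
  name: jane                      # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

A Role and RoleBinding are namespaced, so this grant has no effect outside `dev`; cluster-wide permissions would need ClusterRole and ClusterRoleBinding instead.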
