
Kubernetes Manchester - 6th December 2018


Slide deck for the Kubernetes Manchester meetup talk, December 2018. Jim introduces a little about Moneysupermarket, the direction we're heading, and the historical problems we've had.
I (David) then walk through the technology choices we've made and how they fit together to form our Istio service mesh on an auto-scaling AWS EC2 Kubernetes platform.


  1. MoneySuperKubernetes: Navigating K8s at Moneysupermarket
     Kubernetes Manchester, December 2018
     David Stockton, DevOps Tech Lead for Core Platform
     Jim Davies, Head of DevOps
  2. Jim Davies, David Stockton
  3. Why do we need Kubernetes?
  4. Road to Kubernetes
     • < 2016: AWS • Rightscale • Masterless Puppet • Jenkins on Mesos • Single-container hosts
     • 2017: Docker Swarm spikes • Nomad spikes • Kubernetes full trial
     • 2018: AWS and industry say "Go K8s" • EKS preview (not impressed) • Kubernetes full development
     • 2019 >: Continued migration • Test and learn • DEPLOY ON DAY ONE!
  5. Technology choices: platform deployment
     • AWS EKS
       • First evaluated Jan 2018 (beta) – only available in US regions (we want customer data in the EU)
       • At the time, required custom kubectl binaries and was well behind in version support (much better now) – we plan to re-evaluate EKS next year
       • No automation capability – Fargate for EKS wasn't ready, Terraform wasn't ready
       • Would need better interoperability with our existing EC2 VPC estate
     • GKE
       • We have a data-analytics platform in GCP, but all SoA/front-end estate is in AWS + Direct Connect
     • Kops
       • K8s deployment automation (AWS first-class citizen)
       • Can output Terraform files for IaC management – we keep Terraform state in an S3 bucket BUT we throw away the Terraform files and regenerate them each run
       • + Not tempted to edit the Terraform files (don't do this! We did, and it's a nightmare to use any kops function afterwards)
       • + Can turn off a cluster (we shut down environments at night – £££) without 'destroying' it
       • - Hasn't happened yet, but a theoretical worry if kops re-architects its Terraform files (worst case: rebuild the whole cluster)
       • - Launch configurations only (no launch-template multi instance-type support yet)
     • Puppet – invoked via a kops hook
       • Small (masterless) Puppet run to add some instance-level goodness (e.g. security tooling!)
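The "regenerate the Terraform each run" flow above can be sketched as a couple of commands (hypothetical cluster and bucket names; assumes `KOPS_STATE_STORE` points at the S3 state bucket and a 2018-era kops/Terraform):

```shell
# kops keeps the cluster spec in S3; the Terraform output is disposable
export KOPS_STATE_STORE=s3://example-kops-state

# Regenerate the Terraform files from the kops cluster spec on every run –
# never hand-edit the generated output
kops update cluster k8s.example.com --target=terraform --out=./tf

# Apply the freshly generated files (Terraform 0.11-style directory argument)
terraform init ./tf && terraform apply ./tf
```

Because the files are regenerated from the kops state each time, there is nothing tempting to edit by hand, and the S3-held state survives the nightly environment shutdowns.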
  6. Technology choices: traffic ingress and application deployment
     • Traffic ingress
       • Route 53 – delegated sub-domain to Route53 management (controlled by 'external-dns')
       • ELB – Route53 entries point to ELBs; more on this when I talk about Istio…
     • Application deployment
       • Jenkins – orchestrates and provides feedback, e.g. PR merged → webhook → Helm-Deploy (Jenkinsfile)
       • Helm – THE Kubernetes package manager; YAML + Go template = k8s YAML
         • ProTip: 'Sprig' for free – http://masterminds.github.io/sprig/
         • 'generic-service' helm chart
       • helm-deploy.rb – proprietary Ruby script; defines the list of helm releases to run per environment (helm chart + chart version + values.yaml path)
         • Remember, environments are restarted each day from code (disaster recovery = a normal day!)
       • Artifactory – Docker repository and Helm repository (plus virtual aggregation for both)
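The helm-deploy.rb script itself is proprietary, but the idea it describes (an ordered, per-environment list of chart + pinned version + values file) can be sketched in a few lines of Ruby. All names and paths here are hypothetical:

```ruby
# Minimal sketch of a helm-deploy.rb-style release driver (hypothetical
# release names and paths; the real script is proprietary). Each environment
# maps to an ordered list of helm releases: chart, pinned version, values file.
RELEASES = {
  "dev" => [
    { name: "search-api", chart: "generic-service", version: "0.3.1",
      values: "environments/dev/search-api.yaml" },
    { name: "basket-api", chart: "generic-service", version: "0.3.1",
      values: "environments/dev/basket-api.yaml" },
  ],
}

# Build the idempotent `helm upgrade --install` command for one release, so a
# fresh environment (restarted daily) converges to the same state as a
# long-running one.
def helm_command(env, release)
  [
    "helm", "upgrade", "--install", release[:name], release[:chart],
    "--version", release[:version],
    "--values", release[:values],
    "--namespace", env,
  ].join(" ")
end

# Return the full command list for an environment, in declared order.
def deploy(env)
  RELEASES.fetch(env).map { |r| helm_command(env, r) }
end
```

Because `upgrade --install` is idempotent, the same release list serves both the nightly environment rebuild and an ordinary deployment.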
  7. Istio
     • Service mesh
       • Connect – control the flow of traffic (e.g. blue/green deployments, testing)
       • Secure – automatically secure services (authN + authZ + encryption); solves our internal network requirements and is not cloud-vendor specific (i.e. could span multiple clouds/datacentres)
       • Control – enforce policies (e.g. circuit breaking)
       • Observe – automatic tracing, monitoring and logging
     • Traffic ingress
       • k8s Service → Istio ingress gateway → Gateway → VirtualService → <upstream>
     • Egress
       • ServiceEntries – can be used to white-list outbound traffic or create rules (e.g. no more than X connections to external service Y)
     • Squads in control
       • Central point of control (e.g. /endpoint was 'serviceA', now it's 'serviceB' – only one piece of configuration)
       • Squads can 'wire up' their service – in their control
       • Limits / service protection – no more than X concurrent connections
       • Fault testing – delay injection / fail X% of requests
       • Canary deployments – maybe next talk!?
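The ingress chain above can be sketched with the two Istio resources a squad would own. Hostnames and service names here are hypothetical; the API shown is the Istio 1.0-era `networking.istio.io/v1alpha3`:

```yaml
# Hypothetical names throughout – a minimal sketch of
# k8s Service -> Istio ingress gateway -> Gateway -> VirtualService -> upstream
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
spec:
  selector:
    istio: ingressgateway          # binds to the shared Istio ingress gateway
  servers:
    - port: { number: 80, name: http, protocol: HTTP }
      hosts: ["search.example.com"]
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: search
spec:
  hosts: ["search.example.com"]
  gateways: ["public-gateway"]
  http:
    - match: [{ uri: { prefix: /endpoint } }]
      route:
        - destination:
            host: service-b        # "/endpoint was serviceA, now serviceB":
                                   # retargeting is this one line of config
```

This is the "central point of control" on the slide: moving an endpoint between services is a one-line change to the `destination`, owned by the squad.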
  8. Security
     • Docker images
       • Linted (docker lint)
       • Scanned (klar → clair → clair-db)
         • On build (existing vulnerabilities at build time)
         • On a schedule (new vulnerabilities)
       • Corporate white-list YAML from a git repo
       • IGNORE_UNFIXED is a better strategy – but you still want to know!
       • Lots of vendor solutions available too
     • Hosts
       • Standard Linux tooling / SaaS vendors available; many have profiling capabilities
       • Many offer a container (or even a helm chart); beware they typically need to run privileged – only appropriate if you manage your own hosts
     • k8s
       • CVE-2018-1002105 (CVSS 9.8!)
         • https://access.redhat.com/security/cve/cve-2018-1002105
         • https://github.com/kubernetes/kubernetes/issues/71411
       • Affected versions:
         • Kubernetes v1.0.x–1.9.x
         • Kubernetes v1.10.0–1.10.10 (fixed in v1.10.11)
         • Kubernetes v1.11.0–1.11.4 (fixed in v1.11.5)
         • Kubernetes v1.12.0–1.12.2 (fixed in v1.12.3)
       • https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/
       • https://github.com/neuvector/kubernetes-cis-benchmark
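The affected-version list above boils down to a simple check against the first fixed patch release per minor version, which can be sketched as:

```ruby
# Is a cluster version affected by CVE-2018-1002105? Encodes the fixed
# versions listed above: v1.10.11 / v1.11.5 / v1.12.3 (all of 1.0.x-1.9.x
# is affected; 1.13+ shipped with the fix).
FIXED_IN = { 10 => 11, 11 => 5, 12 => 3 }  # minor => first fixed patch

def vulnerable_to_cve_2018_1002105?(version)
  major, minor, patch = version.sub(/\Av/, "").split(".").map(&:to_i)
  return false if major != 1
  return true  if minor <= 9      # every 1.0.x-1.9.x release is affected
  fix = FIXED_IN[minor]
  return false if fix.nil?        # 1.13 and later were released fixed
  patch < fix
end
```

Handy when auditing a fleet of clusters at mixed versions, as a kops-managed estate mid-migration tends to be.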
  9. Security – API creds / RBAC
     • Integrate with SSO
       • kuberos → dex → Okta (LDAPS) → LDAPS[ ]
       • kuberos = UI to an OIDC provider
         • https://github.com/helm/charts/tree/master/stable/kuberos
       • dex = OIDC provider which aggregates back-end auth providers
         • https://github.com/helm/charts/tree/master/stable/dex
       • Okta = SSO solution; provides SAML, OIDC and LDAPS front-ends
         • Wait, what? Okta OIDC doesn't present group names in the refresh-token response by default – that's an additional feature = £££
         • So dex talks to the (no extra charge) Okta LDAPS directory instead (and/or others – e.g. static passwords for out-of-cluster service accounts)
       • LDAPS[ ] = Active Directory / Samba / whatever
     • Why?
       • Kuberos provides a UI that hands out ~/.kube credentials
       • Thereafter kubectl talks to dex and refreshes the token
       • Joiner / mover / leaver process = someone else's problem :-)
       • Standard kubectl tooling
       • Group mappings in code
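Concretely, what kuberos hands out is a kubeconfig user stanza using kubectl's built-in OIDC auth provider, pointed at dex. Issuer, client and token values below are placeholders:

```yaml
# Sketch of the ~/.kube/config user entry kuberos effectively writes
# (hypothetical issuer/client values; the real ones come from your dex
# deployment). kubectl then refreshes the token against dex on its own.
users:
  - name: dstockton
    user:
      auth-provider:
        name: oidc
        config:
          idp-issuer-url: https://dex.example.com
          client-id: kubernetes
          client-secret: <from-kuberos>
          id-token: <initial-token-from-kuberos>
          refresh-token: <refresh-token>   # renewed via dex, not Okta directly
```

RBAC then binds the groups dex pulls from LDAPS to roles, which is the "group mappings in code" bullet above.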
  10. Storage • Scaling / Updating
     • Storage
       • Most charts allow specifying attributes of storage, but critically NOT the volume ID :face-palm:
       • → Need reclaim policy = Retain (or you'll lose the volume!)
       • → Cannot delete and re-deploy a helm chart without manual intervention (need to take the PV out of its retained state)
       • If you're not making changes to the PV definition (you shouldn't be), then at least helm upgrades work
       • Environments up forever = :-(
       • Working through stable helm charts with PRs to support volume IDs
         • Not as simple as it sounds; often need to split replica sets into multiples (or similar approaches) to allow specific IDs to be used
     • Horizontal Pod Autoscaler
       • Fancy way of saying 'look at this Prometheus metric and scale if above threshold'
       • Business-metric scaling = cool… 'if average user search time > 0.5s then scale up the search API'
       • TODO: OR rules only at the moment
       • metrics-server (pod CPU & memory API) & prometheus-adapter (custom-metrics API)
     • Cluster Autoscaler
       • Increases instances running in the AWS ASG if nodes can't satisfy a request
       • TODO: add an option for headroom to allow faster pod scaling
     • kops rolling-update cluster
       • It just works!
       • PodDisruptionBudget if required = don't move this pod (e.g. a Jenkins build slave)
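The "business metric" HPA above can be sketched as a manifest. Metric and deployment names are hypothetical, and it assumes prometheus-adapter is serving the custom-metrics API; `autoscaling/v2beta1` was the current HPA API at the time of this talk:

```yaml
# Sketch: scale the search API when average user search time exceeds 0.5s
# (hypothetical names; requires prometheus-adapter for the custom metric).
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: search-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: search-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metricName: search_time_avg_seconds  # served by prometheus-adapter
        targetAverageValue: 500m             # scale up above 0.5s average
```

The HPA handles pod count; when the new pods no longer fit on existing nodes, the Cluster Autoscaler bullet above grows the ASG.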
  11. Observability
     • Logging
       • Docker logs (stdout, JSON codec please!) → Logspout → Logstash → <log stack of your choice>
       • Logspout slurps docker logs and spits them out to Logstash
     • Metrics
       • Prometheus – who ISN'T using this?! Kubernetes first-class citizen
       • Graphite-to-Prometheus
         • The majority of our current estate is AWS-native and uses Graphite
         • https://github.com/prometheus/graphite_exporter
         • We collect approx. 1.5 million metrics every 30s
         • Ruby graphite-to-prometheus exporter (the Go implementation was slow at scale, and a Ruby implementation is easier to maintain in-house)
         • Prometheus scrapes the metrics from the pod
     • Observability tools
       • Weave Scope – great, but beware: it's a cluster admin (root SSH to nodes!)
       • Kiali – Istio-specific. Nice, but unclear how actively it is developed
       • Jaeger – like Zipkin (distributed tracing); auto-deployed with Kiali – nice for free (enabled: true)
       • Cockpit – nice UI and uses kubectl creds
       • Kube-ops-view – read-only :-) but ugly :-(
       • Kubernetic – desktop UI (beta = free!)
     • Pretty pictures incoming…
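The core of a graphite-to-prometheus bridge is a line-format translation: Graphite's plaintext protocol ("dotted.path value timestamp") into the Prometheus exposition format. The in-house exporter is far more involved (mapping rules, label extraction, a scrape endpoint), but the idea can be sketched as:

```ruby
# Hypothetical sketch of the graphite-to-prometheus translation. Graphite
# sends "dotted.path value timestamp" lines; Prometheus scrapes
# "metric_name value" lines with a restricted metric-name charset.
def graphite_to_prometheus(line)
  path, value, _timestamp = line.split
  # Dots become underscores; any other illegal character is squashed too.
  metric = path.tr(".", "_").gsub(/[^a-zA-Z0-9_]/, "_")
  "#{metric} #{value}"
end

# Convert a batch of received Graphite lines into one scrape-page body.
def convert(lines)
  lines.map { |l| graphite_to_prometheus(l) }.join("\n")
end
```

At ~1.5 million metrics per 30s scrape, the per-line work above is exactly the hot loop, which is why the implementation language and its string handling mattered.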
  12. Weave Scope
  13. Weave Scope
  14. Jaeger
  15. Cockpit
  16. Kube Ops View
  17. Desktop – Kubernetic (beta = free)
  18. Jim: "Dave, why's it taking so long?"
     • Excessive Istio logging – remove the stdout rule!
     • Java heap memory in containers – heap = ¼ of host RAM by default
     • Chicken/egg design of hosting the orchestration tooling – the tooling sits inside the cloud?!
     • Deployment ordering – 1st deploy = no Istio; 2nd deploy (and scale-ups) = with Istio!
     • Ruby v2.4+ ndots limitation for DNS lookups – WTF!
     • Istio requires a ServiceEntry for headless services
     • Early Istio maturity (reliability and e.g. max security groups); v1.0.2+ = :-)
     • 'Finding' the default container limits per namespace
     • Unreliable Jenkins k8s slave plugin (1.13.2 = :-))
     • Kops clashing with the existing VPC setup (please don't delete those subnets!)
     • AWS limit on ELBs (8 IPs per subnet!! solved with the Istio ingress gateway service)
     • StatefulSets and EBS mapping (see earlier)
     • RBAC with Okta (atypical OIDC)
     • No UDP support in Istio (don't pick syslog as your first service to move!)
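The "heap = ¼ of host RAM" gotcha is the pre-container-aware JVM sizing its default max heap from the host's memory rather than the container's cgroup limit. Two era-appropriate fixes, sketched as a Dockerfile fragment (the env var name is an assumption; pass the flags however your image launches Java):

```dockerfile
# The JVM's default max heap is 1/4 of *host* RAM, ignoring the container's
# cgroup memory limit. Pick one of the following:

# Java 8u131+: opt in to reading the cgroup memory limit (experimental flag)
ENV JAVA_OPTS="-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"

# Or simply pin the heap explicitly to fit the pod's resource limit:
# ENV JAVA_OPTS="-Xmx512m"
```

Later JDKs (10+) are container-aware out of the box, but the images being migrated at the time mostly were not.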
  19. Where are we today?
     • Two customer-facing services in production
     • 50+ Jenkins slave agent images moved from Mesos to k8s (Groovy script FTW)
     • Around a dozen services in progress
     • 80% of central infrastructure services are migrated (Sonar, dashboards, Jenkins slaves, etc.)
     • Benefits are being sold to all squads; the platform team is keen for this to be a "pull" action – squads WANT to move to k8s
     • Success = developer-led
  20. Roadmap
     • Canary deployments – working on automated canary deployments & rollback
       • Helm already gives us this via readiness/liveness checks – extending coverage to include automated acceptance testing
     • Automated fault-tolerance acceptance testing across multiple services
     • Improved performance-testing plan – auto-scaling makes it harder to test to breaking point; new patterns emerging
     • Cost savings through consolidation (merge VPCs & compute)
     • Improved spot-price 'storm' tolerance (prod = spot)
  21. Questions
     Thanks to Tom, Tristan and Booking.com
