Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Flagger: Istio Progressive Delivery Operator

654 views

Published on

Introducing Flagger: a progressive delivery Kubernetes operator for Istio.
Flagger automates the promotion of canary deployments, and uses Istio routing for traffic shifting and Prometheus metrics for canary analysis.

Published in: Software

Flagger: Istio Progressive Delivery Operator

  1. 1. Flagger Istio Progressive Delivery Operator Stefan Prodan @stefanprodan Istio London November 2018 1
  2. 2. What is Progressive Delivery? Progressive Delivery is Continuous Delivery with fine-grained control over the blast radius. Building blocks ● User segmentation ● Traffic management ● Observability ● Automation https://redmonk.com/jgovernor/2018/08/06/towards-progressive-delivery/
  3. 3. Introducing Flagger Flagger is a Kubernetes operator that automates the promotion of canary deployments using Istio routing for traffic shifting and Prometheus metrics for canary analysis. Flagger implements a control loop that gradually shifts traffic to the canary while measuring key performance indicators. Based on the KPIs analysis a canary is promoted or aborted.
  4. 4. Get Flagger helm repo add flagger https://flagger.app helm upgrade -i flagger flagger/flagger --namespace=istio-system --set metricsServer=http://prometheus.istio-system:9090 --set controlLoopInterval=1m helm upgrade -i flagger-grafana flagger/grafana --namespace=istio-system --set url=http://prometheus.istio-system:9090
  5. 5. Flagger overview
  6. 6. Key Performance Indicators The decision to pause the traffic shift, abort or promote a canary is based on: ● Deployment health status ● Request success rate percentage ● Request latency average value
  7. 7. Observability Flagger emits Kubernetes events related to the advancement and final status of a canary analysis.
  8. 8. Observability RED + USE methods Flagger comes with a Grafana dashboard for canary analysis.
  9. 9. Canary CRD 9 apiVersion: flagger.app/v1alpha1 kind: Canary metadata: name: podinfo namespace: test spec: # deployment reference targetRef: apiVersion: apps/v1 kind: Deployment name: podinfo # hpa reference (optional) autoscalerRef: apiVersion: autoscaling/v2beta1 kind: HorizontalPodAutoscaler name: podinfo service: # container port port: 9898 # Istio gateways (optional) gateways: - public-gateway.istio-system.svc.cluster.local # Istio virtual service host names (optional) hosts: - app.istio.weavedx.com
  10. 10. Canary CRD - analysis spec 10 canaryAnalysis: # maximum number of failed metric checks # before rolling back the canary threshold: 10 # max traffic percentage routed to canary maxWeight: 50 # canary increment step stepWeight: 5 metrics: - name: istio_requests_total # minimum req success rate percentage (non 5xx responses) threshold: 99 interval: 1m - name: istio_request_duration_seconds_bucket # maximum req duration P99 (milliseconds) threshold: 500 interval: 30s
  11. 11. Canary deployment stages
  12. 12. Canary automation for production systems
  13. 13. Flagger control loop stage 1-5 ● scan for canary deployments ● create the primary deployment and HPA ● create the ClusterIP services for primary and canary ● create an Istio virtual service with weighted destinations mapped to primary and canary ClusterIP services ● check primary and canary deployments status ○ halt advancement if a rolling update is underway ○ halt advancement if pods are unhealthy ● increase canary traffic weight percentage from 0% to 5% (step weight) ● check canary HTTP request success rate and latency ○ halt advancement if any metric is under the specified threshold ○ increment the failed checks counter
  14. 14. Flagger control loop stage 6-7 ● check if the number of failed checks reached the threshold ○ route all traffic to primary ○ scale to zero the canary deployment ○ mark canary as failed ○ wait for the canary to be updated (revision bump) and start over ● increase canary traffic by 5% (step weight) ○ halt advancement while canary request success rate is under the threshold ○ halt advancement while canary request duration P99 is over the threshold ○ halt advancement if the primary or canary deployment becomes unhealthy ○ halt advancement while canary deployment is being scaled up/down by HPA
  15. 15. Flagger control loop stage 8-12 ● promote canary to primary when canary weight reaches max ○ copy canary deployment spec template over primary ● wait for primary rolling update to finish ○ halt advancement if pods are unhealthy ● route all traffic to primary ● scale to zero the canary deployment ● mark canary as finished ● wait for the canary to be updated (revision bump) and start over
  16. 16. GitOps Progressive Delivery Demo 16
  17. 17. Links 17 Flagger https://github.com/stefanprodan/flagger GitOps demo repo https://github.com/stefanprodan/gitops-progressive-delivery Istio walkthrough https://github.com/stefanprodan/istio-gke

×