Kubernetes offers many features to keep pod downtime to a minimum. Graceful shutdown and zero-downtime deployments are certainly possible with Kubernetes. However, this only applies to the orderly transition of containers or pods. Despite all the precautions Kubernetes takes, a service crash can still lead to HTTP 5xx responses. Additional measures must be taken to fully compensate for services in such an error state.
This session shows why the classic approach of a resilience framework cannot completely solve these kinds of problems. To this end, the pod lifecycle is examined and the Kubernetes workflow for replacing faulty pods is analyzed. One possible solution strategy is client-side load balancing. A service mesh tool such as Istio is used to demonstrate what it takes to achieve full compensation with this strategy.
Crashing Pods: How to Compensate for such an Outage?
1. Crashing Pods
–
How to compensate for such an outage?
Michael Hofmann
Hofmann IT-Consulting
info@hofmann-itconsulting.de
https://hofmann-itconsulting.de
2. Crashing pods?
● Controlled error state: rolling update
● Application deadlock
  – Thread pool exhausted
  – Thread deadlock situation (detected by the JVM)
● Memory leak (out of memory)
● Bug in the application or the application server
3. Mitigation/Compensation Strategies
● Quick recognition of the error state for recovery
● Short window of eventual consistency
● Controlled error state (e.g. rolling update)
● Intelligent routing (outlier detection)
● Classic resilience patterns
5. Liveness and readiness probes
spec:
  containers:
  - name: crashing-pod
    image: hofmann/crashing-pod:latest
    imagePullPolicy: Never
    ports:
    - containerPort: 9080
    livenessProbe:
      httpGet:
        path: /health/live
        port: 9080
      initialDelaySeconds: 10
      periodSeconds: 5
      timeoutSeconds: 1
      failureThreshold: 5
    readinessProbe:
      httpGet:
        path: /health/ready
        port: 9080
      initialDelaySeconds: 15
      periodSeconds: 5
      timeoutSeconds: 1
      failureThreshold: 5
● Difference between liveness and readiness probes
● Only pods with a successful readiness probe are assigned to a service (endpoint, get IP)
6. From service to pod
● Service
  – Called by the client: <svc-name>.<ns>.svc.cluster.local
  – Basis for DNS naming
  – References pods by labels
● Pods
  – Assigned IPs
● Endpoints
  – Connect the service to pod instances (IPs)
  – Stored in etcd: IP and port
  – Endpoint refresh on: pod created, pod deleted, pod label modified
  – Basis for: kube-proxy, ingress controller, CoreDNS, cloud provider, service mesh
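The chain from service to endpoint to pod IP can be inspected directly with kubectl. The service name `crashing-pod` and the label `app=crashing-pod` are assumptions carried over from the probe example, not part of the slides:

```
kubectl get service crashing-pod                # the stable virtual IP and DNS name
kubectl get endpoints crashing-pod -o yaml      # pod IPs and ports currently backing the service
kubectl get pods -l app=crashing-pod -o wide    # compare against the pods' own IPs
```

A pod that fails its readiness probe disappears from the endpoints list but still shows up in the pod list.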
8. Endpoint outdated
● Kubelet:
  – Readiness probes
  – Housekeeping interval to update the endpoints
● kube-proxy (iptables settings)
● Kubernetes DNS (CoreDNS)
● Caching of DNS values in the client
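One client-side mitigation for the last point: the JVM caches resolved addresses itself, and its TTL can be lowered via security properties so that stale service IPs expire quickly. A sketch with illustrative values (not recommendations from the talk):

```java
import java.security.Security;

public class DnsCacheConfig {
    /** Lower the JVM-wide DNS cache TTLs so stale pod/service IPs expire quickly. */
    public static void configure() {
        // Cache successful lookups for 5 seconds instead of the JVM default of 30
        Security.setProperty("networkaddress.cache.ttl", "5");
        // Do not cache failed lookups at all
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
    }
}
```

These properties must be set before the JVM performs its first lookup, otherwise the defaults remain in effect.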
9. Rolling update
● Update running pods
  – Defined by the rolling update strategy
● Influenced by
  – Liveness and readiness probes
  – preStop lifecycle hook
● Distributed infrastructure can react to the error state (update its components)
  – SIGTERM (not SIGKILL)
● Shutdown hook in the application server (finishes open requests)
● Target: zero-downtime deployment (should...)
10. Rolling update & preStop Hook
Container:
readinessProbe:
  ...
lifecycle:
  preStop:
    exec:
      command: ["/bin/bash", "-c", "sleep 30"]
Deployment:
strategy:
  type: RollingUpdate   # the default strategy type in K8s
  rollingUpdate:
    maxSurge: 1         # max. 1 over-provisioned pod
    maxUnavailable: 0   # no unavailable pod during the update
11. Intelligent Routing
● Server-side load balancing
  – Endpoint handling done by the infrastructure (K8s)
  – Requests are routed to the faulty instance until the platform evicts it
● Client-side load balancing
  – Client must know all endpoints: dependency on the infrastructure (service registry)
  – Can react to a faulty request
● Outlier detection (in addition to client-side LB)
  – A faulty instance (HTTP >= 500) is evicted for a period of time
  – Reacts faster than the distributed infrastructure
12. Resilience
● Frameworks
  – Server-side load balancing
    ● Retry storm on faulty pods
  – Spring Cloud LoadBalancer (client-side LB)
    ● Since 2020
    ● Generic abstraction, successor to Netflix Ribbon
    ● Supports the Kubernetes and Cloud Foundry service registries
● Service Mesh
13. Idempotency
● Retry causes multiple calls!
● GET, HEAD, OPTIONS, DELETE (if exists)
● PUT
  – Idempotent by definition
  – Must be implemented idempotently (DuplicateKeyException)
  – Primary key must be in the payload
● POST
  – Idempotency key in a header
  – Idempotency key stored in a separate table
  – PUT semantics with a primary key (header vs. payload)
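The POST case above, an idempotency key sent by the client and recorded on first use, can be sketched as follows. This is a minimal in-memory illustration (the slide stores the key in a separate database table); the class and method names are assumptions, not part of the talk:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Minimal in-memory sketch of idempotency-key handling for POST requests. */
public class IdempotencyStore {
    // In production this map would be a separate database table, as on the slide.
    private final Map<String, String> processed = new ConcurrentHashMap<>();

    /**
     * Executes the handler only on the first call for a given idempotency key;
     * retries with the same key return the recorded response instead.
     */
    public String handlePost(String idempotencyKey, Supplier<String> handler) {
        return processed.computeIfAbsent(idempotencyKey, k -> handler.get());
    }
}
```

A retry that repeats the same POST with the same key thus creates the resource only once.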
15. Istio
● Resilience
● Client-side load balancing (knows the pods)
● Outlier detection
● Does its own health checks (in addition to the kubelet)
● The kubelet checks sidecar and workload together
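With Istio, the outlier detection mentioned above is configured in a DestinationRule. A sketch, reusing the service name from the earlier examples; host and threshold values are illustrative assumptions:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: crashing-pod
spec:
  host: crashing-pod.default.svc.cluster.local
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 3   # evict after 3 consecutive HTTP 5xx responses
      interval: 10s             # analysis interval
      baseEjectionTime: 30s     # ejection duration, multiplied on repeated ejections
      maxEjectionPercent: 100   # allow evicting all instances if necessary
```

Because the sidecar proxy observes every response, an instance is ejected much faster than it takes the kubelet and endpoint machinery to propagate the failure.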
18. Recap: Quick Error-Recognition
● Interval of the health probes (liveness, readiness) run by the kubelet
● Other error detection by the kubelet (OOM)
● Problem: the distributed architecture of K8s (propagation of the error event to all components)
● Error types:
  – Controlled error state (e.g. rolling update)
  – Quickly detectable errors
  – Slowly detectable errors