Meetup, 16/7/2018
Agenda
● Redis Labs intro and architecture
● Double orchestration
● Our Kubernetes solution
● The way to operators
● Operators intro
● Operators development
● Demo
Redis Labs
Intro and Architecture
Introduction to Redis Enterprise
Redis: open source. The leading in-memory database platform, supporting any high-performance operational, analytics or hybrid use case.
Redis Labs: the open source home and commercial provider of Redis Enterprise (Redisᵉ) technology, platform, products & services.
We Are Hiring!
Redisᵉ - Open Source & Proprietary Technology
(diagram: a Redisᵉ Node in a Redisᵉ Cluster; the Cluster Manager, REST API and zero-latency proxy form the Enterprise Layer on top of the Open Source Layer)
• Shared-nothing, symmetrical cluster architecture
• Fully compatible with open source commands & data structures
Redisᵉ in Containers
• Faster time to market, with continuity between dev/test and production environments that use Redisᵉ Pack
• Highly available, easier-to-scale, simpler-to-manage Redis technology, integrated with orchestration tools such as PCF, Kubernetes, Mesosphere...
• Node-in-a-container approach: all Redisᵉ services run inside each container
• Run Redisᵉ clusters on single or multiple nodes
Node in a Pod Approach
(diagram: multiple pods, multiple services per node vs. one pod, multiple services per node, three nodes shown on each side)
Double Orchestration
For fun and profit!
What’s Double Orchestration?
External orchestration: Kubernetes / PKS manages the cluster nodes (Node 1, Node 2 ... Node N).
Internal orchestration: the Redis cluster orchestration manages the Redis shards on those nodes.
Why like this?
• Resource management - Orchestration platforms are
designed to be generic.
• Again - Performance is king.
• Last but not least, it allows us to maintain a common
architecture - regardless of running environment, be it bare
metal, VM, K8s, Pivotal Cloud Foundry.
(Surprisingly enough, not everybody in the world uses containers…)
Who Does What
• Node auto-healing
• Node scaling
• Failover & scaling
• Configuration & monitoring
• Service discovery
• Upgrade
And specifically on Kubernetes
• Node in a Pod
• StatefulSet
• Persistent Volumes
• Custom Controller
• Services Manager
Our Kubernetes Solution
Redis Labs on Kubernetes - Building Blocks
StatefulSet: our cluster nodes are deployed as part of a StatefulSet.
Affinity: allows us to control the Redis Labs cluster nodes' topology.
Redis Labs Service Manager: creates/updates/deletes service entries for each Redis DB hosted on the cluster.
RBAC: the Service Manager must have permissions on the namespace to create services.
Ingress: allows access to Redis DBs from outside of the k8s cluster.
StatefulSet
• Introduced in 1.5, GA since 1.9
• StatefulSet Pod consistency
– Pod naming
– Scale-out/scale-in
– Pod upgrade
• Persistent disks
– The same PVC is used when a Pod is (re)scheduled
• All Pods are uniform
• Recovery from error state
(diagram: pod-0, pod-1, pod-2, each bound to its own PV)
Pod features
• Anti-affinity
– Lets us control where the pods are scheduled
• Readiness probes
– Let us control the action flows to avoid data loss
• Pre-stop hook
– Drains the node and moves resources to a different node
Redis Enterprise Services Manager
Why?
• Redis Enterprise is a multi-tenant Redis cluster
• A Redis Enterprise database can have 1 or more network endpoints
Problem
• Expose databases as a service instance
Solution
• A Python-based application that creates, deletes or updates the necessary database service entries
• Based on an idempotent reconciliation loop
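The idempotent reconciliation loop described above can be sketched in Python. This is an illustrative model, not the real services manager: the Kubernetes API is replaced by a plain dict so the create/update/delete logic stays visible.

```python
# Hypothetical sketch of the services manager's reconciliation loop.
# A real implementation would call the Kubernetes API (for example via
# the official Python client); here the "API" is a plain dict.

def reconcile_services(desired_dbs, existing_services):
    """Bring the k8s Service entries in line with the databases hosted
    on the Redis Enterprise cluster.

    desired_dbs:       {db_name: port} as reported by the cluster
    existing_services: {service_name: port} currently in the namespace

    Mutates existing_services in place and returns the actions taken,
    so a second call with the same input is a no-op (idempotency).
    """
    actions = []
    for name, port in desired_dbs.items():
        if name not in existing_services:
            existing_services[name] = port      # new DB -> create Service
            actions.append(("create", name))
        elif existing_services[name] != port:
            existing_services[name] = port      # endpoint changed -> update
            actions.append(("update", name))
    for name in list(existing_services):
        if name not in desired_dbs:
            del existing_services[name]         # DB removed -> delete Service
            actions.append(("delete", name))
    return actions
```

Running the loop twice with the same desired state performs no work on the second pass, which is exactly the property the reconciliation loop relies on.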
(diagram: a Kubernetes cluster with three worker nodes, each running one pod of the Redis Labs StatefulSet (pod-0, pod-1, pod-2, each with its own PV), together forming the Redis Enterprise Cluster; the Services Manager talks to the K8s API to add/edit/delete database Services, which the apps connect through)
The way to operators
Redis Labs Challenges
• Provide a solid primary DB solution for end-users
• Stateful application
– Some changes cannot be performed
– Some changes need to mutate the state before applying the actual change
– Data loss is unacceptable
• Support multiple k8s deployments
– Cloud: GKE, AWS, etc.
– OpenShift
– PKS
– Vanilla
– On-prem hardware vendors
• Ingress
• Packaging
Our journey: .yaml → Helm → Operator
• Started out with 9 static yaml files
– Hard to deploy
– Hard to maintain
– Hard to distribute
– No control over the deployment life-cycle
• Helm
– Customized deployment
– Easier to maintain
– Not fully supported everywhere
– No control over the deployment life-cycle
• Operator
– Simple deployment (2 yaml files)
– Full control over the life-cycle
– K8s compatible
Custom Resource + Custom Controller = Operator
kubectl create -f my-sts.yaml
(diagram: kubectl → API Server → the StatefulSet controller, which watches StatefulSets and creates pod-0, pod-1 and pod-2 for my-sts, each with its own PV)
kubectl scale statefulset my-sts --replicas=5
(diagram: the StatefulSet controller observes the change to my-sts and adds pod-3 and pod-4)
kubectl create -f my-redis-cluster.yaml
(diagram: kubectl → API Server → the RedisCluster controller, which watches RedisCluster resources and creates the StatefulSet, UI service, service account, ... for my-redis-cluster)
kubectl apply -f my-redis-cluster.yaml (example: downscale)
(diagram: the RedisCluster controller sees the update, calls get-status() on the running cluster and drives the change)
Why are operators useful?
● Life-cycle control
○ Scale up → add a new pod, rebalance data
○ Healing → restore backups, auto recovery
○ Backup
○ Validations (e.g. even # of pods)
● Configuration
○ Automate complex deployments (e.g. a Vault cluster plus an etcd cluster)
○ Reconfiguration
○ Agnostic configuration (e.g. PVC by cloud provider)
● 3rd-party resources (e.g. Prometheus)
● Cross-distribution
● Easy to deploy
Our Upgrade Flow
In a Redis Enterprise Cluster we need to:
1. Drain the pod
2. Stop the pod
3. Start a new pod
● Downgrade is not supported (OSS backward compatibility)
Our Upgrade Flow
With yaml/Helm we used a preStop life-cycle hook on the StatefulSet:
1. Encoded inside the yaml - cumbersome
2. Cannot validate the version
3. No error handling
With an Operator:
1. Logic is maintained in code, not in a config file
2. Validations: not a downgrade, cluster is not already in an upgrade process
3. Error handling
4. Can manage a canary deployment
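The operator-side validations above can be sketched as a plain function. The version string format and argument names here are assumptions for illustration; the real operator's checks may differ.

```python
# Hedged sketch of pre-upgrade validations an operator can run before
# touching any pod -- checks that a yaml/Helm preStop hook cannot
# express. Assumes simple dotted numeric versions like "5.0.2".

def validate_upgrade(current_version, target_version, upgrade_in_progress):
    """Return (ok, reason). Rejects downgrades (not supported, for OSS
    backward compatibility) and concurrent upgrade processes."""
    cur = tuple(int(x) for x in current_version.split("."))
    tgt = tuple(int(x) for x in target_version.split("."))
    if upgrade_in_progress:
        return False, "cluster is already in an upgrade process"
    if tgt < cur:
        return False, "downgrade is not supported"
    if tgt == cur:
        return False, "already at target version"
    return True, "ok"
```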
crd_cluster.yaml

operator.yaml

cr.yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: {{ template "redisenterprise.statefulsetName" . }}
  labels:
    app: {{ template "redisenterprise.name" . }}
    chart: {{ template "redisenterprise.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
spec:
  {{ if .Values.persistentVolume.enabled }}
  volumeClaimTemplates:
  - metadata:
      name: redisstorage
      labels:
        app: {{ template "redisenterprise.name" . }}
        chart: {{ template "redisenterprise.chart" . }}
        release: {{ .Release.Name }}
        heritage: {{ .Release.Service }}
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: {{ .Values.persistentVolume.size | quote }}
      {{ if .Values.persistentVolume.storageClass }}
      {{ if (eq "" .Values.persistentVolume.storageClass) }}
      storageClassName: ""
      {{ else }}
      storageClassName: "{{ .Values.persistentVolume.storageClass }}"
      {{ end }}
      {{ end }}
  {{ end }}
  serviceName: {{ template "redisenterprise.fullname" . }}
  replicas: {{ .Values.replicas }}
  updateStrategy:
    type: "RollingUpdate"
  template:
    metadata:
      labels:
        redis.io/role: node
        app: {{ template "redisenterprise.name" . }}
        chart: {{ template "redisenterprise.chart" . }}
        release: {{ .Release.Name }}
        heritage: {{ .Release.Service }}
    spec:
      {{ with .Values.nodeSelector }}
      nodeSelector:
{{ toYaml . | indent 8 }}
      {{ end }}
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - {{ template "redisenterprise.name" . }}
              - key: release
                operator: In
                values:
                - {{ .Release.Name }}
              - key: redis.io/role
                operator: In
                values:
                - node
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 31536000
      serviceAccountName: {{ template "redisenterprise.serviceAccountName" . }}
      {{ with .Values.imagePullSecrets }}
      imagePullSecrets:
{{ toYaml . | indent 8 }}
      {{ end }}
      containers:
      - name: redis
        image: {{ .Values.redisImage.repository }}:{{ .Values.redisImage.tag }}
        imagePullPolicy: {{ .Values.redisImage.pullPolicy }}
        readinessProbe:
          exec:
            # check that the node is bootstrapped and that it's connected and synced
            command:
            - bash
            - -c
            - curl --silent localhost:8080/v1/bootstrap && /opt/redislabs/bin/rladmin status nodes | grep node:$(cat /etc/opt/redislabs/node.id) | grep OK
          initialDelaySeconds: 20
          timeoutSeconds: 5
        lifecycle:
          preStop:
            exec:
              # enslave the node; if this node is the master, change the master to the first slave node
              command:
              - bash
              - -c
              - /opt/redislabs/bin/rladmin node $(cat /etc/opt/redislabs/node.id) enslave && ((/opt/redislabs/bin/rladmin status nodes | grep node:$(cat /etc/opt/redislabs/node.id) | grep -q master) && /opt/redislabs/bin/rlutil change_master master=$(/opt/redislabs/bin/rladmin status nodes | grep slave | head -1 | cut -d " " -f 1 | cut -d ":" -f 2) && sleep 10) || /bin/true
        resources:
{{ toYaml .Values.redisResources | indent 10 }}
        ports:
        - containerPort: 8001
        - containerPort: 8443
        - containerPort: 9443
        securityContext:
          capabilities:
            add:
            - SYS_RESOURCE
        {{ if .Values.persistentVolume.enabled }}
        volumeMounts:
        - mountPath: "/opt/persistent"
          name: redisstorage
        {{ end }}
        env:
        - name: K8S_ORCHASTRATED_DEPLOYMENT
          value: "yes"
        - name: JOIN_HOSTNAME
          value: {{ template "redisenterprise.fullname" . }}
        {{ if .Values.persistentVolume.enabled }}
        - name: PERSISTANCE_PATH
          value: /opt/persistent
        {{ end }}
        - name: K8S_SERVICE_NAME
          value: {{ template "redisenterprise.fullname" . }}
        - name: BOOTSTRAP_HANDLE_REDIRECTS
          value: "enabled"
        - name: BOOTSTRAP_CLUSTER_FQDN
          value: {{ template "redisenterprise.clusterDNS" . }}
        - name: BOOTSTRAP_DMC_THREADS
          value: "10"
        - name: BOOTSTRAP_USERNAME
          valueFrom:
            secretKeyRef:
              name: {{ template "redisenterprise.fullname" . }}
              key: username
        - name: BOOTSTRAP_PASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ template "redisenterprise.fullname" . }}
              key: password
        - name: BOOTSTRAP_LICENSE
          valueFrom:
            secretKeyRef:
              name: {{ template "redisenterprise.fullname" . }}
              key: license
kubectl create -f cr.yaml

curl http://127.0.0.1:8001/apis/app.redislabs.com/v1alpha1/redisenterpriseclusters

{
  "apiVersion": "app.redislabs.com/v1alpha1",
  "items": [
    {
      "apiVersion": "app.redislabs.com/v1alpha1",
      "kind": "RedisEnterpriseCluster",
      "metadata": {
        ...
        "creationTimestamp": "2018-07-12T15:47:31Z",
        "generation": 0,
        "name": "my-cluster-test",
        "namespace": "redis"
      },
      "spec": {
        "nodes": 3,
        "serviceAccountName": "my-cluster-test",
        "uiServiceType": "ClusterIP",
        "username": "demo@redislabs.com"
        ...
Operator Development
• Started by CoreOS
– CoreOS pioneered the pattern by creating the first Operators (Prometheus & Vault)
• Operator SDK:
– Minimizes boilerplate and helps developers get started writing Operators
• The basic API:
– Register watchers on any resource
– Create/Read/Update/Delete/Get on any resource
– Register schemas using the k8s Go API
• Operator Lifecycle Manager
The Reconciliation/Control Loop
• Called for every update, delete or creation of the watched resources
– There is no way of knowing the event type, except for Delete
• Called every X seconds to “resync” resources
• Our responsibility is to let users treat our resource like any other in k8s
– AKA an idempotent API
• On every call to handle we get our watched resources and need to determine exactly what to do
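Because the handler cannot rely on the event type, it recomputes the full desired state from the CR spec on every call and diffs it against what it observes. A minimal sketch, with hypothetical spec and status fields standing in for the real ones:

```python
# Illustrative event handler: the event type is unknown, so every call
# rebuilds the desired state and diffs it against the observed state.
# Field names ("nodes", "uiServiceType") are assumptions for this sketch.

def handle(cr_spec, observed):
    """Called on every create/update/resync of the watched CR.

    Returns the actions needed to converge; mutates `observed` as if
    the actions were applied, so a repeated call returns nothing
    (idempotent API).
    """
    desired = {
        "statefulset_replicas": cr_spec["nodes"],
        "ui_service": cr_spec.get("uiServiceType", "ClusterIP"),
    }
    actions = []
    if observed.get("statefulset_replicas") != desired["statefulset_replicas"]:
        actions.append(("scale", desired["statefulset_replicas"]))
    if observed.get("ui_service") != desired["ui_service"]:
        actions.append(("patch_ui_service", desired["ui_service"]))
    observed.update(desired)   # stand-in for actually applying the changes
    return actions
```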
Idempotent APIs
Desired state = the current Custom Resource.
Current state = the aggregation of the deployed resources + the internal application state.
The Reconciliation/Control Loop - Challenges
• Determine which changes need to happen
• Determine whether a change is valid
• K8s doesn’t provide solid validation before applying changes to a CR
– 1.9 has a beta feature for OpenAPI validations
• Long-running processes as part of a resource change
Redis Cluster Status
(state diagram; create = kubectl create -f cr.yaml, apply = kubectl apply -f cr.yaml, with create/apply transitions between all states)
• Pending Creation - initial state, where the cluster is not deployed yet
• Running - the cluster is deployed and is either running, or starting up and not ready yet
• Invalid - an invalid configuration was requested, e.g. an even number of nodes; the status remains Invalid until a valid configuration is applied
• Error - an error occurred while trying to deploy or update the Redis Enterprise Cluster
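The status model above can be expressed as a small transition function. The even-node check mirrors the slide's example of an invalid configuration; everything else (validation only, no real deploy errors) is a simplifying assumption:

```python
# Illustrative model of the Redis Cluster status slide. State names come
# from the slide; the even-node rule mirrors its "Invalid - e.g. even
# #nodes" example. A real controller would also surface Error states.

PENDING_CREATION = "Pending Creation"   # cluster not deployed yet
RUNNING = "Running"                     # deployed, running or starting
INVALID = "Invalid"                     # invalid configuration requested
ERROR = "Error"                         # deploy/update failed

def status_after_apply(requested_nodes):
    """Status the controller would report after a `kubectl create/apply`
    of a CR requesting `requested_nodes` cluster nodes."""
    if requested_nodes < 1 or requested_nodes % 2 == 0:
        # stays Invalid until a valid configuration is applied
        return INVALID
    return RUNNING
```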
Development Challenges
• Deep understanding of how Kubernetes works (StatefulSets, controllers, APIs)
• Workflows - idempotent APIs are challenging due to state mutation
• Double orchestration - adds a level of complexity compared to stateless deployments
• Various SDK issues
• https://www.telepresence.io
One Last Thing
Demo
We Are Hiring!

Orchestrating Redis & K8s Operators
