5. Kubernetes Operator
- Extend the Kubernetes API with additional objects
- Encapsulate operational know-how of your application
- Manage your application as a Kubernetes-native object
- Your application is represented via a CustomResourceDefinition
6. More on Operators
- Run inside a K8S cluster (e.g. as a Deployment)
- Implement a CRUD API for CustomResources (CRs)
- Asynchronously respond to changes to CRs _after_ they are written by the Kubernetes API server
- Each change triggers a reconcile() run, where the operational know-how of your Operator lives
- reconcile() tries to bring the current state of your application to the desired (as per spec) state
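The reconcile() contract described above can be sketched in a few lines. Real operators are typically written in Go (e.g. with controller-runtime); the following is a simplified Python sketch, where reconcile(), desired and current are illustrative names rather than a real API.

```python
def reconcile(desired, current):
    """Compare the desired spec with the observed state and emit converging actions."""
    actions = []
    # Create anything that is desired but does not exist yet.
    for name in desired:
        if name not in current:
            actions.append("create " + name)
    # Delete anything that exists but is no longer desired.
    for name in current:
        if name not in desired:
            actions.append("delete " + name)
    # Update anything whose observed state drifted away from the spec.
    for name in desired:
        if name in current and current[name] != desired[name]:
            actions.append("update " + name)
    return actions

# A new CR with one desired Pod and nothing running yet:
print(reconcile({"demoPod": {"image": "nginx"}}, {}))  # ['create demoPod']
```

Each reconcile() run is idempotent: once the current state matches the desired state, it returns no actions.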
7. Kubernetes Operators and Resources
Resource Schemas are separated into three sections:
- Spec. Defines the desired state of the resource as specified by the user.
- Status. Publishes the resource state as observed by the Operator.
- Metadata. Contains information common to most resources, such as the object name, annotations, labels and more.
Operators usually only read the Spec, while they might both read and update Status and Metadata.
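These read/write conventions can be made concrete with a small sketch (plain Python; the resource layout follows Kubernetes conventions, while operator_update is a hypothetical helper):

```python
resource = {
    "metadata": {"name": "demo", "labels": {"stage": "dev"}},  # common object info
    "spec": {"replicas": 2},           # desired state, written by the user
    "status": {"readyReplicas": 1},    # observed state, written by the Operator
}

def operator_update(res, section, patch):
    """Operators read the spec but should only write status and metadata."""
    if section == "spec":
        raise ValueError("operators must not modify the user-owned spec")
    # Return an updated copy; the original object stays untouched.
    return {**res, section: {**res[section], **patch}}
```

For example, operator_update(resource, "status", {"readyReplicas": 2}) publishes a new observation without ever touching the user's spec.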
8. Main Purpose: Reconciling Desired and Actual States
- Operators create Watches for Resources they manage
- For every CRUD operation on a watched Resource, reconcile() is triggered
- Operator might create new Resources that are owned by the Operator
- Those resources have metadata.ownerReferences
[Diagram: reconcile flow between the Operator and the Kubernetes API]
- Operator -> API: watch v1.MyRes
- User -> API: Create MyRes "demo"
- API -> Operator: reconcile() triggered
- Operator -> API: Fetch v1.MyRes "demo"
- Operator: check current state and desired state of MyRes
- Operator -> API: Create v1.Pod "demoPod"
- Operator: update MyRes state
- Operator -> API: Update v1.MyRes "demo"
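The ownership link works through metadata.ownerReferences. The field names below (apiVersion, kind, name, uid, controller, blockOwnerDeletion) are the real Kubernetes ones; the MyRes example and the owner_reference helper are hypothetical:

```python
def owner_reference(owner):
    """Build a metadata.ownerReferences entry pointing at the owning CR."""
    return {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["metadata"]["name"],
        "uid": owner["metadata"]["uid"],
        "controller": True,            # this owner manages the object
        "blockOwnerDeletion": True,    # owner cannot vanish before the child
    }

my_res = {
    "apiVersion": "example.com/v1",
    "kind": "MyRes",
    "metadata": {"name": "demo", "uid": "1234-abcd"},
}
# A Pod created by the operator is stamped with the owner reference, so
# Kubernetes garbage-collects it when the MyRes "demo" CR is deleted.
demo_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demoPod", "ownerReferences": [owner_reference(my_res)]},
}
```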
9. Popular Operators
- prometheus-operator is probably the most widely adopted one
- Many other Operators are available for managing your applications on Kubernetes:
  - etcd-operator, mongodb-operator, confluent-operator
- Awesome list:
- https://github.com/operator-framework/awesome-operators
11. Tribefire’s Application Deployment Model
[Diagram: tribefire-master together with control-center, modeler, explorer and custom-cartridges, backed by 3rd-party services: SQL DB, ETCD, ActiveMQ]
12. Managing Tribefire on Kubernetes: TribefireOperator
- Manages our Tribefire platform on Kubernetes
- Tribefire is a model-driven application delivery platform
- Consists of several components like master, control-center, etc.
- Tribefire is represented via a TribefireRuntime CRD on K8S
- The TribefireOperator maps the CRD to Kubernetes-native resources such as Pods, Services, Ingresses etc.
13. TribefireOperator: Mapping K8S native resources
[Diagram: the tribefire-operator inside the Kubernetes cluster watches the <<custom resource>> TribefireRuntime and manages Deployments, Services, Ingresses and Secrets]
The operator watches CRUD of TribefireRuntime resources and acts accordingly by CRUD'ing the required Kubernetes-native objects.
14. Representing Tribefire as a CRD: TribefireRuntime
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tribefireruntimes.tribefire.cloud
spec:
  group: tribefire.cloud
  names:
    kind: TribefireRuntime
    plural: tribefireruntimes
    shortNames:
      - tf
  scope: Namespaced
  subresources:
    status: {}
  versions:
    - name: v1alpha1
      storage: true
      served: true
apiVersion: tribefire.cloud/v1alpha1
kind: TribefireRuntime
metadata:
  name: infracoders
  namespace: tribefire
spec:
  domain: tribefire.cloud
  databaseType: cloudSql
  backend:
    type: etcd
  components:
    - name: tribefire-master
      type: Services
      logLevel: FINE
      logJson: false
      env:
        - name: "TRIBEFIRE_HOST"
          value: "demo.svc"
      resources:
        requests:
          memory: "512Mi"
          cpu: "500m"
        limits:
          memory: "2048Mi"
          cpu: "2000m"
    - name: tribefire-control-center
      type: ControlCenter
TribefireRuntime CRD: deployed “once” to K8S by the cluster-admin. Declares a new CustomResource by describing its metadata and specification.
TribefireRuntime CR: deployed by Tribefire users, describing the specific Tribefire components and capabilities that are needed.
15. TribefireRuntime CRs are Kubernetes native objects
apiVersion: tribefire.cloud/v1alpha1
kind: TribefireRuntime
metadata:
  name: infracoders
  namespace: demo
spec:
  domain: tribefire.cloud
  databaseType: cloudSql
  backend:
    type: etcd
  components:
    - name: tribefire-master
      type: Services
      logLevel: FINE
      logJson: false
      env:
        - name: "TRIBEFIRE_HOST"
          value: "demo.svc"
      resources:
        requests:
          memory: "512Mi"
          cpu: "500m"
        limits:
          memory: "2048Mi"
          cpu: "2000m"
    - name: tribefire-control-center
      type: ControlCenter
> kubectl create -f tribefire-infracoders.yaml
tribefireruntime.tribefire.cloud/infracoders created
> kubectl get tf -n demo
NAME STATUS AGE
infracoders unavailable 10s
tfdemo-dev available 2d
datapedia available 2w
> kubectl get tf -n demo -o wide
NAME STATUS AGE DOMAIN DATABASE BACKEND UNAVAILABLE
infracoders unavailable 18s tribefire.cloud cloudsql activemq tribefire-master
tfdemo-dev available 2d tribefire.cloud cloudsql etcd
Datapedia available 2w tribefire.cloud cloudsql etcd
> kubectl edit tf -n demo infracoders
tribefireruntime.tribefire.cloud/infracoders edited
> kubectl delete tf -n demo infracoders
tribefireruntime.tribefire.cloud "infracoders" deleted
16. Accessing the TribefireRuntime CR via Kubernetes API
/apis/tribefire.cloud/v1alpha1/namespaces/infracoders/tribefireruntimes/demo
- /apis/ -> API root for CustomResources
- tribefire.cloud/ -> spec.group
- v1alpha1/ -> spec.version
- namespaces/infracoders/ -> metadata.namespace (spec.scope: Namespaced)
- tribefireruntimes/ -> spec.names.plural
- demo -> metadata.name
> kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080...
> curl localhost:8080/apis/tribefire.cloud/v1alpha1/namespaces/infracoders/tribefireruntimes/demo
… huge json response here…
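The mapping from CRD fields to the REST path is mechanical, as a tiny sketch shows (build_cr_path is a hypothetical helper, not part of any client library):

```python
def build_cr_path(group, version, namespace, plural, name):
    """Namespaced custom resources live under /apis/<group>/<version>/namespaces/..."""
    return "/apis/{}/{}/namespaces/{}/{}/{}".format(
        group, version, namespace, plural, name)

path = build_cr_path("tribefire.cloud", "v1alpha1",
                     "infracoders", "tribefireruntimes", "demo")
print(path)
```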
17. CustomResource and RBAC
- Managing deployments manually requires that every user has privileges to create Deployments, Services etc.
- With operators, you only need permission to deploy your CustomResource
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: tribefire-runtime-admin
  namespace: demo
rules:
  - apiGroups:
      - tribefire.cloud
    resources:
      - "*"
    verbs:
      - "*"
Users that want to manage TribefireRuntimes only need permissions for the tribefire.cloud APIs.
19. CustomResource and Default Values
- There is no way to specify defaults in a CRD - the OpenAPI spec does allow that, but Kubernetes doesn't
- Setting defaults via the first reconcile() run inside the operator might work, but can introduce race conditions
- Setting defaults via (mutating) Webhooks is the only safe way to handle defaults in a CR.
[Diagram: API request pipeline]
API Request -> API HTTP Handler -> Authn/Authz -> Mutating Admission Controllers (incl. Mutating Webhook Handler) -> Object Validator -> Validating Admission Controllers -> Etcd Persistence Handler
The handler receives an Admission request including the object under admission. It can either directly admit or decline the request, or return a set of JSON patches to mutate the object under admission.
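A mutating webhook defaults fields by returning RFC 6902 JSON patches in its admission response. The sketch below (plain Python, webhook wiring omitted) reuses field names from the TribefireRuntime examples in this deck; default_patches is a hypothetical helper:

```python
def default_patches(spec):
    """Return JSON patches that fill in unset defaults on a TribefireRuntime spec."""
    patches = []
    if "databaseType" not in spec:
        patches.append({"op": "add", "path": "/spec/databaseType",
                        "value": "cloudSql"})
    if "backend" not in spec:
        patches.append({"op": "add", "path": "/spec/backend",
                        "value": {"type": "etcd"}})
    return patches
```

In a real webhook, the patch list is JSON-serialized and base64-encoded into the AdmissionReview response.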
20. Running pre-delete hooks via Finalizers
- Used to trigger cleanup logic such as de-provisioning databases or storage
- Resource deletion cannot proceed until finalizers are gone
- metadata.deletionTimestamp serves as a marker that the resource handled by the Operator is being deleted
apiVersion: tribefire.cloud/v1alpha1
kind: TribefireRuntime
metadata:
  creationTimestamp: 2019-01-14T09:33:46Z
  finalizers:
    - default.finalizers.tribefire.cloud
  generation: 2
  labels:
    stage: staging
  name: infracoders
  namespace: demo
When the Operator has finished its cleanup tasks, it has to remove the finalizer(s) accordingly in order to release the resource and let Kubernetes delete it.
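The finalizer protocol boils down to a few steps, sketched here in plain Python (handle_deletion and the in-memory dict stand in for real Kubernetes API calls):

```python
FINALIZER = "default.finalizers.tribefire.cloud"

def handle_deletion(resource, cleanup):
    """If the resource is being deleted, run cleanup, then drop our finalizer."""
    meta = resource["metadata"]
    if "deletionTimestamp" in meta and FINALIZER in meta.get("finalizers", []):
        cleanup(resource)  # e.g. de-provision databases or storage
        meta["finalizers"].remove(FINALIZER)  # releases the resource for deletion
    return resource
```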
21. Provide feedback to users via /status subresource
- Show the current state of your custom resource
- Use observedGeneration to check if the .spec of your resource has changed
- Implement status.conditions to support synchronous tasks via kubectl wait --for=condition=available
status:
  components:
    - name: tribefire-services
      status: available
      urls:
        - https://ic.staging.tribefire.cloud/services
    - name: tribefire-demo-cartridge
      status: available
  conditions:
    - lastTransitionTime: 2019-01-14T09:39:08Z
      lastUpdateTime: 2019-01-14T09:39:08Z
      message: TribefireRuntime fully available
      reason: TribefireRuntimeBecameAvailable
      status: "True"
      type: Available
  observedGeneration: 2
  status: available
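The two checks mentioned above (spec change detection and the Available condition) can be sketched as follows; spec_changed and is_available are hypothetical helper names:

```python
def spec_changed(resource):
    """metadata.generation bumps on every .spec change; compare with what we last observed."""
    observed = resource.get("status", {}).get("observedGeneration")
    return resource["metadata"]["generation"] != observed

def is_available(resource):
    """True if a condition of type Available carries status "True"."""
    for cond in resource.get("status", {}).get("conditions", []):
        if cond["type"] == "Available" and cond["status"] == "True":
            return True
    return False
```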
22. Using OpenAPI for Validation
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
...
  subresources:
    status: {}
  validation:
    openAPIV3Schema:
      properties:
        spec:
          properties:
            backend:
              properties:
                parameters:
                  items:
                    properties:
                      name:
                        type: string
                      value:
                        type: string
                    required:
                      - name
                      - value
                    type: object
                  type: array
                type:
                  enum:
                    - etcd
                    - activemq
                  type: string
              type: object
apiVersion: tribefire.cloud/v1alpha1
kind: TribefireRuntime
metadata:
  name: infracoders
  namespace: tribefire
spec:
  domain: tribefire.cloud
  databaseType: cloudSql
  backend:
    parameters:
      - name: url
        value: http://tf-etcd-cluster-client.etcd:2379
    type: etcd
  components:
    - name: tribefire-master
      type: Services
      logLevel: FINE
      logJson: false
…
Use the OpenAPI section in your CRD to enforce a schema on your custom resources. For instance, you might want to restrict backend.type to have etcd and activemq as the only valid inputs.
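The effect of the enum constraint can be mimicked in a few lines; in reality the Kubernetes API server performs this check on admission, and validate_backend_type is only a hypothetical stand-in:

```python
VALID_BACKENDS = {"etcd", "activemq"}

def validate_backend_type(spec):
    """Reject any backend.type outside the allowed enum."""
    return spec.get("backend", {}).get("type") in VALID_BACKENDS
```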
23. Using Events to trace application state changes
- Emitted via EventRecorder in k8s.io/client-go
- Records important information about state changes
- Visibility via kubectl describe tf
- Useful for monitoring (check out heptio-eventrouter or bitnami’s kubewatch)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ComponentDeployment 22m tribefire Created tribefire-cartridge tribefire-demo-cartridge
Normal SecretBootstrap 22m tribefire Created database secret
Normal SecretBootstrap 22m tribefire Created database service account
Normal SecretBootstrap 22m tribefire Created image pull secret
Normal ComponentDeployment 22m tribefire Created tribefire-master
Normal ComponentDeployment 22m tribefire Created control-center
Normal ComponentDeployment 22m tribefire Created explorer
Normal DatabaseBootstrap 22m tribefire Created database tfdemo-dev-operator-demo
Normal RuntimeReconciled 22m tribefire TribefireRuntime reconciled
Normal ComponentAvailable 21m tribefire Status for 'control-center' switched: 'unavailable' to 'available'
Normal ComponentAvailable 21m tribefire Status for 'explorer' switched: 'unavailable' to 'available'
Normal ComponentAvailable 21m tribefire Status for 'tribefire-master' switched: 'unavailable' to 'available'
24. Outlook on Future Topics
- Deploying and Managing Operators
- Handling multiple CRD versions
- Metrics