Clusternaut:
Orchestrating Percona XtraDB Cluster
with Kubernetes.
Raghavendra Prabhu
Percona Live Data Performance ’16
rprabhu@yelp.com / me@rdprabhu.com / @randomsurfer
Yelp’s Mission
Connecting people with great
local businesses.
Yelp Stats
As of Q4 2015
86M / 32 / 70% / 95M
Me
Raghavendra Prabhu
Software Engineer, Distributed Systems @ Yelp
rprabhu@yelp.com / me@rdprabhu.com
Applicability to any datastore
● Derived datastores
○ Elasticsearch
○ Redis
● Relational
○ MySQL
■ Group replication
■ NDB
○ PostgreSQL
● MongoDB
● Cassandra
Galera - “The Oar boat”
κυβερνήτης “The Helmsman”
Warehouse computing
➔ Mesosphere
➔ PaaSTA
➔ GCE
◆ Reference
➔ ECS
➔ Smartcloud
➔ Tectonic*
Rationale
➔ Nodes vs Hosts
◆ Resource-based
➔ Reusable components
◆ Monitoring and Tracing
◆ DNS and Service Discovery
◆ Logging
◆ Metrics
◆ Scheduler
➔ Agnostic
➔ Roles
The Fit
➔ Layered
◆ Client - Server
◆ Multi-layered
➔ Scaling
◆ Horizontal and Vertical
● Preferred?
● Need for vertical
The Fit
➔ Statelessness
◆ Planes of logic:
● Control Plane
● Data Plane : Storage
◆ Anti-pattern for containers
➔ Elasticity
◆ Elastic Scalability
● Scaling down
Declarative vs Imperative
➔ Configuration mgmt
◆ Puppet, Nix, Terraform
➔ Microservices
◆ What runs on my laptop
● What runs on server
● Reproducibility
➔ 12-factor app
➔ Composability
➔ Immutable deployment artifact
Containers
● What is a container and why should I care
○ Operating system virtualization
● Isolation
○ Hierarchies of isolation - application, cgroups, namespaces, seccomp…
● Unikernels and VMs
○ Role?
○ MirageOS, Rump kernel
● Some - LXC / LXD, Docker*, Rocket*, runc, jails, Solaris zones, lmctfy, systemd-nspawn
Galera - really short intro!
➔ MySQL and wsrep API
➔ Galera plugin
➔ Group communication
➔ Synchronous replication
◆ ‘Virtually’
➔ EVS
➔ Certification-based
◆ Optimistic Concurrency
➔ Automatic Node Provisioning
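The certification step can be pictured with a toy model. The sketch below is not Galera's implementation; the `WriteSet` and `Certifier` names are made up for illustration, and it assumes every delivered write-set takes the next sequence number in the total order. A write-set passes only if none of its keys was changed after the snapshot it was built from, which is the optimistic-concurrency idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WriteSet:
    """Keys touched by one transaction plus the last seqno it had seen (hypothetical type)."""
    keys: frozenset
    snapshot_seqno: int


@dataclass
class Certifier:
    """Toy certification index: key -> seqno of the last write-set that changed it."""
    last_seqno: int = 0
    index: dict = field(default_factory=dict)

    def certify(self, ws: WriteSet) -> bool:
        # Assumption of this toy model: each delivered write-set takes the next seqno.
        self.last_seqno += 1
        # Conflict: one of our keys was changed after the snapshot we read from.
        if any(self.index.get(key, 0) > ws.snapshot_seqno for key in ws.keys):
            return False
        # No conflict: record this write-set as the latest writer of its keys.
        for key in ws.keys:
            self.index[key] = self.last_seqno
        return True


if __name__ == "__main__":
    certifier = Certifier()
    print(certifier.certify(WriteSet(frozenset({"row:1"}), snapshot_seqno=0)))
    print(certifier.certify(WriteSet(frozenset({"row:1"}), snapshot_seqno=0)))
```

Because every node sees the same write-sets in the same order, each node running this check independently would reach the same verdict without a second round of messages.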
Galera - really short intro!
➔ CAP theorem and Galera
◆ CP
➔ How does it fit
◆ Others
◆ Idempotency
➔ Stateless?
◆ Symmetric
◆ Replicas - Cassandra et al.
◆ MySQL Cluster
➔ Maintenance of Quorum
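Quorum maintenance comes down to a strict-majority check. The sketch below assumes every node carries equal weight (Galera also allows weighted nodes, which this ignores); the function names are illustrative.

```python
def has_quorum(reachable: int, previous_members: int) -> bool:
    """True if the reachable nodes are a strict majority of the last known membership."""
    return 2 * reachable > previous_members


def max_failures_tolerated(cluster_size: int) -> int:
    """How many nodes can be lost at once while the rest still hold a strict majority."""
    return (cluster_size - 1) // 2


if __name__ == "__main__":
    for size in (2, 3, 4, 5):
        print(size, max_failures_tolerated(size))
```

An even split (2 of 4) is not a majority, which is why odd cluster sizes are the usual choice and why an orchestrator should avoid taking down more than `max_failures_tolerated` nodes at a time.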
Orchestration
● SOA def
○ ‘Stitching’
○ ‘Composing’
● Automation?
● Choreography
● Best of both worlds
Kubernetes
● Started as orchestrator
○ Is an ecosystem for containers
● Horizontal Scaling
● Self-healing
○ Chaos-monkey
● Latest issue
○ Rolling update in clusters
○ How K8s solves this
Kubernetes
● Bin packing
● Automated rollouts and rollbacks
● Secret management
○ Elegant
● Storage orchestration
● Service discovery and load balancing
○ Underrated
Kubernetes: API
● Consistent and Versioned
○ Very important glue
● Composable
● Developed with Swagger
● API Groups
● Supports both declarative and imperative
○ Rolling-update / Daemon Sets
Kubernetes
➔ Components:
◆ Kubelet
◆ Pods
● Main service
● Sidekicks
◆ Services
● The gcomm:// URL.
◆ Replication Controller
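A Service gives the cluster a stable name to put in the `gcomm://` URL. As a minimal sketch, the helper below builds a `wsrep_cluster_address` value from a list of peers; the service name `pxc-cluster` is an assumption, and 4567 is the group communication port listed later in the deck.

```python
def gcomm_url(peers, port=4567):
    """Build a wsrep_cluster_address value.

    An empty peer list yields the bare "gcomm://" form, which bootstraps a new cluster.
    """
    return "gcomm://" + ",".join(f"{host}:{port}" for host in peers)


if __name__ == "__main__":
    print(gcomm_url([]))                # first node: bootstrap
    print(gcomm_url(["pxc-cluster"]))   # later nodes: join via the (hypothetical) Service name
```

Pointing joiners at the Service rather than at individual pod addresses means the URL stays valid when pods are respawned on other nodes.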
Kubernetes
➔ Components:
◆ Labels and selectors
● Plumbing / addressing mechanism.
● Metadata - docker-machine, MachineMetadata
● Set-based and equality-based
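The two selector styles can be sketched in a few lines. This is an illustration of the matching semantics, not Kubernetes code; the function names and the tuple format for set-based requirements are assumptions.

```python
def matches_equality(labels, selector):
    """Equality-based: every key in the selector must be present with exactly that value."""
    return all(labels.get(key) == value for key, value in selector.items())


def matches_set(labels, requirements):
    """Set-based: requirements is a list of (key, operator, values) tuples."""
    for key, operator, values in requirements:
        if operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            # Objects without the key also match a NotIn requirement.
            ok = key not in labels or labels[key] not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {operator}")
        if not ok:
            return False
    return True


if __name__ == "__main__":
    pod = {"name": "pxc", "role": "donor"}
    print(matches_equality(pod, {"name": "pxc"}))
    print(matches_set(pod, [("role", "In", {"donor", "joiner"})]))
```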
Kubernetes
➔ Higher Order
◆ Daemon Sets
● Logging, Monitoring, Tracing
◆ Replica Sets
◆ Deployments
● Rolling updates declarative
● Bouncing
○ PaaSTA
Kubernetes
➔ Components:
◆ Volumes
● Persistent Volumes
● External Storage Providers
◆ Secrets / Vault
◆ Horizontal Pod Autoscaler
➔ Scheduler
◆ Pluggable
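The Horizontal Pod Autoscaler works from a simple ratio: scale the current replica count by how far the observed metric is from its target, rounding up. The sketch below shows only that core ratio and ignores tolerances and min/max bounds; the function name is illustrative.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Scale the replica count by observed/target utilization, rounding up."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


if __name__ == "__main__":
    print(desired_replicas(3, 90, 60))
    print(desired_replicas(4, 30, 60))
```

For a quorum-based datastore the scale-down direction needs more care than this ratio suggests, since removing nodes changes the membership that quorum is computed against.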
Kubernetes: Providers
● Bespoke
● Google Container Engine (GKE)
● AWS
● Azure
● Determinants:
○ Network - Flannel, Weave, Calico, GCE.
○ OS
○ Config Mgmt
Kubernetes: Ecosystem
● Deis
● Package manager - Helm
● Fabric8
● Spread
○ From compose to kubernetes
● OpenShift
Kubernetes
➔ Others:
◆ Mesos
● Supports k8s too.
● Aurora, Chronos, Marathon
◆ Docker Swarm
◆ Fleet
➔ Key Differences
Pods
● Herd..
● What should they contain - containers!
● How is the grouping done
● Pods and nodes
○ Colocation
● Pod communication
● Labels
Services
● Don’t commingle with `microservices`
○ Think of endpoints.
● Layering architecture
○ Logical address of subset of pods
● Communication
○ Environment
■ Ordering requirement
■ Discovery
○ DNS
■ Issues with DNS
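Environment-based discovery relies on the `<NAME>_SERVICE_HOST` and `<NAME>_SERVICE_PORT` variables that Kubernetes injects into containers. They are set only for Services that exist when the pod starts, which is the ordering requirement. A minimal sketch of a lookup helper follows; the function name and the `pxc-cluster` service name are assumptions.

```python
import os


def service_endpoint(name, environ=None):
    """Return (host, port) for a Service from injected environment variables, or None."""
    env = os.environ if environ is None else environ
    # Service names are upper-cased and dashes become underscores.
    prefix = name.upper().replace("-", "_")
    host = env.get(f"{prefix}_SERVICE_HOST")
    port = env.get(f"{prefix}_SERVICE_PORT")
    if host is None or port is None:
        # The Service was probably created after this pod started; fall back to DNS.
        return None
    return host, int(port)


if __name__ == "__main__":
    print(service_endpoint("pxc-cluster"))
```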
Services: in general
● Potential issues
○ Staleness
○ Live HUP-ing
○ Propagation
● HAProxy
○ Reload configuration.
○ Solved at Yelp with Linux qdiscs.
● Flux from Weave
Replication Controller
● “Herd Management”
● ASG
● Pod template: Cookie Cutter
○ Pattern
○ Anti-pattern
■ Asymmetric initialisation
Replication Controller
● Role
○ Init/Supervisor for cluster
○ Rolling updates
○ Multi-version
● Replica Sets
Networking
● Docker-style linking
● Proxy for Pods
● Types
○ Pod to Pod
○ Pod to Service
○ Intra-Pod
○ External to Service
● Providers:
○ Open vSwitch / Flannel / Calico / Weave / Google
External components
● Flannel / Others
● Etcd
● Fluentd
● SkyDNS
● Container Registry
● REST server
● Proxy
● cAdvisor / Heapster
PaaS: PaaSTA
● Docker
● Mesos
○ Chronos
○ Marathon
● Sensu
● SmartStack
○ ZooKeeper
● Jenkins
● Splunk / SignalFx
● Why
Deployment
● Declare and build individual Galera/PXC nodes.
○ Keep it minimal and simple
○ No assumptions
● Without Kubernetes
○ Docker-compose
■ Possible issues
● Galera node ⇔ Pod
○ HAProxy
○ xinetd
Deployment
● Basic Steps:
○ Create a ‘flat’ network - 10.0.0.0/24
○ Create a ‘cluster’ - zone
○ Create a service endpoint.
■ Internal service - 3306/4567/4568.
■ External service - 3306/3306(?).
■ Expose the external.
■ Session affinities.
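The two service endpoints above could be declared as manifests and piped into `kubectl create -f -`, which accepts JSON. The sketch below is one possible shape, not the talk's actual definitions: the name `pxc-cluster` and the `port-<n>` port names are assumptions, while `eclient`, the `name=pxc` label and the port numbers come from the slides.

```python
import json


def service_manifest(name, ports, selector, external=False, sticky=False):
    """Build a v1 Service manifest as a plain dict."""
    spec = {
        "selector": selector,
        # Multi-port Services need a name per port; these names are illustrative.
        "ports": [{"name": f"port-{p}", "port": p, "targetPort": p} for p in ports],
    }
    if sticky:
        spec["sessionAffinity"] = "ClientIP"   # keep a client on the same backend pod
    if external:
        spec["type"] = "LoadBalancer"          # expose outside the cluster
    return {"apiVersion": "v1", "kind": "Service", "metadata": {"name": name}, "spec": spec}


if __name__ == "__main__":
    pods = {"name": "pxc"}
    internal = service_manifest("pxc-cluster", [3306, 4567, 4568], pods)
    client = service_manifest("eclient", [3306], pods, external=True, sticky=True)
    print(json.dumps(internal, indent=2))
    print(json.dumps(client, indent=2))
```

Keeping replication ports (4567/4568) on the internal service only is one way to get the separation of client and cluster traffic.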
Deployment
● Next:
○ Bootstrap a node Pod from a template.
■ Query existing with selector.
○ Start rest of nodes from template.
■ Point to Service with selector.
■ Replication controller
○ Volumes
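Starting the remaining nodes from a template might look like the sketch below, which builds a v1 ReplicationController manifest as JSON for `kubectl create -f -`. The image name `example/pxc:latest` is a placeholder, and passing `--wsrep_cluster_address` through the container args assumes an image whose entrypoint forwards arguments to `mysqld`. The controller name `controller` and the `name=pxc` label follow the commands shown later in the deck. Volumes are left out for brevity.

```python
import json


def pxc_replication_controller(replicas, cluster_address, image="example/pxc:latest"):
    """Build a v1 ReplicationController manifest for joiner nodes (illustrative)."""
    labels = {"name": "pxc"}
    container = {
        "name": "pxc",
        "image": image,  # placeholder image name
        # Assumes the image entrypoint passes args on to mysqld.
        "args": [f"--wsrep_cluster_address={cluster_address}"],
        "ports": [{"containerPort": p} for p in (3306, 4567, 4568)],
    }
    return {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": "controller"},
        "spec": {
            "replicas": replicas,
            "selector": labels,
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    manifest = pxc_replication_controller(2, "gcomm://pxc-cluster:4567")
    print(json.dumps(manifest, indent=2))
```

Every pod stamped from this template joins through the same address, so the template stays symmetric; only the bootstrap pod needs different arguments.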
Deployment - Implications
● Load balancing in state transfers
● Respawning of nodes on timeout
○ May not be same nodes.
● kubectl to manage
○ Puppet etc. also have modules now.
● Separation of client and cluster traffic
Case Study: Safe restarts
● Highly available
● Unattended autonomous
○ and Imperative
● Restarts - services and nodes
● More of an orchestration than choreography
● Randomness?
● Distributed locks
● Low impedance
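One way to read the combination of randomness and distributed locks: restart nodes in a shuffled order, holding a cluster-wide lock so that only one node is down at a time. The sketch below is an assumption about how that could be structured, not the approach used at Yelp. A `threading.Lock` stands in for a real distributed lock (for example one backed by etcd or ZooKeeper), and `restart` is a caller-supplied callback.

```python
import random
import threading


def rolling_restart(nodes, restart, lock, rng=random):
    """Restart each node once, in random order, one at a time under `lock`."""
    order = list(nodes)
    rng.shuffle(order)          # avoid always hitting the same node first
    for node in order:
        with lock:              # stand-in for a distributed lock
            restart(node)       # should block until the node has rejoined
    return order


if __name__ == "__main__":
    done = []
    rolling_restart(["pxc-0", "pxc-1", "pxc-2"], done.append, threading.Lock())
    print(done)
```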
Service Definition
Dockerfile
```
... | kubectl create -f -
kubectl expose service eclient --port=3306 --target-port=3306 --name=loadbl --type='LoadBalancer'
```
```
.. | kubectl create -f -
kubectl scale --replicas=8 replicationcontrollers controller
```
```
kubectl get --no-headers pods -l 'name=pxc' | wc -l
2
kubectl stop …
kubectl get --no-headers pods -l 'name=pxc' | wc -l
2
```
Credits!
● https://www.pinterest.com/duanejohnson851/star-trek-tng/
● https://upload.wikimedia.org/wikipedia/commons/a/a5/CubeSpace.jpg
● https://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/IUB_Arboretum_-_lotus_pond_-_dry_seed_pod_-_P1100172.JPG/1280px-IUB_Arboretum_-_lotus_pond_-_dry_seed_pod_-_P1100172.JPG
● https://raw.githubusercontent.com/kubernetes/kubernetes/master/docs/design/architecture.png
● https://pbs.twimg.com/profile_images/511909265720614913/21_d3cvM.png
● https://camo.githubusercontent.com/96468330aba188dbd7d7eeae0caca32d9a6329df/687474703a2f2f656e67696e656572696e67626c6f672e79656c702e636f6d2f696d616765732f70726576696577732f7061617374615f707265766965772e706e67
● http://galeracluster.com/documentation-webpages/_images/replicationapi.png
● https://www.linkedin.com/pulse/containerizing-docker-kubernetes-ramit-surana
Further reading!
● http://kubernetes.io
● https://github.com/ramitsurana/awesome-kubernetes
● https://open.mesosphere.com/frameworks/
● https://coreos.com/kubernetes/docs/latest/kubernetes-networking.html
● http://paasta.readthedocs.org/en/latest/about/paasta_principles.html
● http://12factor.net/
● http://kubernetes.io/v1.1/docs/api-reference/v1/definitions.html
Contact
Raghavendra Prabhu
rprabhu@yelp.com / me@rdprabhu.com
Twitter: @randomsurfer
Linkedin: rdprabhu
Slideshare: slidunder
Github: ronin13
http://rdprabhu.com
http://about.me/raghavendra.prabhu
We are Hiring!
Visit
yelp.com/careers
@YelpEngineering
fb.com/YelpEngineers
engineeringblog.yelp.com
github.com/yelp
