2. Who Am I?
SQL Server Solution Architect at Pure Storage
SQL Server user for 20 years
Was heavily involved in the SQL Server 2019 EAP
Co-author of the Microsoft workshop:
Big Data Cluster: From Bare Metal to Kubernetes
3. So That We Are All On The Same Page
What are containers?
What is Kubernetes?
What are master and worker nodes?
What are pods?
4. The Big Picture
Microsoft’s vision is that if you cannot go to Azure, Microsoft will bring Azure to you via a cloud native platform: Kubernetes
5. #1 Planning Your Kubernetes Cluster
How many nodes should I have?
What do I require in terms of memory and CPU?
What about storage?
What about cluster maintenance?
6. To Start Off With . . .
Master nodes: 2GB memory, 2 logical processors
Worker nodes: 64GB memory, 8 logical processors
7. A Minimum Viable Production Kubernetes Cluster
[Diagram: master nodes running the API server, scheduler, controller manager and etcd; worker nodes each running a kubelet, pods and containers]
2 x master nodes
3 x worker nodes
3 x etcd instances
8. Node Maintenance and Pod Mobility
[Diagram: a Deployment with app: nginx and replicas: 3; as worker nodes are taken out of service at T = 0, T = 1 and T = 2, its pods are rescheduled onto the remaining worker nodes]
Pro tip: factor in extra memory and CPU resources for node maintenance and pod mobility
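A minimal sketch of the Deployment behind this picture, assuming nginx as the example workload from the diagram; the node name, image tag and resource requests are placeholders:

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.17
        resources:
          requests:            # the extra capacity the pro tip is about
            cpu: 500m
            memory: 256Mi
EOF

# Drain a node and watch the Deployment reschedule the evicted pod elsewhere
kubectl drain <node-name> --ignore-daemonsets
kubectl get pods -o wide -l app=nginx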
9. Big Data Cluster Architecture
Pro tip: plan for worker nodes to be dedicated to the storage pool
10. The Kubernetes Storage “Layer Cake”
Volume
Persistent volume claim
Persistent volume
Underlying storage
A Big Takeaway Point: state needs to be able to follow pods around the cluster
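To make the layer cake concrete, a minimal sketch; the claim name, size, storage class, password and the SQL Server image used here are illustrative assumptions. The pod mounts a volume, the volume references a persistent volume claim, the claim binds to a persistent volume carved out of the underlying storage:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-data                       # the claim the pod references
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: example-block        # placeholder storage class
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mssql
spec:
  containers:
  - name: mssql
    image: mcr.microsoft.com/mssql/server:2019-latest
    env:
    - name: ACCEPT_EULA
      value: "Y"
    - name: MSSQL_SA_PASSWORD
      value: "<strong-password>"
    volumeMounts:
    - name: data                         # volume: the top layer of the cake
      mountPath: /var/opt/mssql
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: mssql-data              # claim -> persistent volume -> underlying storage
EOF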
11. Storage Provisioning
Manual provisioning: volumes/LUNs have to be created manually
Automatic provisioning: a plugin creates volumes/LUNs automatically
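A sketch of what automatic provisioning looks like; the storage class name is arbitrary and the provisioner string is a placeholder for whatever CSI/flex plugin your storage vendor ships:

cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block
provisioner: example.com/block           # placeholder: supplied by your storage plugin
reclaimPolicy: Delete
allowVolumeExpansion: true
EOF

# Any PVC that names storageClassName: fast-block now gets its volume/LUN created automatically
kubectl get sc,pvc,pv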
12. Where Does State Live?
etcd
Data pool
Master pool
Controller, compute and app pools
Storage pool HDFS - file
Storage pool HDFS – parquet
Certificates and images
Object – public cloud: S3 or Azure Data Lake Gen 2
Object – on-premises: S3
Persistent volumes
13. Stateful Pod Mobility
Solution #1: reschedule pod(s) to where state is replicated to
[Diagram: with replicated state, a pod running on a worker node at T = 0 is rescheduled at T = 1 and T = 2 onto the worker nodes that hold replicas of its state]
14. Stateful Pod Mobility
Solution #2: shared state that every node in the cluster can see
[Diagram: with shared state, the pod can be rescheduled from one worker node to any other at T = 0, T = 1 and T = 2, because every worker node can reach the same underlying storage]
15. Storage Pool - HDFS
The default HDFS replication factor is 3
However much data you need to store in the storage pool => multiply this by 3 !!!
This triples the write element of your storage pool IO workload
. . . but !!!
16. Storage Pool HDFS Replication Factor
Pro tip: if your storage platform supports RAID and / or erasure coding => the HDFS replication factor can be set to 1
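One way this might be set at deployment time is via azdata custom configuration files; the JSON path for the HDFS replication setting below is an assumption, so verify it against the Big Data Cluster configuration reference for your CU:

# Start from one of the built-in deployment profiles
azdata bdc config init --source kubeadm-prod --target custom-bdc

# Assumed JSON path for dfs.replication – verify against the configuration reference
azdata bdc config replace -c custom-bdc/bdc.json \
  -j "$.spec.resources.storage-0.spec.settings.hdfs.hdfs-site.dfs.replication=1"

# Deploy with the customised profile
azdata bdc create --config-profile custom-bdc --accept-eula yes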
19. Deployment and Management Tools
kubectl
azdata
Helm
Kubespray (more on this later)
Pro tip: run all of your tools from a separate server that does not host any cluster nodes
20. #2 Deployments
Deploying Kubernetes clusters
Deploying SQL Server 2019 Big Data Clusters
Idempotent infrastructure
21. Kubeadm – Standard Cluster Creation Tool
On each machine:
Add the current machine to /etc/hosts.
Disable swapping on all devices.
Import the keys and register the repository for Kubernetes.
Configure docker and Kubernetes prerequisites on the machine.
Set net.bridge.bridge-nf-call-iptables=1. On Ubuntu 18.04, first enable br_netfilter.
Run kubeadm init on the first master.
Run kubeadm join on each worker node.
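A minimal sketch of the core kubeadm commands; the pod network CIDR, master endpoint, token and hash are placeholders:

# On the first master node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Copy the admin kubeconfig so kubectl works, then install a pod network add-on
mkdir -p $HOME/.kube && sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config

# kubeadm init prints a join command similar to this; run it on each worker node
sudo kubeadm join <master-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>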
Is there a better way?
22. Kubespray – Managing Your Cluster Life Cycle
Clone the kubespray GitHub repo
Install python, ansible and the prerequisites from requirements.txt
Ensure that each node host is contactable via ssh
Copy the inventory directory
Edit the inventory.ini file (a sample inventory is sketched below)
Run the cluster.yml ansible playbook
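A sketch of an inventory.ini for the minimum viable cluster from earlier (2 x master nodes, 3 x worker nodes, 3 x etcd instances); hostnames, IPs and the inventory path are placeholders, and group names vary between Kubespray releases:

cat > inventory/<cluster-name>/inventory.ini <<'EOF'
[all]
k8s-master-1 ansible_host=10.0.0.11
k8s-master-2 ansible_host=10.0.0.12
k8s-worker-1 ansible_host=10.0.0.21
k8s-worker-2 ansible_host=10.0.0.22
k8s-worker-3 ansible_host=10.0.0.23

[kube-master]
k8s-master-1
k8s-master-2

[etcd]
k8s-master-1
k8s-master-2
k8s-worker-1

[kube-node]
k8s-worker-1
k8s-worker-2
k8s-worker-3

[k8s-cluster:children]
kube-master
kube-node
EOF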
23. Managing The Entire Kubernetes Cluster Life Cycle
1. Create a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini cluster.yml --become --become-user=root -K
2. Tear down a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini reset.yml --become --become-user=root -K
3. Add node(s) to a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini scale.yml --become --become-user=root -K
4. Rebuild a cluster’s master nodes
ansible-playbook -i inventory/<cluster-name>/inventory.ini recover-control-plane.yml --become --become-user=root -K
5. Upgrade a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini upgrade-cluster.yml --become --become-user=root -K
24. Should I Virtualize My Nodes?
[Diagram: how Kubernetes ‘purists’ view this – the container engine, container base OS image and application image layer(s) running directly on bare metal . . . but the virtualized alternative adds a hypervisor and guest OS underneath that stack]
25. VMs are Great For Automation
1. VM template
2. Use Terraform to create node hosts
3. Node hosts created
4. Kubespray
5. Goal: Kubernetes cluster
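A sketch of gluing these steps together; the directory layout and inventory name are assumptions:

# 1–3: clone the VM template and create the node hosts with Terraform
cd terraform/k8s-nodes && terraform init && terraform apply -auto-approve

# 4–5: point Kubespray at the freshly created hosts to build the cluster
cd ../../kubespray
ansible-playbook -i inventory/<cluster-name>/inventory.ini cluster.yml --become --become-user=root -K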
26. SQL Server OLAP 101: How Much IO Can A CPU Core Consume?
SELECT COUNT(*)
FROM SomeBigTable
OPTION (MAXDOP 1)
27. Storage Pool: Gauging IO Throughput
Populate a simple data frame, and then run a count() on it.
28. Leveraging Pod / Node Affinity
Tie specific parts of the Big Data Cluster to specific K8s worker nodes.
[Diagram: individual Big Data Cluster components pinned to specific worker nodes, e.g. worker node 1, worker node 2, and worker nodes 4, 5 and 6]
Pro Tip: at the bare minimum, dedicate worker nodes to the storage pool.
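A sketch of the node labelling side of this; the node names and the label key/value are illustrative, and the Big Data Cluster deployment configuration then references the labels for each pool:

# Label the worker nodes you want to dedicate, e.g. to the storage pool
kubectl label node k8s-worker-4 mssql-resource=storage-pool
kubectl label node k8s-worker-5 mssql-resource=storage-pool
kubectl label node k8s-worker-6 mssql-resource=storage-pool

# Confirm the labels before referencing them in the deployment configuration
kubectl get nodes --show-labels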
31. A Word On . . . Ingress Control
The default service type for external access on bare metal is NodePort.
It can be changed to LoadBalancer . . . but on bare metal this requires something in the cluster to provide the load balancer.
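A sketch of the difference; this service is illustrative rather than one of the Big Data Cluster’s own endpoint services, and type: LoadBalancer only gets an external address on bare metal if something in the cluster answers LoadBalancer requests:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: example-external-svc         # illustrative only, not a Big Data Cluster endpoint
spec:
  type: LoadBalancer                 # NodePort is the bare-metal default for external access
  selector:
    app: example
  ports:
  - port: 31433
    targetPort: 1433
EOF

# EXTERNAL-IP stays <pending> until something in the cluster provides it
kubectl get svc example-external-svc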
32. #3 Resilience
Pod mobility and node maintenance
Rolling back a version upgrade
Backup and recovery
33. Pod Mobility / Node Maintenance
Draining a node of its current pod workload:
kubectl drain <node-name>
Re-enabling a node to accept pods:
kubectl uncordon <node-name>
Pro Tip: test draining worker nodes; this is important both for worker node maintenance and for testing the resilience of your Kubernetes cluster.
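In practice a drain usually needs a couple of extra flags; the node name is a placeholder and flag names can differ between kubectl versions:

# Evict everything except daemonset pods; also evict pods that use local/emptyDir storage
kubectl drain <node-name> --ignore-daemonsets --delete-local-data

# Watch the evicted pods reappear on the remaining worker nodes
kubectl get pods -o wide --all-namespaces

# Bring the node back into service once maintenance is complete
kubectl uncordon <node-name>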
34. Rolling Back Upgrades
Big Data Cluster images are cached on all worker nodes
Images will build up on each node until manual intervention (docker rmi) is taken
Pro Tip: keep the images for the previous CU, so that you have somewhere to roll back to without having to re-download images.
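A sketch of the image housekeeping on a worker node; the grep filter and image/tag names are assumptions for whichever registry and CU you are running:

# On each worker node: list the cached Big Data Cluster images
docker images | grep mssql

# Remove only tags older than the previous CU, keeping the current and previous CU for rollback
docker rmi <repository>/<image>:<old-cu-tag>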
35. Backup And Recovery
etcd
Data pool
Master pool
Controller, compute and app pools
Storage pool HDFS - file
Storage pool HDFS – parquet
Certificates and images
Object – public cloud: S3 or Azure Data Lake Gen 2
Object – on-premises: S3
What we need to worry about
36. Backup And Recovery
etcd
Data pool
Master pool
Controller, compute and app pools
Storage pool HDFS - file
Storage pool HDFS – parquet
Certificates and images
Object – public cloud: S3 or Azure Data Lake Gen 2
Object – on-premises: S3
SQL Server
BACKUP and
RESTORE
etcdctl
azdata bdc hdfs cp
Tools native to platform
velero
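A sketch of what each tool looks like in practice; paths, endpoints, namespaces and some flag names are assumptions, so verify them against each tool’s documentation:

# etcd: take a snapshot of the cluster state
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db

# Master / data pools: ordinary SQL Server BACKUP DATABASE, e.g. via sqlcmd
sqlcmd -S <master-sql-endpoint>,31433 -U admin -Q "BACKUP DATABASE [MyDb] TO DISK = '/var/opt/mssql/backup/MyDb.bak'"

# Storage pool HDFS: copy files out with azdata (flag names are an assumption)
azdata bdc hdfs cp --from-path hdfs:/somedir --to-path ./hdfs-backup

# Persistent volumes and cluster objects: velero backup of the Big Data Cluster namespace
velero backup create bdc-backup --include-namespaces mssql-cluster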
37. Two Parts To Every Storage Platform
Storage element – how well the platform handles:
Metadata
RAID / erasure coding
Snapshots and replication
Consistency of performance
API – how well it integrates with Kubernetes
Pro Tip:
Velero and 3rd party backup tools usually tap a storage platform’s snapshot capabilities
=> the quality of this will influence the quality of your backup / restore experience