2. Who Am I?
SQL Server Solution Architect at Pure Storage
SQL Server user for 20 years
Was heavily involved in the SQL Server 2019 EAP
Co-author of the Microsoft workshop:
Big Data Cluster: From Bare Metal to Kubernetes
3. So That We Are All On The Same Page
What are containers?
What is Kubernetes?
What are master and worker nodes?
What are pods?
4. The Big Picture
Microsoft’s vision is that if you cannot go to Azure, Microsoft will bring Azure to you via a cloud native platform: Kubernetes
5. #1 Planning Your Kubernetes Cluster
How many nodes should I have?
What do I require in terms of memory and CPU?
What about storage?
What about cluster maintenance?
6. To Start Off With . . .
Master nodes: 2GB memory, 2 logical processors
Worker nodes: 64GB memory, 8 logical processors
7. A Minimum Viable Production Kubernetes Cluster
[Diagram: master nodes running the API server, scheduler, controller manager and etcd; worker nodes each running a kubelet, pods and containers]
2 x master nodes
3 x worker nodes
3 x etcd instances
8. Node Maintenance and Pod Mobility
[Diagram: a Deployment with app: nginx and replicas: 3; as worker nodes are taken out of service at T = 0, T = 1 and T = 2, its pods are rescheduled onto the remaining worker nodes]
Pro tip: factor in extra memory and CPU resources for node maintenance and pod mobility
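A minimal sketch of the Deployment behind this picture, assuming nginx as the example workload from the diagram; the node name, image tag and resource requests are placeholders:

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.17
        resources:
          requests:            # the extra capacity the pro tip is about
            cpu: 500m
            memory: 256Mi
EOF

# Drain a node and watch the Deployment reschedule the evicted pod elsewhere
kubectl drain <node-name> --ignore-daemonsets
kubectl get pods -o wide -l app=nginx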
9. Big Data Cluster Architecture
Pro tip: plan for worker nodes to be dedicated to the storage pool
10. The Kubernetes Storage “Layer Cake”
Volume
Persistent volume claim
Persistent volume
Underlying storage
A Big Takeaway Point: state needs to be able to follow pods around the cluster
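To make the layer cake concrete, a minimal sketch; the claim name, size, storage class, password and the SQL Server image used here are illustrative assumptions. The pod mounts a volume, the volume references a persistent volume claim, the claim binds to a persistent volume carved out of the underlying storage:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-data                       # the claim the pod references
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: example-block        # placeholder storage class
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mssql
spec:
  containers:
  - name: mssql
    image: mcr.microsoft.com/mssql/server:2019-latest
    env:
    - name: ACCEPT_EULA
      value: "Y"
    - name: MSSQL_SA_PASSWORD
      value: "<strong-password>"
    volumeMounts:
    - name: data                         # volume: the top layer of the cake
      mountPath: /var/opt/mssql
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: mssql-data              # claim -> persistent volume -> underlying storage
EOF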
11. Storage Provisioning
Manual provisioning: volumes/LUNs have to be created manually
Automatic provisioning: a plugin creates volumes/LUNs automatically
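A sketch of what automatic provisioning looks like; the storage class name is arbitrary and the provisioner string is a placeholder for whatever CSI/flex plugin your storage vendor ships:

cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block
provisioner: example.com/block           # placeholder: supplied by your storage plugin
reclaimPolicy: Delete
allowVolumeExpansion: true
EOF

# Any PVC that names storageClassName: fast-block now gets its volume/LUN created automatically
kubectl get sc,pvc,pv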
12. Where Does State Live?
etcd
Data pool
Master pool
Controller, compute and app pools
Storage pool HDFS - file
Storage pool HDFS – parquet
Certificates and images
Object – public cloud: S3 or Azure Data Lake Gen 2
Object – on-premises: S3
Persistent volumes
13. Stateful Pod Mobility
Solution #1: reschedule pod(s) to where state is replicated to
[Diagram: with replicated state, a pod running on a worker node at T = 0 is rescheduled at T = 1 and T = 2 onto the worker nodes that hold replicas of its state]
14. Stateful Pod Mobility
Solution #2: shared state that every node in the cluster can see
[Diagram: with shared state, the pod can be rescheduled from one worker node to any other at T = 0, T = 1 and T = 2, because every worker node can reach the same underlying storage]
15. Storage Pool - HDFS
The default HDFS replication factor is 3
However much data you need to store in the storage pool => multiply this by 3 !!!
This triples the write element of your storage pool IO workload
. . . but !!!
16. Storage Pool HDFS Replication Factor
Pro tip: if your storage platform supports RAID and / or erasure coding => the HDFS replication factor can be set to 1
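One way this might be set at deployment time is via azdata custom configuration files; the JSON path for the HDFS replication setting below is an assumption, so verify it against the Big Data Cluster configuration reference for your CU:

# Start from one of the built-in deployment profiles
azdata bdc config init --source kubeadm-prod --target custom-bdc

# Assumed JSON path for dfs.replication – verify against the configuration reference
azdata bdc config replace -c custom-bdc/bdc.json \
  -j "$.spec.resources.storage-0.spec.settings.hdfs.hdfs-site.dfs.replication=1"

# Deploy with the customised profile
azdata bdc create --config-profile custom-bdc --accept-eula yes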
19. Deployment and Management Tools
kubectl
azdata
Helm
Kubespray (more on this later)
Pro tip: run all of your tools from a separate server that does not host any cluster nodes
20. #2 Deployments
Deploying Kubernetes clusters
Deploying SQL Server 2019 Big Data Clusters
Idempotent infrastructure
21. Kubeadm – Standard Cluster Creation Tool
On each machine:
Add the current machine to /etc/hosts.
Disable swapping on all devices.
Import the keys and register the repository for Kubernetes.
Configure docker and Kubernetes prerequisites on the machine.
Set net.bridge.bridge-nf-call-iptables=1. On Ubuntu 18.04, first enable br_netfilter.
Run kubeadm init on the first master.
Run kubeadm join on each worker node.
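A minimal sketch of the core kubeadm commands; the pod network CIDR, master endpoint, token and hash are placeholders:

# On the first master node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Copy the admin kubeconfig so kubectl works, then install a pod network add-on
mkdir -p $HOME/.kube && sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config

# kubeadm init prints a join command similar to this; run it on each worker node
sudo kubeadm join <master-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>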
Is there a better way?
22. Kubespray – Managing Your Cluster Life Cycle
Clone the kubespray GitHub repo
Install python, ansible and the prerequisites from requirements.txt
Ensure that each node host is contactable via ssh
Copy the inventory directory
Edit the inventory.ini file (a sample inventory is sketched below)
Run the cluster.yml ansible playbook
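A sketch of an inventory.ini for the minimum viable cluster from earlier (2 x master nodes, 3 x worker nodes, 3 x etcd instances); hostnames, IPs and the inventory path are placeholders, and group names vary between Kubespray releases:

cat > inventory/<cluster-name>/inventory.ini <<'EOF'
[all]
k8s-master-1 ansible_host=10.0.0.11
k8s-master-2 ansible_host=10.0.0.12
k8s-worker-1 ansible_host=10.0.0.21
k8s-worker-2 ansible_host=10.0.0.22
k8s-worker-3 ansible_host=10.0.0.23

[kube-master]
k8s-master-1
k8s-master-2

[etcd]
k8s-master-1
k8s-master-2
k8s-worker-1

[kube-node]
k8s-worker-1
k8s-worker-2
k8s-worker-3

[k8s-cluster:children]
kube-master
kube-node
EOF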
23. Managing The Entire Kubernetes Cluster Life Cycle
1. Create a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini cluster.yml --become --become-user=root -K
2. Tear down a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini reset.yml --become --become-user=root -K
3. Add node(s) to a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini scale.yml --become --become-user=root -K
4. Rebuild a cluster’s master nodes
ansible-playbook -i inventory/<cluster-name>/inventory.ini recover-control-plane.yml --become --become-user=root -K
5. Upgrade a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini upgrade-cluster.yml --become --become-user=root -K
24. Should I Virtualize My Nodes?
[Diagram: how Kubernetes ‘purists’ view this – the container engine, container base OS image and application image layer(s) running directly on bare metal . . . but the virtualized alternative adds a hypervisor and guest OS underneath that stack]
25. VMs are Great For Automation
1. VM template
2. Use Terraform to create node hosts
3. Node hosts created
4. Kubespray
5. Goal: Kubernetes cluster
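A sketch of gluing these steps together; the directory layout and inventory name are assumptions:

# 1–3: clone the VM template and create the node hosts with Terraform
cd terraform/k8s-nodes && terraform init && terraform apply -auto-approve

# 4–5: point Kubespray at the freshly created hosts to build the cluster
cd ../../kubespray
ansible-playbook -i inventory/<cluster-name>/inventory.ini cluster.yml --become --become-user=root -K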
26. SQL Server OLAP 101: How Much IO Can A CPU Core Consume?
SELECT COUNT(*)
FROM SomeBigTable
OPTION (MAXDOP 1)
27. Storage Pool: Gauging IO Throughput
Populate a simple data frame, and then run a count() on it.
28. Leveraging Pod / Node Affinity
Tie specific parts of the Big Data Cluster to specific K8s worker nodes.
[Diagram: individual Big Data Cluster components pinned to specific worker nodes, e.g. worker node 1, worker node 2, and worker nodes 4, 5 and 6]
Pro Tip: at the bare minimum, dedicate worker nodes to the storage pool.
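A sketch of the node labelling side of this; the node names and the label key/value are illustrative, and the Big Data Cluster deployment configuration then references the labels for each pool:

# Label the worker nodes you want to dedicate, e.g. to the storage pool
kubectl label node k8s-worker-4 mssql-resource=storage-pool
kubectl label node k8s-worker-5 mssql-resource=storage-pool
kubectl label node k8s-worker-6 mssql-resource=storage-pool

# Confirm the labels before referencing them in the deployment configuration
kubectl get nodes --show-labels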
31. A Word On . . . Ingress Control
The default service type for external access on bare metal is NodePort.
It can be changed to LoadBalancer . . . but on bare metal this requires something in the cluster to provide the load balancer.
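A sketch of the difference; this service is illustrative rather than one of the Big Data Cluster’s own endpoint services, and type: LoadBalancer only gets an external address on bare metal if something in the cluster answers LoadBalancer requests:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: example-external-svc         # illustrative only, not a Big Data Cluster endpoint
spec:
  type: LoadBalancer                 # NodePort is the bare-metal default for external access
  selector:
    app: example
  ports:
  - port: 31433
    targetPort: 1433
EOF

# EXTERNAL-IP stays <pending> until something in the cluster provides it
kubectl get svc example-external-svc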
32. #3 Resilience
Pod mobility and node maintenance
Rolling back a version upgrade
Backup and recovery
33. Pod Mobility / Node Maintenance
Draining a node of its current pod workload:
kubectl drain <node-name>
Re-enabling a node to accept pods:
kubectl uncordon <node-name>
Pro Tip: test draining worker nodes; this is important both for worker node maintenance and for testing the resilience of your Kubernetes cluster.
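In practice a drain usually needs a couple of extra flags; the node name is a placeholder and flag names can differ between kubectl versions:

# Evict everything except daemonset pods; also evict pods that use local/emptyDir storage
kubectl drain <node-name> --ignore-daemonsets --delete-local-data

# Watch the evicted pods reappear on the remaining worker nodes
kubectl get pods -o wide --all-namespaces

# Bring the node back into service once maintenance is complete
kubectl uncordon <node-name>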
34. Rolling Back Upgrades
Big Data Cluster images are cached on all worker nodes
Images will build up on each node until manual intervention (docker rmi) is taken
Pro Tip: keep the images for the previous CU, so that you have somewhere to roll back to without having to re-download images.
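A sketch of the image housekeeping on a worker node; the grep filter and image/tag names are assumptions for whichever registry and CU you are running:

# On each worker node: list the cached Big Data Cluster images
docker images | grep mssql

# Remove only tags older than the previous CU, keeping the current and previous CU for rollback
docker rmi <repository>/<image>:<old-cu-tag>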
35. Backup And Recovery
etcd
Data pool
Master pool
Controller, compute and app pools
Storage pool HDFS - file
Storage pool HDFS – parquet
Certificates and images
Object – public cloud: S3 or Azure Data Lake Gen 2
Object – on-premises: S3
What we need to worry about
36. Backup And Recovery
etcd
Data pool
Master pool
Controller, compute and app pools
Storage pool HDFS - file
Storage pool HDFS – parquet
Certificates and images
Object – public cloud: S3 or Azure Data Lake Gen 2
Object – on-premises: S3
SQL Server
BACKUP and
RESTORE
etcdctl
azdata bdc hdfs cp
Tools native to platform
velero
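A sketch of what each tool looks like in practice; paths, endpoints, namespaces and some flag names are assumptions, so verify them against each tool’s documentation:

# etcd: take a snapshot of the cluster state
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db

# Master / data pools: ordinary SQL Server BACKUP DATABASE, e.g. via sqlcmd
sqlcmd -S <master-sql-endpoint>,31433 -U admin -Q "BACKUP DATABASE [MyDb] TO DISK = '/var/opt/mssql/backup/MyDb.bak'"

# Storage pool HDFS: copy files out with azdata (flag names are an assumption)
azdata bdc hdfs cp --from-path hdfs:/somedir --to-path ./hdfs-backup

# Persistent volumes and cluster objects: velero backup of the Big Data Cluster namespace
velero backup create bdc-backup --include-namespaces mssql-cluster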
37. Two Parts To Every Storage Platform
Storage element – how well the platform handles:
Metadata
RAID / erasure coding
Snapshots and replication
Consistency of performance
API – how well it integrates with Kubernetes
Pro Tip:
Velero and 3rd party backup tools usually tap a storage platform’s snapshot capabilities
=> the quality of this will influence the quality of your backup / restore experience