Deploying Production Grade SQL Server 2019 Big Data Clusters
Chris Adkin
Who Am I ?
 SQL Server Solution Architect at Pure Storage
 SQL Server user for 20 years
 Was heavily involved in the SQL Server 2019 EAP
 Co-author of the Microsoft workshop:
Big Data Cluster: From Bare Metal to Kubernetes
So That We Are All On The Same Page
 What are containers ?
 What is Kubernetes ?
 What are master and worker nodes ?
 What are pods ?
The Big Picture
Microsoft’s vision is that if you cannot go to Azure, Microsoft will bring Azure to you via a cloud native platform: Kubernetes
#1 Planning Your Kubernetes Cluster
 How many nodes should I have ?
 What do I require in terms of memory and CPU ?
 What about storage ?
 What about cluster maintenance ?
To Start Off With . . .
 Master nodes
2GB memory, 2 logical processors
 Worker nodes
64GB memory, 8 logical processors
A Minimum Viable Production Kubernetes Cluster
[Diagram: a master node running the API server, scheduler, controller and etcd; worker nodes each running a kubelet and pods made up of containers]
 2 x master nodes
 3 x worker nodes
 3 x etcd instances
Node Maintenance and Pod Mobility
[Diagram: a deployment (app: nginx, replicas: 3) shown at T = 0, T = 1 and T = 2, with its pods moving between worker nodes as a node is taken out for maintenance]
Pro tip: factor in extra memory and CPU resources for node maintenance and pod mobility
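The pro tip can be turned into a quick sanity check. This is a minimal sketch, assuming identically sized worker nodes and ignoring memory reserved for the OS and kubelet; the function names and the 150 GB workload are hypothetical, and the 64 GB figure reuses the starting-point sizing above.

```shell
# Memory (GB) left for pods while some worker nodes are drained.
# Assumes every worker node has the same amount of memory.
# usage: capacity_during_maintenance <workers> <gb_per_worker> <drained_nodes>
capacity_during_maintenance() {
    workers=$1; gb_per_worker=$2; drained=$3
    echo $(( (workers - drained) * gb_per_worker ))
}

# Succeeds (exit 0) when the pod workload still fits with nodes drained.
# usage: fits_during_maintenance <workload_gb> <workers> <gb_per_worker> <drained_nodes>
fits_during_maintenance() {
    [ "$1" -le "$(capacity_during_maintenance "$2" "$3" "$4")" ]
}

# Three 64 GB workers with one drained leave 128 GB for pods
capacity_during_maintenance 3 64 1

# A hypothetical 150 GB workload fits on three nodes, but not during maintenance
fits_during_maintenance 150 3 64 0 && echo "fits with all nodes up"
fits_during_maintenance 150 3 64 1 || echo "does not fit with one node drained"
```

The same check applies to logical processors: swap the memory figure for the core count per node.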
Big Data Cluster Architecture
Pro tip: plan for worker nodes to be dedicated to the storage pool
The Kubernetes Storage “Layer Cake”
 Volume
 Persistent volume claim
 Persistent volume
 Underlying storage
 A Big Takeaway Point: state needs to be able to follow pods around the cluster
Storage Provisioning
Manual provisioning: volumes/LUNs have to be created manually
Automatic provisioning: plugin creates volumes/LUNs automatically
Where Does State Live ?
 etcd
 Data pool
 Master pool
 Controller, compute and app pools
 Storage pool HDFS - file
 Storage pool HDFS – parquet
 Certificates and images
 Object – public cloud: S3 or Azure Data Lake Gen 2
 Object – on-premises: S3
Persistent volumes
Stateful Pod Mobility
Solution #1: reschedule pod(s) to where state is replicated to
[Diagram: with replicated state, a pod moves between worker nodes across T = 0, T = 1 and T = 2]
Stateful Pod Mobility
Solution #2: shared state that every node in the cluster can see
[Diagram: with shared state, a pod moves between worker nodes across T = 0, T = 1 and T = 2]
Storage Pool - HDFS
 Default HDFS replication factor is 3
 However much data you need to store in the storage pool => multiply this by 3 !!!
 Triples the write element of your storage pool IO workload
 . . . but !!!
Storage Pool HDFS Replication Factor
Pro tip: if your storage platform supports RAID and / or erasure coding,
=> HDFS replication factor can be set to 1
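The effect of the replication factor on capacity planning is simple arithmetic. A minimal sketch, with a hypothetical function name and a hypothetical 10 TB data set; any RAID or erasure coding overhead inside the storage platform is separate and not modelled here.

```shell
# Raw capacity (TB) the storage pool needs for a given amount of user data.
# usage: hdfs_raw_tb <data_tb> <replication_factor>
hdfs_raw_tb() {
    echo $(( $1 * $2 ))
}

hdfs_raw_tb 10 3   # default replication factor: 30 TB raw for 10 TB of data
hdfs_raw_tb 10 1   # replication factor 1 on a protected platform: 10 TB raw
```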
Storage Plugins
In-tree
FlexVolume
Container Storage Interface
Pro tip: prefer storage plugins that adhere to the CSI standard,
refer to https://kubernetes-csi.github.io/docs/drivers.html
Essential Viewing
https://www.youtube.com/watch?v=169w6QlWhmo&t=1s
Deployment and Management Tools
 kubectl
 azdata
 Helm
 Kubespray (more on this later)
Pro tip: use a separate server that does not host any cluster nodes
to run all of your tools from
#2 Deployments
 Deploying Kubernetes clusters
 Deploying SQL Server 2019 Big Data Clusters
 Idempotent infrastructure
Kubeadm – Standard Cluster Creation Tool
 On each machine:
 Add current machine to /etc/hosts.
 Disable swapping on all devices.
 Import the keys and register the repository for Kubernetes.
 Configure Docker and Kubernetes prerequisites on the machine.
 Set net.bridge.bridge-nf-call-iptables=1. On Ubuntu 18.04, br_netfilter has to be enabled first.
 Run kubeadm init on the first master
 Run kubeadm join on each worker node
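Part of the kubeadm checklist can be sketched as a script. This is only an illustration under stated assumptions: the helper names are hypothetical, the pod network CIDR is an example value that depends on the network plugin chosen, and the script only prints the commands unless DRY_RUN=0 is set. Note that swapoff -a does not survive a reboot; swap entries in /etc/fstab need removing as well.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them for real (requires root on the node).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Per-node preparation, following the checklist above.
prepare_node() {
    run swapoff -a                                      # disable swapping on all devices
    run modprobe br_netfilter                           # enable br_netfilter first
    run sysctl -w net.bridge.bridge-nf-call-iptables=1  # let iptables see bridged traffic
}

# First master only; the CIDR below is a hypothetical example value.
init_first_master() {
    run kubeadm init --pod-network-cidr=10.244.0.0/16
}

prepare_node
init_first_master
# kubeadm init prints the exact "kubeadm join ..." command, including the
# token and CA certificate hash, to run on each worker node.
```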
Is there a better way ?
Kubespray – Managing Your Cluster Life Cycle
 Clone kubespray GitHub repo
 Install Python, Ansible and the prerequisites from requirements.txt
 Ensure that each node host is contactable via ssh
 Copy inventory directory
 Edit inventory.ini file
 Run cluster.yml ansible playbook
Managing The Entire Kubernetes Cluster Life Cycle
1. Create a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini cluster.yml --become --become-user=root -K
2. Tear down a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini reset.yml --become --become-user=root -K
3. Add node(s) to a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini scale.yml --become --become-user=root -K
4. Rebuild a cluster’s master nodes
ansible-playbook -i inventory/<cluster-name>/inventory.ini recover-control-plane.yml --become --become-user=root -K
5. Upgrade a cluster
ansible-playbook -i inventory/<cluster-name>/inventory.ini upgrade-cluster.yml --become --become-user=root -K
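Because the five commands differ only in the playbook name, they can be wrapped in one small helper. A minimal sketch: the function and the action names (create, reset, scale, recover, upgrade) are hypothetical, while the playbook names and flags are the ones listed above. The helper prints the command rather than running it.

```shell
# Map a life cycle action to its Kubespray playbook and print the
# resulting ansible-playbook command.
# usage: kubespray_cmd <cluster-name> <create|reset|scale|recover|upgrade>
kubespray_cmd() {
    case "$2" in
        create)  playbook=cluster.yml ;;
        reset)   playbook=reset.yml ;;
        scale)   playbook=scale.yml ;;
        recover) playbook=recover-control-plane.yml ;;
        upgrade) playbook=upgrade-cluster.yml ;;
        *) echo "unknown action: $2" >&2; return 1 ;;
    esac
    echo "ansible-playbook -i inventory/$1/inventory.ini $playbook --become --become-user=root -K"
}

kubespray_cmd mycluster upgrade
```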
Should I Virtualize My Nodes ?
[Diagram: container stack layers: bare metal, hypervisor, guest OS, container engine, container base OS image, application image layer(s)]
How Kubernetes ‘purists’ view this . . . but
VMs are Great For Automation
1. VM template
2. Use Terraform to create node hosts
3. Node hosts created
4. Kubespray
5. Goal: Kubernetes cluster
SQL Server OLAP 101: How Much IO Can A CPU Core Consume ?
SELECT COUNT(*)
FROM SomeBigTable
OPTION (MAXDOP 1)
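One way to turn the single-threaded scan into a throughput figure is to divide the pages read by the elapsed time. A minimal sketch, assuming the page count comes from something like SET STATISTICS IO (physical plus read-ahead reads) with a cold buffer cache; the function name and the sample numbers are hypothetical.

```shell
# Scan throughput in MB/s from a page count and an elapsed time.
# SQL Server pages are 8 KB, so 128 pages make up 1 MB.
# usage: scan_mb_per_sec <pages_read> <elapsed_ms>
scan_mb_per_sec() {
    echo $(( $1 * 1000 / $2 / 128 ))
}

# Hypothetical run: 1,310,720 pages (10 GB) read in 20 seconds
scan_mb_per_sec 1310720 20000   # prints 512
```

Multiplying the per-core figure by the number of cores available to a pool gives a rough upper bound for the IO bandwidth the storage has to deliver.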
Storage Pool: Gauging IO Throughput
Populate a simple data frame, and then run a count() on it.
Leveraging Pod / Node Affinity
[Diagram: Big Data Cluster components mapped to worker node 1, worker node 2 and worker nodes 4, 5 and 6]
Tie specific parts of the Big Data Cluster to specific K8s worker nodes.
Pro Tip: At the bare minimum, dedicate worker nodes to the storage pool.
Pod / Node Affinity – Worked Example
1. Label your worker nodes
kubectl label node z-ca-bdc-worker1 mssql-cluster=bdc mssql-resource=bdc-shared --overwrite=true
kubectl label node z-ca-bdc-worker2 mssql-cluster=bdc mssql-resource=bdc-storage-pool --overwrite=true
kubectl label node z-ca-bdc-worker3 mssql-cluster=bdc mssql-resource=bdc-storage-pool --overwrite=true
2. Create a configuration
azdata bdc config init --source kubeadm-dev-test --target ca-bdc-dev-test
3. Add default cluster and node labels to control.json
azdata bdc config add -c ca-bdc-dev-test/control.json -j "$.spec.clusterLabel=bdc"
azdata bdc config add -c ca-bdc-dev-test/control.json -j "$.spec.nodeLabel=bdc-shared"
4. Assign node level labels in bdc.json
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.master.spec.nodeLabel=bdc-shared"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.compute-0.spec.nodeLabel=bdc-shared"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.data-0.spec.nodeLabel=bdc-shared"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.storage-0.spec.nodeLabel=bdc-storage-pool"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.nmnode-0.spec.nodeLabel=bdc-shared"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.sparkhead.spec.nodeLabel=bdc-shared"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.zookeeper.spec.nodeLabel=bdc-shared"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.gateway.spec.nodeLabel=bdc-shared"
azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.appproxy.spec.nodeLabel=bdc-shared"
5. Deploy your Big Data Cluster
azdata bdc create --accept-eula yes --config-profile ca-bdc-dev-test
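After deployment it is worth checking that the storage pool pods really landed on the dedicated nodes. A minimal sketch: the function name is hypothetical, and it assumes storage pool pod names start with "storage-" (matching the storage-0 resource above) and that the cluster namespace is mssql-cluster.

```shell
# Reads "<pod> <node>" lines on stdin and prints storage pool pods that are
# running on a node outside the given list of dedicated storage nodes.
# usage: misplaced_storage_pods <node> [<node> ...]
misplaced_storage_pods() {
    allowed=" $* "
    while read -r pod node; do
        case "$pod" in
            storage-*)
                case "$allowed" in
                    *" $node "*) ;;               # on a dedicated storage node
                    *) echo "$pod on $node" ;;    # placed somewhere else
                esac
                ;;
        esac
    done
}

# Against a live cluster (the namespace name is an assumption):
#   kubectl get pods -n mssql-cluster --no-headers \
#     -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName |
#     misplaced_storage_pods z-ca-bdc-worker2 z-ca-bdc-worker3
```

No output means every storage pool pod sits on one of the listed nodes. Running kubectl get nodes -L mssql-cluster,mssql-resource shows the labels applied in step 1.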
DEMO
A Word On . . . Ingress Control
 The default on bare metal for external access is NodePort.
 This can be changed to LoadBalancer . . . but that requires something in the cluster to provide the load balancing.
#3 Resilience
 Pod mobility and node maintenance
 Rolling back a version upgrade
 Backup and recovery
Pod Mobility / Node Maintenance
 Draining a node of its current pod workload:
kubectl drain <node-name>
 Re-enabling a node to accept pods:
kubectl uncordon <node-name>
Pro Tip: test draining worker nodes,
important both for worker node maintenance
and testing the resilience of your Kubernetes cluster.
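A drain test can be written down as a repeatable sequence. A minimal sketch with a hypothetical function that only prints the commands. The --ignore-daemonsets flag is added because a drain otherwise stops at DaemonSet-managed pods; pods that use emptyDir volumes need a further flag whose name depends on the kubectl version.

```shell
# Print the command sequence for taking one worker node out for maintenance.
# usage: maintenance_plan <node-name>
maintenance_plan() {
    # Cordon the node and evict its pods
    echo "kubectl drain $1 --ignore-daemonsets"
    # Check what is still running on the node before working on it
    echo "kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=$1"
    echo "# perform maintenance on $1"
    # Let the scheduler place pods on the node again
    echo "kubectl uncordon $1"
}

maintenance_plan z-ca-bdc-worker2
```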
Rolling Back Upgrades
 Big Data Cluster images are cached on all worker nodes
 Images will build up on each node until they are removed manually (docker rmi)
Pro Tip: keep the images for the previous CU,
so that you have somewhere to roll back to without having to download images.
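Keeping the current and previous CU while clearing out older images can be scripted as a filter over the image list. A minimal sketch: the function name, the image name and the CU tags are hypothetical, and the filter only prints removal candidates, leaving the actual docker rmi to the operator.

```shell
# Reads "repository:tag" lines on stdin and prints the images whose tag is
# not in the keep list, i.e. candidates for docker rmi.
# usage: images_to_remove <keep-tag> [<keep-tag> ...]
images_to_remove() {
    keep=" $* "
    while read -r image; do
        tag=${image##*:}            # text after the last colon
        case "$keep" in
            *" $tag "*) ;;          # current or previous CU: keep
            *) echo "$image" ;;     # older CU: removal candidate
        esac
    done
}

# On a worker node (tags below are hypothetical):
#   docker images --format '{{.Repository}}:{{.Tag}}' | grep mssql/bdc |
#     images_to_remove 2019-CU5-ubuntu-16.04 2019-CU4-ubuntu-16.04
```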
Backup And Recovery
 etcd
 Data pool
 Master pool
 Controller, compute and app pools
 Storage pool HDFS - file
 Storage pool HDFS – parquet
 Certificates and images
 Object – public cloud: S3 or Azure Data Lake Gen 2
 Object – on-premises: S3
What we need to worry about
Backup And Recovery
Tools for backing up the state listed above:
 SQL Server BACKUP and RESTORE
 etcdctl
 azdata bdc hdfs cp
 Tools native to the platform
 Velero
Two Parts To Every Storage Platform
 Storage element, how well the platform handles:
 Metadata
 RAID / erasure coding
 Snapshots and replication
 Consistency of performance
 API: how well it integrates with Kubernetes
Pro Tip:
Velero and 3rd party backup tools usually tap a storage platform's snapshot capabilities
=> the quality of this will influence the quality of your backup / restore experience
Any Questions . . .
Contact Me
@ChrisAdkin8
cadkin@purestorage.com

More Related Content

What's hot

Apache Spark on K8s and HDFS Security
Apache Spark on K8s and HDFS SecurityApache Spark on K8s and HDFS Security
Apache Spark on K8s and HDFS SecurityDatabricks
 
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEOClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEOAltinity Ltd
 
Beyond Ingresses - Better Traffic Management in Kubernetes
Beyond Ingresses - Better Traffic Management in KubernetesBeyond Ingresses - Better Traffic Management in Kubernetes
Beyond Ingresses - Better Traffic Management in KubernetesMark McBride
 
Steering the Sea Monster - Integrating Scylla with Kubernetes
Steering the Sea Monster - Integrating Scylla with KubernetesSteering the Sea Monster - Integrating Scylla with Kubernetes
Steering the Sea Monster - Integrating Scylla with KubernetesScyllaDB
 
Spark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesSpark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesYousun Jeong
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDatainside-BigData.com
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIAltinity Ltd
 
February 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerFebruary 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerYahoo Developer Network
 
Containerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudContainerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudSubbu Rama
 
Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!
Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!
Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!smalltown
 
Terraform Modules Restructured
Terraform Modules RestructuredTerraform Modules Restructured
Terraform Modules RestructuredDoiT International
 
Scylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per serverScylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per serverAvi Kivity
 
Learn kubernetes in 90 minutes
Learn kubernetes in 90 minutesLearn kubernetes in 90 minutes
Learn kubernetes in 90 minutesLarry Cai
 
Introducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationIntroducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationScyllaDB
 
Taking Your Database Global with Kubernetes
Taking Your Database Global with KubernetesTaking Your Database Global with Kubernetes
Taking Your Database Global with KubernetesChristopher Bradford
 
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...Radhika Puthiyetath
 
Kubernetes Hands-On Guide
Kubernetes Hands-On GuideKubernetes Hands-On Guide
Kubernetes Hands-On GuideStratoscale
 
How to integrate Kubernetes in OpenStack: You need to know these project
How to integrate Kubernetes in OpenStack: You need to know these projectHow to integrate Kubernetes in OpenStack: You need to know these project
How to integrate Kubernetes in OpenStack: You need to know these projectinwin stack
 
Data Processing solution for Health Domain.
Data Processing solution for Health Domain.Data Processing solution for Health Domain.
Data Processing solution for Health Domain.Suman Singh
 

What's hot (20)

Apache Spark on K8s and HDFS Security
Apache Spark on K8s and HDFS SecurityApache Spark on K8s and HDFS Security
Apache Spark on K8s and HDFS Security
 
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEOClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
 
Beyond Ingresses - Better Traffic Management in Kubernetes
Beyond Ingresses - Better Traffic Management in KubernetesBeyond Ingresses - Better Traffic Management in Kubernetes
Beyond Ingresses - Better Traffic Management in Kubernetes
 
Steering the Sea Monster - Integrating Scylla with Kubernetes
Steering the Sea Monster - Integrating Scylla with KubernetesSteering the Sea Monster - Integrating Scylla with Kubernetes
Steering the Sea Monster - Integrating Scylla with Kubernetes
 
Spark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on KubernetesSpark day 2017 - Spark on Kubernetes
Spark day 2017 - Spark on Kubernetes
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigData
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
 
February 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerFebruary 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with Docker
 
Containerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudContainerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the Cloud
 
Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!
Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!
Kubernetes Day 2017 - Build, Ship and Run Your APP, Production !!
 
Terraform Modules Restructured
Terraform Modules RestructuredTerraform Modules Restructured
Terraform Modules Restructured
 
Scylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per serverScylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per server
 
Learn kubernetes in 90 minutes
Learn kubernetes in 90 minutesLearn kubernetes in 90 minutes
Learn kubernetes in 90 minutes
 
Introducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationIntroducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task Automation
 
Taking Your Database Global with Kubernetes
Taking Your Database Global with KubernetesTaking Your Database Global with Kubernetes
Taking Your Database Global with Kubernetes
 
Running Cassandra in AWS
Running Cassandra in AWSRunning Cassandra in AWS
Running Cassandra in AWS
 
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
 
Kubernetes Hands-On Guide
Kubernetes Hands-On GuideKubernetes Hands-On Guide
Kubernetes Hands-On Guide
 
How to integrate Kubernetes in OpenStack: You need to know these project
How to integrate Kubernetes in OpenStack: You need to know these projectHow to integrate Kubernetes in OpenStack: You need to know these project
How to integrate Kubernetes in OpenStack: You need to know these project
 
Data Processing solution for Health Domain.
Data Processing solution for Health Domain.Data Processing solution for Health Domain.
Data Processing solution for Health Domain.
 

Similar to Data weekender deploying prod grade sql 2019 big data clusters

A brief study on Kubernetes and its components
A brief study on Kubernetes and its componentsA brief study on Kubernetes and its components
A brief study on Kubernetes and its componentsRamit Surana
 
Hands-On Introduction to Kubernetes at LISA17
Hands-On Introduction to Kubernetes at LISA17Hands-On Introduction to Kubernetes at LISA17
Hands-On Introduction to Kubernetes at LISA17Ryan Jarvinen
 
Kubernetes - Sailing a Sea of Containers
Kubernetes - Sailing a Sea of ContainersKubernetes - Sailing a Sea of Containers
Kubernetes - Sailing a Sea of ContainersKel Cecil
 
1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej SikaJuraj Hantak
 
Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck   Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck Daliya Spasova
 
Redis Meetup TLV - K8s Session 28/10/2018
Redis Meetup TLV - K8s Session 28/10/2018Redis Meetup TLV - K8s Session 28/10/2018
Redis Meetup TLV - K8s Session 28/10/2018Danni Moiseyev
 
Bitbucket Pipelines - Powered by Kubernetes
Bitbucket Pipelines - Powered by KubernetesBitbucket Pipelines - Powered by Kubernetes
Bitbucket Pipelines - Powered by KubernetesNathan Burrell
 
K8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingK8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingPiotr Perzyna
 
Scylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScyllaDB
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...Equnix Business Solutions
 
Operator Lifecycle Management
Operator Lifecycle ManagementOperator Lifecycle Management
Operator Lifecycle ManagementDoKC
 
Operator Lifecycle Management
Operator Lifecycle ManagementOperator Lifecycle Management
Operator Lifecycle ManagementDoKC
 
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupKubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupStefan Schimanski
 
Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)HungWei Chiu
 
Chotot k8s experiences.pptx
Chotot k8s experiences.pptxChotot k8s experiences.pptx
Chotot k8s experiences.pptxarptit
 
Containers and Kubernetes -Notes Leo
Containers and Kubernetes -Notes LeoContainers and Kubernetes -Notes Leo
Containers and Kubernetes -Notes LeoLéopold Gault
 
Managing Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native WayManaging Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native WayQiming Teng
 
Kubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesKubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesJian-Kai Wang
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 

Similar to Data weekender deploying prod grade sql 2019 big data clusters (20)

A brief study on Kubernetes and its components
A brief study on Kubernetes and its componentsA brief study on Kubernetes and its components
A brief study on Kubernetes and its components
 
Hands-On Introduction to Kubernetes at LISA17
Hands-On Introduction to Kubernetes at LISA17Hands-On Introduction to Kubernetes at LISA17
Hands-On Introduction to Kubernetes at LISA17
 
Kubernetes - Sailing a Sea of Containers
Kubernetes - Sailing a Sea of ContainersKubernetes - Sailing a Sea of Containers
Kubernetes - Sailing a Sea of Containers
 
1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika
 
Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck   Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck
 
Redis Meetup TLV - K8s Session 28/10/2018
Redis Meetup TLV - K8s Session 28/10/2018Redis Meetup TLV - K8s Session 28/10/2018
Redis Meetup TLV - K8s Session 28/10/2018
 
Bitbucket Pipelines - Powered by Kubernetes
Bitbucket Pipelines - Powered by KubernetesBitbucket Pipelines - Powered by Kubernetes
Bitbucket Pipelines - Powered by Kubernetes
 
K8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingK8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals Training
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
Scylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla Operator
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
 
Operator Lifecycle Management
Operator Lifecycle ManagementOperator Lifecycle Management
Operator Lifecycle Management
 
Operator Lifecycle Management
Operator Lifecycle ManagementOperator Lifecycle Management
Operator Lifecycle Management
 
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupKubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
 
Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)
 
Chotot k8s experiences.pptx
Chotot k8s experiences.pptxChotot k8s experiences.pptx
Chotot k8s experiences.pptx
 
Containers and Kubernetes -Notes Leo
Containers and Kubernetes -Notes LeoContainers and Kubernetes -Notes Leo
Containers and Kubernetes -Notes Leo
 
Managing Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native WayManaging Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native Way
 
Kubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesKubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and Services
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 

More from Chris Adkin

Ci with jenkins docker and mssql belgium
Ci with jenkins docker and mssql belgiumCi with jenkins docker and mssql belgium
Ci with jenkins docker and mssql belgiumChris Adkin
 
Continuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL ServerContinuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL ServerChris Adkin
 
Sql server scalability fundamentals
Sql server scalability fundamentalsSql server scalability fundamentals
Sql server scalability fundamentalsChris Adkin
 
Leveraging memory in sql server
Leveraging memory in sql serverLeveraging memory in sql server
Leveraging memory in sql serverChris Adkin
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton insertsChris Adkin
 
Scaling sql server 2014 parallel insert
Scaling sql server 2014 parallel insertScaling sql server 2014 parallel insert
Scaling sql server 2014 parallel insertChris Adkin
 
Sql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ramSql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ramChris Adkin
 
Sql sever engine batch mode and cpu architectures
Sql sever engine batch mode and cpu architecturesSql sever engine batch mode and cpu architectures
Sql sever engine batch mode and cpu architecturesChris Adkin
 
An introduction to column store indexes and batch mode
An introduction to column store indexes and batch modeAn introduction to column store indexes and batch mode
An introduction to column store indexes and batch modeChris Adkin
 
Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)Chris Adkin
 
Scaling out SSIS with Parallelism, Diving Deep Into The Dataflow Engine
Scaling out SSIS with Parallelism, Diving Deep Into The Dataflow EngineScaling out SSIS with Parallelism, Diving Deep Into The Dataflow Engine
Scaling out SSIS with Parallelism, Diving Deep Into The Dataflow EngineChris Adkin
 
Building scalable application with sql server
Building scalable application with sql serverBuilding scalable application with sql server
Building scalable application with sql serverChris Adkin
 
TSQL Coding Guidelines
TSQL Coding GuidelinesTSQL Coding Guidelines
TSQL Coding GuidelinesChris Adkin
 
J2EE Performance And Scalability Bp
J2EE Performance And Scalability BpJ2EE Performance And Scalability Bp
J2EE Performance And Scalability BpChris Adkin
 
J2EE Batch Processing
J2EE Batch ProcessingJ2EE Batch Processing
J2EE Batch ProcessingChris Adkin
 
Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql TuningChris Adkin
 

More from Chris Adkin (16)

Ci with jenkins docker and mssql belgium
Ci with jenkins docker and mssql belgiumCi with jenkins docker and mssql belgium
Ci with jenkins docker and mssql belgium
 
Continuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL ServerContinuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL Server
 
Sql server scalability fundamentals
Sql server scalability fundamentalsSql server scalability fundamentals
Sql server scalability fundamentals
 
Leveraging memory in sql server
Leveraging memory in sql serverLeveraging memory in sql server
Leveraging memory in sql server
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton inserts
 
Scaling sql server 2014 parallel insert
Scaling sql server 2014 parallel insertScaling sql server 2014 parallel insert
Scaling sql server 2014 parallel insert
 
Sql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ramSql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ram
 
Sql sever engine batch mode and cpu architectures
Sql sever engine batch mode and cpu architecturesSql sever engine batch mode and cpu architectures
Sql sever engine batch mode and cpu architectures
 
An introduction to column store indexes and batch mode
An introduction to column store indexes and batch modeAn introduction to column store indexes and batch mode
An introduction to column store indexes and batch mode
 
Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)
 
Scaling out SSIS with Parallelism, Diving Deep Into The Dataflow Engine
Scaling out SSIS with Parallelism, Diving Deep Into The Dataflow EngineScaling out SSIS with Parallelism, Diving Deep Into The Dataflow Engine
Scaling out SSIS with Parallelism, Diving Deep Into The Dataflow Engine
 
Building scalable application with sql server
Building scalable application with sql serverBuilding scalable application with sql server
Building scalable application with sql server
 
TSQL Coding Guidelines
TSQL Coding GuidelinesTSQL Coding Guidelines
TSQL Coding Guidelines
 
J2EE Performance And Scalability Bp
J2EE Performance And Scalability BpJ2EE Performance And Scalability Bp
J2EE Performance And Scalability Bp
 
J2EE Batch Processing
J2EE Batch ProcessingJ2EE Batch Processing
J2EE Batch Processing
 
Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql Tuning
 


Data weekender deploying prod grade sql 2019 big data clusters

  • 1. Deploying Production Grade SQL Server 2019 Big Data Clusters Chris Adkin
  • 2. Who Am I? SQL Server Solution Architect at Pure Storage; SQL Server user for 20 years; heavily involved in the SQL Server 2019 EAP; co-author of the Microsoft workshop "Big Data Cluster: From Bare Metal to Kubernetes".
  • 3. So That We Are All On The Same Page: What are containers? What is Kubernetes? What are master and worker nodes? What are pods?
  • 4. The Big Picture Microsoft’s vision is that if you cannot go to Azure, Microsoft will bring Azure to you via a cloud native platform: Kubernetes
  • 5. #1 Planning Your Kubernetes Cluster: How many nodes should I have? What do I require in terms of memory and CPU? What about storage? What about cluster maintenance?
  • 6. To Start Off With . . . Master nodes: 2GB memory, 2 logical processors. Worker nodes: 64GB memory, 8 logical processors.
  • 7. A Minimum Viable Production Kubernetes Cluster: 2 x master nodes; 3 x worker nodes; 3 x etcd instances. [Diagram: master node (API server, scheduler, controller), etcd, and worker nodes each running a kubelet and pods of containers.]
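Why three etcd instances? The slide does not spell out the reasoning, but etcd only accepts writes while a majority of its members is reachable. The sketch below works through that arithmetic; the function names are made up for illustration.

```shell
#!/bin/sh
# Quorum arithmetic for an etcd cluster of N members.
# etcd needs a majority of members, floor(N/2) + 1, to accept writes.
etcd_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can be lost while the cluster still has quorum.
etcd_failure_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for members in 1 2 3 5; do
    echo "members=$members quorum=$(etcd_quorum "$members") tolerated_failures=$(etcd_failure_tolerance "$members")"
done
```

With three members one instance can be lost; with two, none can, which is why an even member count adds no resilience.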
  • 8. Node Maintenance and Pod Mobility. [Diagram: an application (app: nginx, replicas: 3) with its pods on worker nodes at T = 0, T = 1 and T = 2.] Pro tip: factor in extra memory and CPU resources for node maintenance and pod mobility.
  • 9. Big Data Cluster Architecture Pro tip: plan for worker nodes to be dedicated to the storage pool
  • 10. The Kubernetes Storage “Layer Cake”: volume; persistent volume claim; persistent volume; underlying storage. A big takeaway point: state needs to be able to follow pods around the cluster.
  • 11. Storage Provisioning. Manual provisioning: volumes/LUNs have to be created manually. Automatic provisioning: a plugin creates volumes/LUNs automatically.
  • 12. Where Does State Live? etcd; data pool; master pool; controller, compute and app pools; storage pool HDFS - file; storage pool HDFS - parquet; certificates and images; object, public cloud: S3 or Azure Data Lake Gen 2; object, on-premises: S3. [Diagram label: persistent volumes.]
  • 13. Stateful Pod Mobility, with replicated state. Solution #1: reschedule pod(s) to where state is replicated to. [Diagram: pods on worker nodes at T = 0, T = 1 and T = 2.]
  • 14. Stateful Pod Mobility, with shared state. Solution #2: shared state that every node in the cluster can see. [Diagram: pods on worker nodes at T = 0, T = 1 and T = 2.]
  • 15. Storage Pool - HDFS: the default HDFS replication factor is 3; however much data you need to store in the storage pool, multiply this by 3 !!! This triples the write element of your storage pool IO workload . . . but !!!
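The capacity rule above is simple multiplication, sketched below. The 10 TB figure and the function name are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Raw storage pool capacity needed for a given amount of logical data.
# $1 = logical data in TB, $2 = HDFS replication factor
hdfs_raw_capacity_tb() {
    echo $(( $1 * $2 ))
}

# Hypothetical example: 10 TB of data in the storage pool.
echo "replication factor 3: $(hdfs_raw_capacity_tb 10 3) TB raw"
echo "replication factor 1: $(hdfs_raw_capacity_tb 10 1) TB raw"
```

The same multiplier applies to write IO: every block written is written once per replica.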
  • 16. Storage Pool HDFS Replication Factor. Pro tip: if your storage platform supports RAID and/or erasure coding, the HDFS replication factor can be set to 1.
  • 17. Storage Plugins: in-tree; FlexVolume; Container Storage Interface (CSI). Pro tip: prefer storage plugins that adhere to the CSI standard; refer to https://kubernetes-csi.github.io/docs/drivers.html
  • 19. Deployment and Management Tools: kubectl; azdata; Helm; Kubespray (more on this later). Pro tip: run all of your tools from a separate server that does not host any cluster nodes.
  • 20. #2 Deployments: deploying Kubernetes clusters; deploying SQL Server 2019 Big Data Clusters; idempotent infrastructure.
  • 21. Kubeadm: Standard Cluster Creation Tool. On each machine: add the current machine to /etc/hosts; disable swapping on all devices; import the keys and register the repository for Kubernetes; configure docker and Kubernetes prerequisites on the machine; set net.bridge.bridge-nf-call-iptables=1 (on Ubuntu 18.04, enable br_netfilter first). Then run kubeadm init on the first master and kubeadm join on each worker node. Is there a better way?
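As a rough sketch of the per-machine steps, the function below prints the commands rather than running them, so they can be reviewed first. The repository registration and docker setup are deliberately left out because they depend on the distribution and Kubernetes version; the exact commands are assumptions, not taken from the slides.

```shell
#!/bin/sh
# Print (do not run) the node preparation commands; run them as root after review.
node_prereq_commands() {
    cat <<'EOF'
echo "$(hostname -i) $(hostname)" >> /etc/hosts
sed -i '/ swap / s/^/#/' /etc/fstab
swapoff -a
modprobe br_netfilter
sysctl -w net.bridge.bridge-nf-call-iptables=1
EOF
}

# Print the bootstrap commands; <...> values are placeholders reported by kubeadm init.
cluster_bootstrap_commands() {
    cat <<'EOF'
kubeadm init
kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
EOF
}

node_prereq_commands
cluster_bootstrap_commands
```

The first kubeadm command runs on the first master only; the second runs on each worker node.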
  • 22. Kubespray: Managing Your Cluster Life Cycle. Clone the kubespray GitHub repo; install python, ansible and the pre-requisites from requirements.txt; ensure that each node host is contactable via ssh; copy the inventory directory; edit the inventory.ini file; run the cluster.yml ansible playbook.
  • 23. Managing The Entire Kubernetes Cluster Life Cycle
      1. Create a cluster: ansible-playbook -i inventory/<cluster-name>/inventory.ini cluster.yml --become --become-user=root -K
      2. Tear down a cluster: ansible-playbook -i inventory/<cluster-name>/inventory.ini reset.yml --become --become-user=root -K
      3. Add node(s) to a cluster: ansible-playbook -i inventory/<cluster-name>/inventory.ini scale.yml --become --become-user=root -K
      4. Rebuild a cluster’s master nodes: ansible-playbook -i inventory/<cluster-name>/inventory.ini recover-control-plane.yml --become --become-user=root -K
      5. Upgrade a cluster: ansible-playbook -i inventory/<cluster-name>/inventory.ini upgrade-cluster.yml --become --become-user=root -K
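The five commands differ only in the playbook name, so a small wrapper can reduce typing mistakes. This is a sketch: the function name and action keywords are hypothetical, and it only prints the command it would run.

```shell
#!/bin/sh
# Map a life-cycle action to the Kubespray playbook and print the command.
# $1 = cluster (inventory directory) name, $2 = action
kubespray_cmd() {
    case "$2" in
        create)  playbook=cluster.yml ;;
        reset)   playbook=reset.yml ;;
        scale)   playbook=scale.yml ;;
        recover) playbook=recover-control-plane.yml ;;
        upgrade) playbook=upgrade-cluster.yml ;;
        *) echo "unknown action: $2" >&2; return 1 ;;
    esac
    echo "ansible-playbook -i inventory/$1/inventory.ini $playbook --become --become-user=root -K"
}

# Hypothetical cluster name, for illustration only.
kubespray_cmd mycluster create
```

Run the printed command from the root of the cloned kubespray repository.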
  • 24. Should I Virtualize My Nodes? How Kubernetes ‘purists’ view this . . . but. [Diagram labels: Bare Metal, Container Engine, Hypervisor, Guest OS, Container Base OS Image, Application Image Layer(s).]
  • 25. VMs are Great For Automation: 1. VM template; 2. Use Terraform to create node hosts; 3. Node hosts created; 4. Kubespray; 5. Goal: Kubernetes cluster.
  • 26. SQL Server OLAP 101: How Much IO Can A CPU Core Consume ? SELECT COUNT(*) FROM SomeBigTable OPTION (MAXDOP 1)
  • 27. Storage Pool: Gauging IO Throughput Populate a simple data frame, and then run a count() on it.
  • 28. Leveraging Pod / Node Affinity: tie specific parts of the Big Data Cluster to specific K8s worker nodes. [Diagram labels: Worker node 1; Worker node 1; Worker nodes 4, 5 and 6; Worker node 2.] Pro tip: at the bare minimum, dedicate worker nodes to the storage pool.
  • 29. Pod / Node Affinity: Worked Example
      1. Label your worker nodes:
         kubectl label node z-ca-bdc-worker1 mssql-cluster=bdc mssql-resource=bdc-shared --overwrite=true
         kubectl label node z-ca-bdc-worker2 mssql-cluster=bdc mssql-resource=bdc-storage-pool --overwrite=true
         kubectl label node z-ca-bdc-worker3 mssql-cluster=bdc mssql-resource=bdc-storage-pool --overwrite=true
      2. Create a configuration:
         azdata bdc config init --source kubeadm-dev-test --target ca-bdc-dev-test
      3. Add default cluster and node labels to control.json:
         azdata bdc config add -c ca-bdc-dev-test/control.json -j "$.spec.clusterLabel=bdc"
         azdata bdc config add -c ca-bdc-dev-test/control.json -j "$.spec.nodeLabel=bdc-shared"
      4. Assign node level labels in bdc.json:
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.master.spec.nodeLabel=bdc-shared"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.compute-0.spec.nodeLabel=bdc-shared"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.data-0.spec.nodeLabel=bdc-shared"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.storage-0.spec.nodeLabel=bdc-storage-pool"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.nmnode-0.spec.nodeLabel=bdc-shared"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.sparkhead.spec.nodeLabel=bdc-shared"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.zookeeper.spec.nodeLabel=bdc-shared"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.gateway.spec.nodeLabel=bdc-shared"
         azdata bdc config add -c ca-bdc-dev-test/bdc.json -j "$.spec.resources.appproxy.spec.nodeLabel=bdc-shared"
      5. Deploy your Big Data Cluster:
         azdata bdc create --accept-eula yes --config-profile ca-bdc-dev-test
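The nine bdc.json commands in step 4 differ only in resource name and label, so they can be generated in a loop. The sketch below prints the commands instead of running them; the helper name is hypothetical, while the resource names, labels and config directory are those from the slide.

```shell
#!/bin/sh
# Print one "azdata bdc config add" command per Big Data Cluster resource.
CONFIG_DIR="ca-bdc-dev-test"

# $1 = resource name in bdc.json, $2 = node label value
bdc_label_cmd() {
    printf '%s\n' "azdata bdc config add -c $CONFIG_DIR/bdc.json -j \"\$.spec.resources.$1.spec.nodeLabel=$2\""
}

# Everything except the storage pool shares the bdc-shared nodes.
for res in master compute-0 data-0 nmnode-0 sparkhead zookeeper gateway appproxy; do
    bdc_label_cmd "$res" bdc-shared
done

# The storage pool gets its own dedicated worker nodes.
bdc_label_cmd storage-0 bdc-storage-pool
```

Piping the output to sh would apply the labels; reviewing it first makes a mistyped resource name easy to spot.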
  • 30. DEMO
  • 31. A Word On . . . Ingress Control: the default on bare metal for external access is NodePort; this can be changed to LoadBalancer, but that requires something that provides load balancer services to the cluster.
  • 32. #3 Resilience: pod mobility and node maintenance; rolling back a version upgrade; backup and recovery.
  • 33. Pod Mobility / Node Maintenance. Draining a node of its current pod workload: kubectl drain <node-name>. Re-enabling a node to accept pods: kubectl uncordon <node-name>. Pro tip: test draining worker nodes; this is important both for worker node maintenance and for testing the resilience of your Kubernetes cluster.
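A drain test can be wrapped in two small functions, sketched below. The function names are hypothetical, and the KUBECTL override is an assumption added here so the commands can be previewed with KUBECTL=echo before touching a real cluster.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Cordon the node and evict its pods.
# --ignore-daemonsets: DaemonSet-managed pods cannot be evicted, so skip them.
# Pods using emptyDir volumes also need --delete-emptydir-data
# (called --delete-local-data in older kubectl releases).
drain_node() {
    $KUBECTL drain "$1" --ignore-daemonsets
}

# Allow the scheduler to place pods on the node again.
uncordon_node() {
    $KUBECTL uncordon "$1"
}

# Usage: drain_node <node-name>, carry out maintenance, then uncordon_node <node-name>
```

Pods that were evicted are not moved back after the uncordon; the node simply becomes schedulable again.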
  • 34. Rolling Back Upgrades: Big Data Cluster images are cached on all worker nodes; images will build up on each node until manual intervention (docker rmi) is taken. Pro tip: keep the images for the previous CU, so that you have somewhere to roll back to without having to download images. [Diagram: master node (API server, scheduler, controller), etcd, and worker nodes each running a kubelet and pods of containers.]
  • 35. Backup And Recovery, what we need to worry about: etcd; data pool; master pool; controller, compute and app pools; storage pool HDFS - file; storage pool HDFS - parquet; certificates and images; object, public cloud: S3 or Azure Data Lake Gen 2; object, on-premises: S3.
  • 36. Backup And Recovery, the tools: SQL Server BACKUP and RESTORE; etcdctl; azdata bdc hdfs cp; tools native to platform; velero. State to protect: etcd; data pool; master pool; controller, compute and app pools; storage pool HDFS - file; storage pool HDFS - parquet; certificates and images; object, public cloud: S3 or Azure Data Lake Gen 2; object, on-premises: S3.
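For the command-line tools named above, invocations might look like the sketch below, which only prints the commands. The paths, backup name and namespace are hypothetical; etcdctl normally also needs endpoint and certificate flags, azdata needs a prior login, and the exact azdata path syntax should be checked against its reference. Pairing each tool with a type of state is an assumption, since the slide's layout did not survive extraction.

```shell
#!/bin/sh
# Hypothetical values, for illustration only.
BACKUP_DIR="/backup"
BDC_NAMESPACE="mssql-cluster"

# Print (do not run) one backup command per tool.
backup_commands() {
    # etcd: snapshot of the Kubernetes cluster state.
    printf '%s\n' "ETCDCTL_API=3 etcdctl snapshot save $BACKUP_DIR/etcd-snapshot.db"
    # Storage pool: copy an HDFS path out of the cluster.
    printf '%s\n' "azdata bdc hdfs cp --from-path hdfs:/data --to-path $BACKUP_DIR/hdfs"
    # Kubernetes objects and persistent volumes in the cluster's namespace.
    printf '%s\n' "velero backup create bdc-backup --include-namespaces $BDC_NAMESPACE"
}

backup_commands
```

SQL Server databases in the master and data pools are covered by T-SQL BACKUP and RESTORE rather than by a shell tool.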
  • 37. Two Parts To Every Storage Platform. Storage element, how well the platform handles: metadata; RAID / erasure coding; snapshots and replication; consistency of performance. API: how well it integrates with Kubernetes. Pro tip: Velero and 3rd party backup tools usually tap a storage platform's snapshot capabilities, so the quality of these will influence the quality of your backup / restore experience.