SQL Server
Big Data Clusters
Rock Pereira
SQL Saturday, Redmond
April 27, 2019
Contents
1. Kubernetes for Data Science
2. SQL Server Big Data Clusters
3. Understand the problem
4. Data exploration and analysis
5. Data-driven application development with Kubernetes
1 Kubernetes for Data Science
1.1 What is Kubernetes?
Docker Containers
MCR: Microsoft Container Registry
1.2 Benefits
Build & Configure
Insight
Observation
Estimate Compute Needs
Parameterized Deployment
Autoscaling
1.3 Team Data Science Lifecycle
1.4 Demo: SS 2019 in Minikube
2 SQL Server Big Data Clusters
2.1 What is a Big Data Cluster?
Unified data platform for analytics
Data-driven solutions using Kubernetes
Components of a BDC:
● Spark - Distributed, in-memory compute
● HDFS - Elastic storage
● SQL Server - Data hub for structured & unstructured data
● Kubernetes - Scale-out, fault-tolerant
2.2 Features
● Deploy anywhere there is managed Kubernetes
● Management services for logging, monitoring, backup and high availability
● Consistent portal for managing all your clusters
2.3 PolyBase
Query HDFS (Azure Blob Storage, Hortonworks, Cloudera) using External Tables in SQL Server
● Manage permissions with Active Directory
● No data duplication: the data is queried at its source and is not persisted in SQL Server
New in SQL Server 2019:
● Connectors to Azure SQL DB, Azure SQL DW, Oracle, Teradata, MongoDB, Azure Cosmos DB, plus any ODBC-compliant source with an ODBC driver (IBM DB2, SAP HANA, Microsoft Excel)
● Read CSV & Parquet
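As a rough illustration of the External Table idea above, the sketch below renders a T-SQL CREATE EXTERNAL TABLE statement from a column mapping. It is not taken from the talk: the table, data source and location names are hypothetical, the helper `external_table_ddl` is made up for this example, and the statement assumes an external data source was already defined with CREATE EXTERNAL DATA SOURCE. File-based sources such as CSV would normally also need a FILE_FORMAT option, which is omitted here.

```python
from typing import Mapping


def external_table_ddl(
    table: str,
    columns: Mapping[str, str],
    data_source: str,
    location: str,
) -> str:
    """Render a CREATE EXTERNAL TABLE statement as a string.

    Illustrative only: no identifier escaping is done, so do not feed
    untrusted input to this helper.
    """
    # One "[name] TYPE" line per column, in mapping order.
    column_lines = ",\n".join(
        f"    [{name}] {sql_type}" for name, sql_type in columns.items()
    )
    return (
        f"CREATE EXTERNAL TABLE {table} (\n"
        f"{column_lines}\n"
        f") WITH (\n"
        f"    DATA_SOURCE = {data_source},\n"
        f"    LOCATION = '{location}'\n"
        f");"
    )


# Hypothetical names: an Oracle table exposed as dbo.customers_ext.
ddl = external_table_ddl(
    table="dbo.customers_ext",
    columns={"customer_id": "INT", "name": "NVARCHAR(100)"},
    data_source="OracleSales",
    location="XE.SALES.CUSTOMERS",
)
print(ddl)
```

Once such a table exists, it can be queried like a local table while the rows stay in the remote system, which is the "no data duplication" point on the slide.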
2.4 Architecture
Compute Pool:
Parallel Ingest
Storage Pool:
Scalable Storage
Data Processing
SQL Data Pool:
Caching External Data
Distributed across SS Instances
SS Master Pool:
Read-Write OLTP
Store dimensional
2.5 Azure Data Studio
● Work with relational (big) data in SQL Server
● HDFS browser, like Azure Storage Explorer
● External Table wizard, including column mapping
● Jupyter-based notebooks
● Collaboration
● Code with IntelliSense
● Submit Spark jobs
2.6 Deploying a Big Data Cluster
Minikube: Single node. Requirements: Memory 32 GB, CPU 8, Disk Space 100 GB
On-Prem: Use kubeadm
Cloud (AKS): Use Python script
Set environment variables before deploying
Tools:
mssqlctl (app_commands, ref), kubectl, Azure CLI
Azure Data Studio + SQL Server 2019 extension
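The single-node requirements and the "set environment variables before deploying" step could be checked with a small pre-flight script along these lines. This is a minimal sketch, not part of the talk or of any Microsoft tooling: the `HostSpec` class and `preflight` function are invented for illustration, and the variable names in `REQUIRED_VARS` are assumptions, since the exact set depends on the release being deployed.

```python
from dataclasses import dataclass
from typing import List, Mapping, Sequence

# Single-node minimums quoted on the slide.
MIN_MEMORY_GB = 32
MIN_CPUS = 8
MIN_DISK_GB = 100

# Assumed variable names, for illustration only; check the deployment
# documentation of your release for the real list.
REQUIRED_VARS = ("CONTROLLER_USERNAME", "CONTROLLER_PASSWORD")


@dataclass(frozen=True)
class HostSpec:
    """Resources available on the target node."""
    memory_gb: float
    cpus: int
    disk_gb: float


def preflight(
    host: HostSpec,
    env: Mapping[str, str],
    required_vars: Sequence[str] = REQUIRED_VARS,
) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems: List[str] = []
    if host.memory_gb < MIN_MEMORY_GB:
        problems.append(f"memory: {host.memory_gb} GB < {MIN_MEMORY_GB} GB")
    if host.cpus < MIN_CPUS:
        problems.append(f"cpus: {host.cpus} < {MIN_CPUS}")
    if host.disk_gb < MIN_DISK_GB:
        problems.append(f"disk: {host.disk_gb} GB < {MIN_DISK_GB} GB")
    # Unset and empty variables are both treated as missing.
    for name in required_vars:
        if not env.get(name):
            problems.append(f"environment variable not set: {name}")
    return problems


# Example: a node with too little memory and one missing variable.
issues = preflight(
    HostSpec(memory_gb=16, cpus=8, disk_gb=200),
    env={"CONTROLLER_USERNAME": "admin"},
)
for issue in issues:
    print(issue)
```

In practice `env` would be `os.environ`, and the host figures could come from `os.cpu_count()` and `shutil.disk_usage()`; they are passed in explicitly here so the check stays a pure function that is easy to test.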
