Azure Data Services
Afternoons with Azure Part 3
AGENDA
Modern BI Architecture
What is PaaS
Storage Options
Azure Active Directory
SQL DB
SQL DW
-- Demonstration --
Data Lake
Data Factory
-- Demonstration --
Modern BI Architecture
Common BI Solution Diagram
Data sources
Non-relational data
New data sources & types
Increasing data volumes
Real-time data
Cloud-born data
Modern Business Intelligence Landscape
Azure PaaS
Key Benefits
Less administration - frees up administrative resources
Scalability - with SQL Database you pay as you go, with options to scale up or out for greater power with no interruption
Faster deployments
No operating system cost or update overhead
Backups are done by default
Development continues without any major changes
Azure Storage
The Foundation of All Things Azure
Azure Storage
Azure Storage is…
Durable
Scalable
Highly Available
Azure Storage provides the foundation upon which most of the remaining Azure cloud offerings depend. Whether it's IaaS virtual servers or PaaS SQL and warehouse data, it all needs to be stored somewhere.
3 Main Types of Azure Storage
Azure Blob
Specialized storage account for storing LARGE amounts of unstructured data as Blob objects
Could include images, video, text documents, etc.
Stored in containers
Use when you want your application to support streaming and random access scenarios
Azure File
Offers fully managed file shares in the cloud
Replaces or supplements typical on-premises file systems
Easily mount an Azure File share and access it through Windows (Map network drive)
Use when you want to lift and shift an application to the cloud that already uses an on-premises file system
Azure Disk
Managed disk space on a Virtual Machine
Solid State Drive (SSD)
Use when you want to store data that is not required to be accessed from outside the virtual machine to which the disk is attached
Azure Active Directory
The Foundation of All Things Azure
Azure Active Directory (AAD)
Identity and access management cloud solution that provides a robust set of capabilities to manage users and groups. Helps secure access to on-premises and cloud resources.
• AAD can be synced with on-premises Windows AD
• Extends a user’s scope of access into cloud offerings
• Free offering with any Azure subscription
• Premium SKU allows for write-back from AAD to on-premises Active Directory
Azure SQL DB
Azure Databases
Topics
What is Azure SQL DB
Fears of the Cloud
Security
Benefits
Things to keep in mind
What is Azure SQL Database?
Azure SQL Database is the intelligent, fully managed relational cloud
database service that provides the broadest SQL Server engine
compatibility so you can migrate your SQL Server databases without
changing your apps.
• Accelerates app development
• Makes maintenance easy and productive
Take advantage of built-in intelligence that learns app patterns and
adapts to maximize performance, reliability, and data protection.
~MSFT
Fears of the Cloud
Change
Job Security
Accessibility
Compatibility
Infrastructure
https://www.petri.com/azure-sql-versus-sql-server-azure-vm
Security
Benefits
Save on costs
Focused & faster development
• Great for Agile!
Scalability
Automatic backups
Security
Things to Keep in Mind
No SQL Server Agent
Can’t run cross-database queries
Slight differences in SQL syntax
Heavy Loads
Deleting Database
Azure SQL Data Warehouse
Azure Databases
Topics
What is Azure SQL DW
Benefits
Costs breakdown
Azure SQL DB vs. Azure SQL DW
What is Azure SQL Data Warehouse (ADW)
• Microsoft managed service platform (PaaS offering) built on SQL Server technology.
• A cloud-based Enterprise Data Warehouse (EDW) that leverages Massively Parallel Processing (MPP) to quickly run complex queries across petabytes of data.
• Uses PolyBase to query big data stores. PolyBase uses standard T-SQL queries to bring the data into SQL Data Warehouse.
Benefits of Azure SQL DW
Cost-efficiencies, elasticity and hyper-scale of cloud for LARGE data warehouses
Allows for separation of compute and storage as needed
Ability to combine relational and non-relational data hosted in Hadoop using PolyBase
Allows for pausing an instance to save costs
Azure SQL DW Tiers and Costs
*** Pricing changes frequently ***
Azure SQL DB versus Azure SQL DW
Azure Databases
Both are PaaS offerings.
Azure SQL Data Warehouse was built for OLAP systems. A Massively Parallel Processing (MPP) system is made up of multiple nodes, each with its own resources, that work together to provide increased performance.
Azure SQL DB is optimized for quick reads and writes (OLTP).
Azure SQL Server versus Azure SQL Data Warehouse
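The MPP idea above (multiple nodes, each with its own resources, working together) can be illustrated with a minimal Python sketch. This is not how SQL Data Warehouse is implemented; the function names `distribute`, `node_partial_sum`, and `mpp_sum`, the hash-based distribution, and the sample data are all assumptions made for illustration. Rows are spread across nodes, each node aggregates only its own share, and a final step combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, node_count):
    """Spread rows across nodes by hashing a key (hypothetical scheme)."""
    nodes = [[] for _ in range(node_count)]
    for key, amount in rows:
        nodes[hash(key) % node_count].append((key, amount))
    return nodes


def node_partial_sum(node_rows):
    """Each node aggregates only the rows it holds."""
    return sum(amount for _, amount in node_rows)


def mpp_sum(rows, node_count=4):
    """Run the per-node work in parallel, then combine the partial results."""
    nodes = distribute(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(node_partial_sum, nodes))
    return sum(partials)


# Illustrative data: 100 orders, each worth 10 times its id.
sales = [(order_id, order_id * 10) for order_id in range(1, 101)]
total = mpp_sum(sales)
```

The result is the same as a single-node sum; the point is only that the work is divided, which is why adding nodes can increase throughput on large scans and aggregations.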
Which to Choose When?
Azure SQL Server
Azure SQL Data Warehouse
Which to choose when you are building an application database?
Answer:
Which to choose when you expect to work with large amounts of data (> 4TB) with a small number of users?
Answer:
Which to choose when you want to build a data warehouse?
Answer:
Guidelines
When to choose Azure DW over Azure DB
When you expect to have a need to scale significantly. There are limitations on scaling up with Azure SQL DB. With MPP, you have nearly unlimited scale-out.
When you expect to work with large amounts of data (> 4TB)
– ADW is optimized for performing data analytics tasks
When you need to join relational data with Hadoop or Azure Blob Storage data (PolyBase)
– Allows the data to remain stored in Hadoop
When you expect low to medium concurrency
Guidelines
When to choose Azure DB over Azure DW
When you have a small dataset (< 4TB)
When you anticipate high concurrency of queries
– Large number of users or systems connecting and firing off multiple queries
When there are frequent reads and writes (OLTP)
– Building an application database
When costs are a consideration
Guidelines
Other things to consider
ADW does not support geo-replication
ADB has auto-tuning features
ADW can be paused; ADB cannot
ADW only partially supports common table expressions
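The guidelines above can be restated as a small decision helper. This is a rough sketch only: the `Workload` fields, the vote-counting approach, and the tie-break in favour of Azure SQL DB are assumptions for illustration, and the 4TB figure is the rule of thumb from the slides, not a hard service limit.

```python
from dataclasses import dataclass

SIZE_THRESHOLD_TB = 4  # the 4TB rule of thumb from the guidelines


@dataclass
class Workload:
    data_size_tb: float
    high_concurrency: bool = False   # many users/systems firing off queries
    oltp: bool = False               # frequent reads and writes
    needs_polybase: bool = False     # must query Hadoop / Blob Storage data
    cost_sensitive: bool = False


def recommend(workload: Workload) -> str:
    """Tally the guidelines; ties go to Azure SQL DB (an assumption)."""
    dw_votes = 0
    db_votes = 0
    if workload.data_size_tb > SIZE_THRESHOLD_TB:
        dw_votes += 1
    else:
        db_votes += 1
    if workload.needs_polybase:
        dw_votes += 1
    if workload.high_concurrency:
        db_votes += 1
    if workload.oltp:
        db_votes += 1
    if workload.cost_sensitive:
        db_votes += 1
    return "Azure SQL DW" if dw_votes > db_votes else "Azure SQL DB"


analytics = recommend(Workload(data_size_tb=10, needs_polybase=True))
app_db = recommend(Workload(data_size_tb=0.5, oltp=True, high_concurrency=True))
```

In practice these factors are not equally weighted, so treat the output as a starting point for discussion, not a verdict.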
Azure Data Factory
What is Azure Data Factory?
Hybrid data integration service
Fully managed in the cloud
Over 60 available data connectors
Process and transform all types of data using compute services
ADF in the Modern Data Warehouse
Basic Components of ADFv2
Pipeline: a logical grouping of activities that performs a unit of work
Activity: a processing step in a pipeline
Linked services: connections to external data resources
Datasets: structures within the data stores
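To show how the four ADFv2 components relate to one another, here is a minimal Python model. It is not the Data Factory SDK or its JSON schema; every class, field, and sample name (`LandingBlob`, `WarehouseDW`, `RawLeads`, `StagedLeads`, `CopyLeads`) is hypothetical and chosen only to mirror the definitions above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LinkedService:
    name: str   # a connection to an external data resource
    kind: str   # free-text label for the store type (hypothetical)


@dataclass
class Dataset:
    name: str
    linked_service: LinkedService   # the store this structure lives in


@dataclass
class Activity:
    name: str          # one processing step
    source: Dataset
    sink: Dataset


@dataclass
class Pipeline:
    name: str
    activities: List[Activity] = field(default_factory=list)

    def describe(self) -> List[str]:
        """Summarise each activity as 'name: source (store) -> sink (store)'."""
        return [
            f"{a.name}: {a.source.name} ({a.source.linked_service.name}) -> "
            f"{a.sink.name} ({a.sink.linked_service.name})"
            for a in self.activities
        ]


blob = LinkedService("LandingBlob", "blob storage")
warehouse = LinkedService("WarehouseDW", "sql data warehouse")
raw = Dataset("RawLeads", blob)
staged = Dataset("StagedLeads", warehouse)
pipeline = Pipeline("LoadLeads", [Activity("CopyLeads", raw, staged)])
summary = pipeline.describe()
```

The nesting is the point: a pipeline groups activities, an activity reads and writes datasets, and each dataset points at the linked service that holds it.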
SSIS and Azure Data Factory
SSIS/SSDT Connection Managers for Azure
SSIS/SSDT Control Flow components for Azure
SSIS/SSDT Data Flow components for Azure
Azure Data Lake
Azure Data Lake Explored
What is it?
– Blob storage with a prettier GUI that is intended for enterprise-wide, big data analytics workloads
Data Lake Store
– Unlimited storage with individual files up to a petabyte
– Accepts all data in native format without requiring any transformations
– Performance tuned for big data analytics by enabling files to be broken into parts and spread across multiple individual servers for enhanced throughput
Data Lake Analytics
– Analytics service that enables transformation of data via queries written in U-SQL
– U-SQL is a language that combines SQL with C#
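The Data Lake Store point about files being broken into parts and spread across multiple servers can be sketched in a few lines of Python. This is an illustration of the general idea only; the part size, the round-robin placement, and the function names are assumptions, and the actual service's layout is not described in these slides.

```python
def split_into_parts(data: bytes, part_size: int):
    """Break a file's bytes into fixed-size parts (the last may be shorter)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def place_parts(parts, server_count):
    """Assign parts to servers round-robin (hypothetical placement policy)."""
    servers = {server: [] for server in range(server_count)}
    for index, part in enumerate(parts):
        servers[index % server_count].append((index, part))
    return servers


def reassemble(servers):
    """Collect parts from every server and restore the original order."""
    indexed = [item for held in servers.values() for item in held]
    return b"".join(part for _, part in sorted(indexed))


data = bytes(range(256)) * 4              # 1024 bytes of sample content
parts = split_into_parts(data, part_size=100)
servers = place_parts(parts, server_count=3)
restored = reassemble(servers)
```

Because each server holds only a share of the parts, reads and writes can proceed against several servers at once, which is where the enhanced throughput comes from.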
Data Lake vs. Blob Storage
Blob Storage
General purpose store for a variety of storage needs
Any type of text, backup, or general purpose data
Limitations on number of storage accounts per region, max storage account capacity, and a few other criteria
Data encrypted at rest
Supports REST API access
Data Lake
Optimized for big data
Best utilized by implementations with streaming analytics, machine learning, IoT and large datasets
No limits on size of data
Data encrypted at rest
Supports REST API access
But really, what are we using it for?
None of that cool, hip, “big data” stuff (yet)
Primary use case at the moment is using it as both source and target for various ETL tasks
– Source of homesite and financial data for all of LEN and CAA
– Target for validation files and “null catch” output files
Extraction of Salesforce data for “future data science exploration”
– Somewhere around 50 tables of things such as Leads, Opportunities, etc.
Questions?
APPENDIX
On-Premise versus Azure SQL DB
Speed: If you need a SQL database, you deploy it in the Azure Portal and it’s ready. You do not need to wait for Azure infrastructure to be deployed.
Focus: You are no longer distracted by non-database activities.
Portal: Make complicated technical changes to your database with the click of a button.
Familiarity: You know how to work with SQL Server. You know the backup tools, how to explain it, what works and what doesn’t work, and some of the problems that you will encounter.
Compatibility: When some application requires SQL Server, you know that SQL Server on Windows Server will work.
https://www.petri.com/azure-sql-versus-sql-server-azure-vm
