SlideShare a Scribd company logo
1 of 19
Modern Analytics Academy
Modeling
https://aka.ms/maa
Agenda • What is Modern Analytics Academy?
• The Team
• Modern Analytics in Azure
 Model & Serve
 Data Lake Structure
 Synapse as solution
 Demo
 Review
Senior Cloud Solution
Architect
Alex Karasek
Modern Analytics Academy Team
Principal Cloud Solution
Architect
Chris Mitchell
Principal Cloud Solution
Architect
Jason Virtue
Senior Cloud Solution
Architect
Annie Xu
Senior Cloud Solution
Architect
Brian Hitney
https://aka.ms/maa
Academy Sessions
Acquisition &
Storage
Modeling Pipelines Security &
Governance
Visualization
 Azure Data Factory
 Synapse Pipelines
 Power BI Dataflows
 Azure Stream
Analytics
 Data Lake Structure
 Synapse Spark Pools
 Synapse SQL Pools
 Synapse Serverless
SQL
 Azure Data Lake
 Azure Cosmos DB
 Azure Event Hubs
 Synapse Link
 Auditing
 Security
 Azure purview
 Power BI
 Paginated Reports
 Power BI Embedded
Modern Analytics in Azure
Advanced Analytics
INGEST PREP & TRAIN MODEL & SERVE BI + Reporting
Real Time Analytics
STORE
Big data store
EXPLORE
Query All Data Analytics Engines Data Warehouse
Data Orchestration
and Monitoring
METADATA MANAGEMENT & GOVERNANCE
Social
LOB
Graph
IoT
Image
CRM
Role of Data in Modern Analytics
Experimentation
Fast exploration
Semi-structured data
Big Data
OR
Proven security & privacy
Dependable performance
Operational data
Relational Data
Data Lake Data Warehouse
Role of Data in Modern Analytics
Data warehousing & big data analytics—all in one service
Azure Synapse Analytics
Data warehousing & big data analytics—all in one service
Azure Synapse Analytics
Introducing Azure
Synapse Analytics
A limitless analytics service with unmatched
time to insight, that delivers insights from all
your data, across data warehouses and big
data analytics systems, with blazing speed
Simply put, Azure Synapse is Azure SQL Data
Warehouse evolved
We have taken the same industry leading data
warehouse and elevated it to a whole new level of
performance and capabilities
Azure Synapse Analytics
Key Terms
Data Lake
A data lake is a storage repository that holds a large amount of data in its native, raw format. Data lake
stores are optimized for scaling to terabytes and petabytes of data. The data typically comes from
multiple heterogeneous sources, and may be structured, semi-structured, or unstructured. The idea with
a data lake is to store everything in its original, untransformed state. This approach differs from a
traditional data warehouse, which transforms and processes the data at the time of ingestion.
Azure Data Lake (ADLS Gen 2)
Data Lake Storage Gen2 makes Azure Storage the foundation for building enterprise data lakes on
Azure
Logical Data Warehouse (LDW)
A relational layer built on top of Azure data sources such as Azure Data Lake storage (ADLS Gen 2),
Azure Cosmos DB analytical storage, or Azure Blob storage
Data Warehouse (DW)
A data warehouse is a centralized repository of integrated data from one or more disparate sources.
Data warehouses store current and historical data and are used for reporting and analysis of the data.
ADLS Gen 2
Azure Synapse Analytics
Rich surface area
T-SQL language for data analytics
Supporting large number of
languages and tools
Enterprise-grade security
dedicated SQL pool
Modern Data Warehouse
Indexing and caching
Import and query external data
Workload management
serverless SQL pool
Querying external data
Model raw files as virtual tables
and views
Easy data transformation
Azure Synapse Analytics
Dedicated SQL Pools (Formerly SQL DW)
Best in class price
per performance
Developer
productivity
Intelligent workload
management
Data flexibility
Up to 94% less expensive
than competitors
Prioritize resources for
the most valuable
workloads
Ingest variety of data
sources to derive the
maximum benefit
Use preferred tooling for
SQL data warehouse
development
Industry-leading
security
Defense-in-depth
security and 99.9%
financially backed
availability SLA
Azure Synapse Analytics
Dedicated SQL Pools
T-SQL Querying
• Windowing aggregates
• Approximate execution (Hyperloglog)
• JSON data support
• Score machine learning models in ONNX format
Advanced storage system
• Columnstore Indexes
• Table partitions
• Distributed tables
• Isolation
• Materialized Views
• Nonclustered Indexes
• Result-set caching
Complete SQL object model
• Tables
• Views
• Stored procedures
• Functions
Azure Synapse Analytics
Serverless SQL Pools
Quick data exploration
Easily explore schema and data in
files on Azure storage
Supports various file formats
(Parquet, CSV, JSON)
Direct connector to Azure storage
for large BI ecosystem
Logical Data Warehouse
Model raw files as virtual tables and
views
Use any tool that works with SQL to
analyze files
Use enterprise-grade security model
Easy data transformation
Transform CSV to parquet format
Move data between containers and
accounts
Save the results of queries on
external storage
Azure Synapse Analytics
Apache Spark Pools
Spark Unifies:
 Batch Processing





An unified, open source, parallel, data processing framework for Big Data Analytics
Spark Core Engine
Spark SQL
Batch processing
Spark MLlib
Machine
Learning
Yarn
Spark MLlib
Machine
Learning
Spark
Streaming
Stream processing
GraphX
Graph
Computation
Demo:
A tour of Azure Synapse
Demo Architecture
Thank you

More Related Content

Similar to Modern Analytics Academy - Data Modeling (1).pptx

Azure Data.pptx
Azure Data.pptxAzure Data.pptx
Azure Data.pptxFedoRam1
 
Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Michael Rys
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWSAmazon Web Services
 
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...Rukmani Gopalan
 
AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS Amazon Web Services
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseJames Serra
 
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...Michael Rys
 
Azure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptxAzure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptxshaikmadarbi3zen
 
Azure Data Engineering Course in Hyderabad
Azure Data Engineering  Course in HyderabadAzure Data Engineering  Course in Hyderabad
Azure Data Engineering Course in Hyderabadsowmyavibhin
 
"Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad ""Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad "madhupriya3zen
 
SQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George GrammatikosSQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George GrammatikosGeorge Grammatikos
 
Azure Data Platform Overview.pdf
Azure Data Platform Overview.pdfAzure Data Platform Overview.pdf
Azure Data Platform Overview.pdfDustin Vannoy
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)James Serra
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage CCG
 

Similar to Modern Analytics Academy - Data Modeling (1).pptx (20)

Azure Data.pptx
Azure Data.pptxAzure Data.pptx
Azure Data.pptx
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
Sql Bits 2020 - Designing Performant and Scalable Data Lakes using Azure Data...
 
AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS
 
Introduction to Azure Data Lake
Introduction to Azure Data LakeIntroduction to Azure Data Lake
Introduction to Azure Data Lake
 
Azure synapse by usama whaba khan
Azure synapse by usama whaba khanAzure synapse by usama whaba khan
Azure synapse by usama whaba khan
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
 
Solucion de BI en Azure
Solucion de BI en AzureSolucion de BI en Azure
Solucion de BI en Azure
 
Azure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptxAzure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptx
 
Azure Data Engineering Course in Hyderabad
Azure Data Engineering  Course in HyderabadAzure Data Engineering  Course in Hyderabad
Azure Data Engineering Course in Hyderabad
 
"Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad ""Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad "
 
SQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George GrammatikosSQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George Grammatikos
 
Azure Data Platform Overview.pdf
Azure Data Platform Overview.pdfAzure Data Platform Overview.pdf
Azure Data Platform Overview.pdf
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage
 
Azure Data Engineering.pptx
Azure Data Engineering.pptxAzure Data Engineering.pptx
Azure Data Engineering.pptx
 

Recently uploaded

Tech Startup Growth Hacking 101 - Basics on Growth Marketing
Tech Startup Growth Hacking 101  - Basics on Growth MarketingTech Startup Growth Hacking 101  - Basics on Growth Marketing
Tech Startup Growth Hacking 101 - Basics on Growth MarketingShawn Pang
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCRashishs7044
 
Future Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionFuture Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionMintel Group
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...lizamodels9
 
Marketing Management Business Plan_My Sweet Creations
Marketing Management Business Plan_My Sweet CreationsMarketing Management Business Plan_My Sweet Creations
Marketing Management Business Plan_My Sweet Creationsnakalysalcedo61
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCRashishs7044
 
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedLean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedKaiNexus
 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Kirill Klimov
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menzaictsugar
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsApsara Of India
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024christinemoorman
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCRashishs7044
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesKeppelCorporation
 
India Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample ReportIndia Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample ReportMintel Group
 
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCRashishs7044
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis UsageNeil Kimberley
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...lizamodels9
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Roomdivyansh0kumar0
 

Recently uploaded (20)

Tech Startup Growth Hacking 101 - Basics on Growth Marketing
Tech Startup Growth Hacking 101  - Basics on Growth MarketingTech Startup Growth Hacking 101  - Basics on Growth Marketing
Tech Startup Growth Hacking 101 - Basics on Growth Marketing
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
 
Future Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionFuture Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted Version
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
 
Marketing Management Business Plan_My Sweet Creations
Marketing Management Business Plan_My Sweet CreationsMarketing Management Business Plan_My Sweet Creations
Marketing Management Business Plan_My Sweet Creations
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
 
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedLean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
 
India Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample ReportIndia Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample Report
 
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
 

Modern Analytics Academy - Data Modeling (1).pptx

  • 2. Agenda • What is Modern Analytics Academy? • The Team • Modern Analytics in Azure  Model & Serve  Data Lake Structure  Synapse as solution  Demo  Review
  • 3. Senior Cloud Solution Architect Alex Karasek Modern Analytics Academy Team Principal Cloud Solution Architect Chris Mitchell Principal Cloud Solution Architect Jason Virtue Senior Cloud Solution Architect Annie Xu Senior Cloud Solution Architect Brian Hitney https://aka.ms/maa
  • 4. Academy Sessions Acquisition & Storage Modeling Pipelines Security & Governance Visualization  Azure Data Factory  Synapse Pipelines  Power BI Dataflows  Azure Stream Analytics  Data Lake Structure  Synapse Spark Pools  Synapse SQL Pools  Synapse Serverless SQL  Azure Data Lake  Azure Cosmos DB  Azure Event Hubs  Synapse Link  Auditing  Security  Azure purview  Power BI  Paginated Reports  Power BI Embedded
  • 5. Modern Analytics in Azure Advanced Analytics INGEST PREP & TRAIN MODEL & SERVE BI + Reporting Real Time Analytics STORE Big data store EXPLORE Query All Data Analytics Engines Data Warehouse Data Orchestration and Monitoring METADATA MANAGEMENT & GOVERNANCE Social LOB Graph IoT Image CRM
  • 6. Role of Data in Modern Analytics Experimentation Fast exploration Semi-structured data Big Data OR Proven security & privacy Dependable performance Operational data Relational Data Data Lake Data Warehouse
  • 7. Role of Data in Modern Analytics Data warehousing & big data analytics—all in one service Azure Synapse Analytics Data warehousing & big data analytics—all in one service Azure Synapse Analytics
  • 8. Introducing Azure Synapse Analytics A limitless analytics service with unmatched time to insight, that delivers insights from all your data, across data warehouses and big data analytics systems, with blazing speed Simply put, Azure Synapse is Azure SQL Data Warehouse evolved We have taken the same industry leading data warehouse and elevated it to a whole new level of performance and capabilities
  • 10. Key Terms Data Lake A data lake is a storage repository that holds a large amount of data in its native, raw format. Data lake stores are optimized for scaling to terabytes and petabytes of data. The data typically comes from multiple heterogeneous sources, and may be structured, semi-structured, or unstructured. The idea with a data lake is to store everything in its original, untransformed state. This approach differs from a traditional data warehouse, which transforms and processes the data at the time of ingestion. Azure Data Lake (ADLS Gen 2) Data Lake Storage Gen2 makes Azure Storage the foundation for building enterprise data lakes on Azure Logical Data Warehouse (LDW) A relational layer built on top of Azure data sources such as Azure Data Lake storage (ADLS Gen 2), Azure Cosmos DB analytical storage, or Azure Blob storage Data Warehouse (DW) A data warehouse is a centralized repository of integrated data from one or more disparate sources. Data warehouses store current and historical data and are used for reporting and analysis of the data.
  • 12. Azure Synapse Analytics Rich surface area T-SQL language for data analytics Supporting large number of languages and tools Enterprise-grade security dedicated SQL pool Modern Data Warehouse Indexing and caching Import and query external data Workload management serverless SQL pool Querying external data Model raw files as virtual tables and views Easy data transformation
  • 13. Azure Synapse Analytics Dedicated SQL Pools (Formerly SQL DW) Best in class price per performance Developer productivity Intelligent workload management Data flexibility Up to 94% less expensive than competitors Prioritize resources for the most valuable workloads Ingest variety of data sources to derive the maximum benefit Use preferred tooling for SQL data warehouse development Industry-leading security Defense-in-depth security and 99.9% financially backed availability SLA
  • 14. Azure Synapse Analytics Dedicated SQL Pools T-SQL Querying • Windowing aggregates • Approximate execution (Hyperloglog) • JSON data support • Score machine learning models in ONNX format Advanced storage system • Columnstore Indexes • Table partitions • Distributed tables • Isolation • Materialized Views • Nonclustered Indexes • Result-set caching Complete SQL object model • Tables • Views • Stored procedures • Functions
  • 15. Azure Synapse Analytics Serverless SQL Pools Quick data exploration Easily explore schema and data in files on Azure storage Supports various file formats (Parquet, CSV, JSON) Direct connector to Azure storage for large BI ecosystem Logical Data Warehouse Model raw files as virtual tables and views Use any tool that works with SQL to analyze files Use enterprise-grade security model Easy data transformation Transform CSV to parquet format Move data between containers and accounts Save the results of queries on external storage
  • 16. Azure Synapse Analytics Apache Spark Pools Spark Unifies:  Batch Processing      An unified, open source, parallel, data processing framework for Big Data Analytics Spark Core Engine Spark SQL Batch processing Spark MLlib Machine Learning Yarn Spark MLlib Machine Learning Spark Streaming Stream processing GraphX Graph Computation
  • 17. Demo: A tour of Azure Synapse

Editor's Notes

  1. The agenda for today’s session will include: A review of the modern analytics academy framework The team behind it An overview of a modern analytics architecture in Azure Finally a deep dive into ways to model and serve data in Azure. Specific topics covered will include data lake structures for analytical use cases, how synapse fits as a solution and a demo of these concepts
  2. The MAA was built to show breadth and depth of capabilities in Azure that can be used to implement a Modern Analytics Solution in the cloud. This is the team behind this program, and we are all Cloud Solution Architectes on Microsoft’s Global Partner Solutions Team.
  3. Let’s take a quick look at the detailed sessions available. In Data Acquisition and Storage we looked at the nuances of how source systems provide data and the various mechanisms for getting that data into Azure for long term retention and analysis. In Data Modeling we’ll be looking at how to structure and store your data for the purposes of serving data to applications like reporting tools. In Data Pipelines we’ll be looking at how to move your data through a data engineering process and prepare the data for serving. In Data Security and Governance we’ll examine how to deal with topics like controlling access to your data, understanding the lineage of your data and addressing policy compliance on your data. In Data Visualization we’ll examine how visualization tools like Power BI can be used to enable users to get timely business value out of your data.
  4. Ingestion – we need to be able to connect to the data no matter where it comes from. Line of business applications, cloud services, social networks, sensor networks as well as being able to support the different cadences with which that data arrives. Storage - highly scalable and cost effect storage is critical to these solutions, with theoretically limitless amounts of data being required to support these types of solutions, choosing the right methods for storage is always critical Exploration – data processing engines and services are used more for ad hoc data exploration of data Prep & Train - New analytics and data processing engines such as Spark are being leveraged to prepare data and train models over large data volumes Model & Serve – Without giving users access to the data to do ad-hoc analysis an analytics solution is useless. We must surface the data in a useable way through modern tools. Metadata management & governance – with all this data security and regulatory concerns haven’t gone away, data must be discoverable, compliant, and trusted.
  5. Data is fundamental to the types of modern analytics solutions that we are discussing in this series. When it comes to data at this scale there are generally 2 ways to store, model, and serve to consumes. Data Lakes are generally ideal for experimentation, exploratory, or ad hoc queries over semi-structured data at any scale. For more traditional analytics and reporting scenarios, a DW is often more suitable for serving data via a relational data model with proven security and dependable performance over aggregated operational data sets.
  6. By being able to support both scenarios, Synapse Analytics becomes the natural hub for all of your modern analytical solutions
  7. Designed for analytics workloads at any scale Interfaces: Synapse Studio – web based portal for object exploration, data engineering, development, orchestration all through a single pane of glass SaaS developer experiences for code free and code first Multiple languages suited to different analytics workloads Integrated analytics runtimes available provisioned and serverless on-demand SQL Analytics offering T-SQL for batch, streaming and interactive processing Spark for big data processing with Python, Scala, R and .NET Integrated platform services for, management, security, monitoring, and metastore Data lake integrated and Common Data Model aware
  8. Raw data: This is data as it comes from the source systems. This data is typically ingested in raw format via automated streaming or batch oriented pipelines and is consumed by an analytics engine such as Spark or managed integration pipelines in ADF or Synapse to perform cleansing and enrichment operations to generate the enriched and curated data. Common formats might include csv or JSON which are optimized for efficient write operations Enriched data: This layer of data is the version where raw data (as is or aggregated) has a defined schema and also, the data is cleansed, enriched (with other sources) and is available to analytics engines to extract high value data. Data engineers generate these datasets and also proceed to extract high value/curated data from these datasets. Data in this zone might be stored in a format such as parquet which is compressed and includes schema which makes it much more suitable for read heavy and analytical workloads. Curated data: This layer of data contains the high value information that is served to the consumers of the data – the BI analysts and the data scientists. This data has structure and can be served to the consumers either as is in its native format or through a more traditional data warehouse. Data assets in this layer is usually highly governed and well documented. Data in this zone might be stored in a format like delta which is an enhanced version of parquet that leverages a transaction log to enable advanced capabilities like merge operations, time travel, and significantly better performance when using for analytical queries
  9. An interactive query service that enables you to use standard T-SQL queries over files in Azure storage.
  10. Having Spark integrated directly inside of Synapse allows users to pick the appropriate query engine for every scenario without ever needing to leave the Synapse workspace environment. Because the available Hive metastore is automatically synced with SQL Serverless engine, data can be directly integrated across platforms, and because it is embedded directly inside of Synapse, that means that all of the associated benefits such as security, administration, monitoring, etc… would apply. Some supported use cases that are made easier through the use of this embedded engine include: Batch Processing Interactive SQL Real-time processing Machine Learning Deep Learning Graph Processing