08-May-20 7:12 AM
1
Azure Synapse is the evolution of Azure SQL Data Warehouse,
combining big data, data warehousing, and data integration
in a single service for end-to-end analytics at cloud scale.
Azure Synapse Analytics
Limitless analytics service with unmatched time to insight
2
Modern data warehouse
Stages: INGEST, PREPARE, TRANSFORM & ENRICH, SERVE, STORE, VISUALIZE
Sources: on-premises data, cloud data, SaaS data
Integrated data platform for BI, AI and continuous intelligence
Platform: Azure Data Lake Storage, Common Data Model, Enterprise Security, Optimized for Analytics
METASTORE, SECURITY, MANAGEMENT, MONITORING, DATA INTEGRATION
Analytics Runtimes
Form Factors: PROVISIONED, ON-DEMAND
Languages: SQL, Python, .NET, Java, Scala, R
Experience: Synapse Analytics Studio
Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence
3
Integrated data platform for BI, AI and continuous intelligence
Platform: Azure Data Lake Storage, Common Data Model, Enterprise Security, Optimized for Analytics
METASTORE, SECURITY, MANAGEMENT, MONITORING, DATA INTEGRATION
Analytics Runtimes
Form Factors: PROVISIONED, ON-DEMAND
Languages: SQL, Python, .NET, Java, Scala, R
Experience: Synapse Analytics Studio
Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence
Connected services: Azure Data Catalog, Azure Data Lake Storage, Azure Data Share, Azure Databricks, Azure HDInsight, Azure Machine Learning, Power BI, 3rd Party Integration
Elastic architectures
Hybrid
Analyze all data
Workload-optimized compute
Governed self-service
No data silos
4
Time  Cost  Risk
Platform: Performance
• Azure Synapse builds on the Azure ecosystem and core improvements to the SQL Server engine to deliver large performance gains.
• These benefits require no customer configuration and are provided out of the box for every data warehouse.
• Gen2 adaptive caching: uses non-volatile memory (NVMe) solid-state drives to increase the I/O bandwidth available to queries.
• Azure FPGA-accelerated networking enhancements: move data at rates of up to 1 GB/s per node to improve queries.
• Instant data movement: uses multi-core parallelism in the underlying SQL Server instances to move data efficiently between compute nodes.
• Query optimization: distributed query optimization.
5
Synapse SQL MPP architectural components
Hash-distributed tables
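To make the idea of a hash-distributed table concrete, here is a minimal Python sketch. It assumes the 60 distributions that a dedicated SQL pool uses, and it substitutes an MD5-based hash for the engine's internal hash function, which is not documented here. The helper names and the sample rows are hypothetical.

```python
import hashlib
from collections import Counter

NUM_DISTRIBUTIONS = 60  # a dedicated SQL pool spreads each table over 60 distributions


def distribution_for(key, num_distributions=NUM_DISTRIBUTIONS):
    """Map a distribution-column value to a distribution index.

    MD5 is a stand-in (assumption): the real engine uses its own
    deterministic hash function.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_distributions


def distribute(rows, column):
    """Group rows (dicts) by the distribution their key hashes to."""
    placement = {}
    for row in rows:
        placement.setdefault(distribution_for(row[column]), []).append(row)
    return placement


# Hypothetical sample data: 10,000 rows over 1,000 distinct keys.
rows = [{"customer_sk": i % 1000, "amount": i} for i in range(10_000)]
placement = distribute(rows, "customer_sk")
rows_per_distribution = Counter({d: len(r) for d, r in placement.items()})
print(len(placement), "distributions used")
```

Because the hash is deterministic, every row with the same key lands in the same distribution, which is what later makes distribution-aligned joins possible.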
6
Replicated tables
7
Workload management
Scale-In Isolation
Predictable cost
Online elasticity
Efficient for unpredictable workloads
Intra Cluster Workload Isolation (Scale In)
CREATE WORKLOAD GROUP Sales
WITH
(
    MIN_PERCENTAGE_RESOURCE = 60,
    CAP_PERCENTAGE_RESOURCE = 100,
    MAX_CONCURRENCY = 6  -- as shown on the slide; the released syntax sets concurrency through REQUEST_MIN_RESOURCE_GRANT_PERCENT instead
);
Compute: 1000c DWU
Marketing: 40%
Sales: 60% minimum, 100% cap
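The percentages in the workload diagram can be checked with a little arithmetic. The Python sketch below treats each minimum percentage as a plain share of the 1000c DWU shown on the slide, which is a simplification of how the service actually allocates resources. The `WorkloadGroup` class, the `reserved_dwu` helper, and the 100% cap for Marketing are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkloadGroup:
    name: str
    min_percentage: int  # mirrors MIN_PERCENTAGE_RESOURCE
    cap_percentage: int  # mirrors CAP_PERCENTAGE_RESOURCE


def reserved_dwu(groups, total_dwu):
    """Return the DWU guaranteed to each group; reject over-commitment."""
    if sum(g.min_percentage for g in groups) > 100:
        raise ValueError("minimum percentages cannot exceed 100 in total")
    for g in groups:
        if g.cap_percentage < g.min_percentage:
            raise ValueError(f"{g.name}: cap is below the minimum")
    return {g.name: total_dwu * g.min_percentage / 100 for g in groups}


groups = [
    WorkloadGroup("Sales", 60, 100),
    WorkloadGroup("Marketing", 40, 100),  # cap for Marketing is assumed
]
shares = reserved_dwu(groups, total_dwu=1000)
print(shares)  # {'Sales': 600.0, 'Marketing': 400.0}
```

With both minimums reserved, the two groups are isolated from each other inside the same cluster, which is the "scale in" idea the slide names.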
Comprehensive security
Category            Feature
Data Protection     Data in Transit
                    Data Encryption at Rest
                    Data Discovery and Classification
Access Control      Object Level Security (Tables/Views)
                    Row Level Security
                    Column Level Security
                    Dynamic Data Masking
Authentication      SQL Login
                    Azure Active Directory
                    Multi-Factor Authentication
Network Security    Virtual Networks
                    Firewall
                    Azure ExpressRoute
Threat Protection   Threat Detection
                    Auditing
                    Vulnerability Assessment
8
Data Integration  Data Warehouse  Reporting
Synapse data integration
More than 90 ready-to-use connectors
Serverless, with no infrastructure to manage
Sustained ingestion of 4 GB/s
Support for CSV, Avro, ORC, Parquet, and JSON
9
Synapse data integration
Code First
Code Free
GUI based
+ many more
Power BI Azure Machine Learning
Azure Data Share Ecosystem
Azure Synapse Analytics
10
Data Integration Data Warehouse Reporting
Storage optimized for performance
Elastic Architecture, Columnar Storage, Columnar Ordering, Table Partitioning,
Nonclustered Indexes, Hash Distribution, Materialized Views, Resultset Cache
11
Database table migration
CREATE TABLE StoreSales (
    [sales_city]  varchar(60),
    [sales_year]  int,
    [sales_state] char(2),
    [item_sk]     int,
    [sales_zip]   char(10),
    [sales_date]  date,
    [customer_sk] int
)
WITH (
    CLUSTERED COLUMNSTORE INDEX ORDER ([customer_sk]),
    DISTRIBUTION = HASH([sales_zip], [item_sk]),
    PARTITION ([sales_year] RANGE RIGHT FOR VALUES (1998, 1999, 2000, 2001, 2002, 2003))
);
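The RANGE RIGHT clause in the table definition above decides which partition each `sales_year` falls into. A minimal Python sketch of that rule follows; the `partition_number` helper is hypothetical and only models the boundary arithmetic, not how the engine stores partitions.

```python
from bisect import bisect_right

# Boundary values copied from the PARTITION clause above.
BOUNDARIES = [1998, 1999, 2000, 2001, 2002, 2003]


def partition_number(sales_year, boundaries=BOUNDARIES):
    """Return the 1-based partition for a value under RANGE RIGHT.

    With RANGE RIGHT each boundary value belongs to the partition on its
    right, so values below the first boundary land in partition 1.
    """
    return bisect_right(boundaries, sales_year) + 1


for year in (1997, 1998, 2000, 2003, 2010):
    print(year, "->", partition_number(year))
```

Six boundary values therefore produce seven partitions: one for everything before 1998, one per year from 1998 through 2002, and a last one for 2003 onward.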
Database view migration: Views and Materialized Views
12
Database view migration
                                  View   Materialized view
Abstracts structure from users    Yes    Yes
Requires an explicit reference    Yes    No
Improves performance              No     Yes
Requires additional storage       No     Yes
Securable                         Yes    Yes
Full SQL support                  Yes    No
Database view migration
CREATE VIEW vw_TopSalesState
AS
SELECT
    SubQ.StateAbbrev,
    SubQ.FirstSoldDate,
    -- Alias added (name is illustrative): every column of a view needs a name
    (SubQ.SalesPrice / SUM(SubQ.SalesPrice) OVER (ORDER BY (SELECT NULL))) * 100 AS PctOfTotalSales,
    (1 - (SubQ.SalesPrice / SubQ.ListPrice)) * 100 AS Discount,
    RANK() OVER (ORDER BY (1 - (SubQ.SalesPrice / SubQ.ListPrice))) AS StateDiscRank
FROM (
    SELECT
        s_state AS StateAbbrev,
        MIN(d_date) AS FirstSoldDate,
        SUM([ss_list_price]) AS ListPrice,
        SUM([ss_sales_price]) AS SalesPrice
    FROM [tpcds10TB].[store_sales2] ss
    INNER JOIN [tpcds10TB].store s ON s.[s_store_sk] = ss.[ss_store_sk]
    INNER JOIN [tpcds10TB].[date_dim] d ON d.[d_date_sk] = ss.ss_sold_date_sk
    GROUP BY
        s_state
) AS SubQ;
13
Database materialized view migration
CREATE MATERIALIZED VIEW [dbo].[mvw_StoreSalesSummary]
WITH (DISTRIBUTION = HASH(ss_store_sk))
AS
SELECT
s_state,
c_birth_country,
ss_store_sk AS ss_store_sk,
ss_sold_date_sk AS ss_sold_date_sk,
SUM([ss_list_price]) AS [ss_list_price],
SUM([ss_sales_price]) AS [ss_sales_price],
count_big(*) AS cb
FROM [tpcds10TB].[store_sales2] ss
INNER JOIN [tpcds10TB].customer c ON c.[c_customer_sk] = ss.[ss_customer_sk]
INNER JOIN [tpcds10TB].store s on s.[s_store_sk] = ss.[ss_store_sk]
GROUP BY
s_state,c_birth_country,ss_store_sk, ss_sold_date_sk
Customer: 65 million rows
Store: 1,500 rows
Store Sales: 26 billion rows
Materialized View: 287 million rows
Data Integration  Data Warehouse  Reporting
14
Synapse Connected Service: Power BI
Integrated Power BI authoring experience
Publish to Power BI
Scaling to petabytes
Materialized Views
Transactionally consistent with data modification
Automatic query optimizer matching
CREATE MATERIALIZED VIEW vw_ProductSales
WITH (DISTRIBUTION = HASH(ProductKey))
AS
SELECT
    ProductName,
    ProductKey,
    SUM(Amount) AS TotalSales
FROM
    FactSales fs
    INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
GROUP BY
    ProductName,
    ProductKey
15
ProductName   ProductKey   TotalSales
Product A     5453         784,943.00
Product B     763          48,723.00
…             …            …
FactSales table: 10B records
DimProduct table: 1,000 records
Tables: FactSales, DimProduct, FactInventory
mvw_ProductSales: 1,000 records
SELECT
    ProductName,
    ProductKey,
    SUM(Amount) AS TotalSales
FROM
    FactSales fs
    INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
GROUP BY
    ProductName,
    ProductKey
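The reason a 1,000-row summary can answer a query over 10B fact rows is that the aggregation has already been done. The Python sketch below imitates that idea: it pre-aggregates sales per product and then folds a new fact row into the summary so the stored totals stay consistent with the base data. The function names and the sample amounts are hypothetical, and the real engine maintains materialized views with its own internal mechanism.

```python
from collections import defaultdict


def build_summary(fact_sales, product_names):
    """Pre-aggregate sales per product, like the SELECT ... GROUP BY above."""
    totals = defaultdict(float)
    for sale in fact_sales:
        if sale["prodkey"] in product_names:  # inner join: skip unmatched rows
            totals[sale["prodkey"]] += sale["Amount"]
    return dict(totals)


def apply_insert(summary, product_names, sale):
    """Fold one new fact row into the summary so it stays consistent."""
    if sale["prodkey"] in product_names:
        summary[sale["prodkey"]] = summary.get(sale["prodkey"], 0.0) + sale["Amount"]


# Product keys come from the slide; the amounts are made-up sample data.
product_names = {5453: "Product A", 763: "Product B"}
fact_sales = [
    {"prodkey": 5453, "Amount": 100.0},
    {"prodkey": 763, "Amount": 40.0},
    {"prodkey": 5453, "Amount": 25.5},
]
summary = build_summary(fact_sales, product_names)
apply_insert(summary, product_names, {"prodkey": 763, "Amount": 10.0})
print(summary)  # {5453: 125.5, 763: 50.0}
```

A query that asks for total sales per product can read `summary` directly instead of rescanning `fact_sales`, which mirrors what automatic optimizer matching achieves.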
Scaling to petabytes
Result set cache
Automatic query matching
Implicitly created from query activity
Resilient to cluster elasticity
Execution 1: cache miss, regular execution
Execution 2: cache hit, ~0.2 seconds
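The miss-then-hit behavior described for the result set cache can be sketched in a few lines of Python. This is an illustration only: the `ResultSetCache` class is hypothetical, it keys on the exact query text, and it leaves out the invalidation that a real cache performs when the underlying data changes.

```python
class ResultSetCache:
    """Toy cache that stores query results keyed by the exact query text."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def run(self, sql, execute):
        """Return a cached result if present, else execute and remember it."""
        if sql in self._store:
            self.hits += 1
            return self._store[sql]
        self.misses += 1
        result = execute(sql)
        self._store[sql] = result
        return result


calls = []


def slow_execute(sql):
    """Stand-in for a regular distributed execution (hypothetical)."""
    calls.append(sql)
    return [("Product A", 784943.00), ("Product B", 48723.00)]


cache = ResultSetCache()
query = "SELECT ProductName, SUM(Amount) FROM FactSales GROUP BY ProductName"
first = cache.run(query, slow_execute)   # execution 1: cache miss
second = cache.run(query, slow_execute)  # execution 2: cache hit
print(cache.misses, cache.hits, len(calls))  # 1 1 1
```

The cache is filled implicitly by the first execution, so nothing has to be created up front; the second run never reaches `slow_execute`.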
17
Scaling to petabytes
Materialized Views
Transactionally consistent with data modification
Automatic query optimizer matching
SELECT
c_customerkey,
c_nationkey,
SUM(l_quantity),
SUM(l_extendedprice)
FROM [dbo].[lineitem_MonthPartition] l
INNER JOIN [dbo].[orders] o on o.o_orderkey = l.l_orderkey
INNER JOIN [dbo].[customer] c on c.c_customerkey = o.o_customerkey
GROUP BY
c_customerkey,
c_nationkey
Table distributions:
[dbo].[lineitem_MonthPartition]  HASH(l_orderkey)
[dbo].[orders]  HASH(o_orderkey)
[dbo].[customer]  HASH(c_customerkey)
LineItem and Orders: collocated join (distribution aligned)
Customer: non-collocated join (shuffle required)
FROM [dbo].[lineitem_MonthPartition] l
INNER JOIN [dbo].[orders] o on o.o_orderkey = l.l_orderkey
INNER JOIN [dbo].[customer] c on c.c_customerkey = o.o_customerkey
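Whether a join is collocated follows from comparing the join columns with each table's distribution column. The Python sketch below applies that rule to the three tables listed above. The `join_kind` helper is hypothetical, and it ignores replicated tables and data-type compatibility, both of which matter in practice.

```python
# Distribution columns as listed under "Table distributions".
DISTRIBUTION = {
    "lineitem_MonthPartition": "l_orderkey",
    "orders": "o_orderkey",
    "customer": "c_customerkey",
}


def join_kind(left_table, left_col, right_table, right_col, distribution=DISTRIBUTION):
    """Classify a join as collocated or needing a shuffle.

    A join is collocated only when both sides join on their own
    distribution column, so matching rows already share a distribution.
    """
    aligned = (
        distribution.get(left_table) == left_col
        and distribution.get(right_table) == right_col
    )
    return "collocated" if aligned else "shuffle required"


print(join_kind("lineitem_MonthPartition", "l_orderkey", "orders", "o_orderkey"))
print(join_kind("orders", "o_customerkey", "customer", "c_customerkey"))
```

The first join is collocated because both tables are hashed on the order key. The second needs a shuffle because `orders` is hashed on `o_orderkey`, not on the `o_customerkey` column used in the join.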
18
Stage 1: LineItem and Orders, collocated join (distribution aligned); the result is shuffled into #temp (Orders + Lineitem) (shuffle required)
Stage 2: #temp (Orders + Lineitem) and Customer, collocated join (distribution aligned); Nation, collocated join (replicate aligned)
CREATE MATERIALIZED VIEW mvw_CustomerSales
WITH (DISTRIBUTION = HASH(o_custkey))
AS
SELECT
o_custkey,
l_shipdate,
SUM(l_quantity) AS l_quantity,
SUM(l_extendedprice) AS l_extendedprice
FROM [dbo].[lineitem_MonthPartition] l
INNER JOIN [dbo].[orders] o on o.o_orderkey = l.l_orderkey
WHERE
l_shipdate >= CONVERT(DATETIME, '1998-11-01', 103)
GROUP BY
o_custkey,
l_shipdate
19
Legend: mvw_CustomerSales, Customer, Nation <replicated table>
mvw_CustomerSales and Customer: collocated join (distribution aligned)
Nation: collocated join (replicate aligned)
[Chart: Query Execution Time (seconds). No materialized view: 275; with materialized view: 5.]
20
Power BI
Materialized Views
Tables
Scaling to petabytes
Power BI
DirectQuery
Composite Models
Aggregation Tables

Data warehouse with Azure Synapse Analytics
