Lakehouse in Azure
Sergio Zenatti Filho
Sr Cloud Solution Architect - Data & Analytics
@Microsoft
Sergio has over 20 years of experience designing and
delivering Data and Analytics Solutions. He has extensive
experience in the Microsoft Data and Analytics Platform in the
cloud and also on-premises. Sergio is passionate about
learning new technology and helping customers to define the
best solution for their business.
©Microsoft Corporation
Azure
Agenda
• Lakehouse
• Delta Lake
• Ingestion and Transformation
• Architecture
• Power BI
• Next Steps
• Q&A
Data Warehouse and Data Lake
Data Warehouse
• Has powered BI for over 30 years
• Purpose-built for BI and reporting
• Limited support for semi-structured and unstructured data
• Limited support for streaming
[Diagram labels: External Data, Operational Data, ETL, Data Warehouses, BI, Reports]
Data Lake
• Powered by technological advances in data storage
• Cheap to store any data
• Supports machine learning use cases
• Poor BI support
• Complex to set up
• Hard to append data
[Diagram labels: Structured, Semi-Structured and Unstructured Data, Data Lake, Data Prep and Validation, ETL, Real-Time Database, Reports, Data Warehouses, BI, Data Science, Machine Learning]
Lakehouse
[Diagram labels: Data Warehouse, Data Lake; Streaming Analytics, BI, Data Science, Machine Learning; Structured, Semi-Structured and Unstructured Data]
Key features:
• Transaction support
• Schema enforcement and governance
• Data reliability and consistency
• Low query latency and high reliability for BI and advanced analytics
• Optimized for machine learning and data science
• Enables end-to-end streaming
The Lakehouse platform combines the best elements of data lakes and data warehouses to deliver the reliability, strong governance and performance of data warehouses with the openness, flexibility and machine learning support of data lakes.
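Schema enforcement, one of the key features listed for the Lakehouse, can be illustrated with a small plain-Python sketch. This is a toy model and not the API of any Lakehouse engine: `SchemaMismatch`, `validate_batch`, `append_rows` and the `sales` table are hypothetical names made up for the example, and the rule used here (column names and types must match exactly) is an assumption that is stricter than what a real engine applies.

```python
from typing import Dict, List

Schema = Dict[str, type]


class SchemaMismatch(ValueError):
    """Raised when a batch does not match the declared table schema."""


def validate_batch(schema: Schema, rows: List[dict]) -> None:
    """Check column names and value types for every row in the batch."""
    for i, row in enumerate(rows):
        if set(row) != set(schema):
            raise SchemaMismatch(
                f"row {i}: columns {sorted(row)} do not match {sorted(schema)}"
            )
        for column, expected in schema.items():
            if not isinstance(row[column], expected):
                raise SchemaMismatch(
                    f"row {i}: column {column!r} expected {expected.__name__}, "
                    f"got {type(row[column]).__name__}"
                )


def append_rows(table: List[dict], schema: Schema, rows: List[dict]) -> None:
    """Validate the whole batch before writing, so a bad row rejects all of it."""
    validate_batch(schema, rows)
    table.extend(dict(row) for row in rows)


# Hypothetical table and schema, used only for illustration.
sales_schema: Schema = {"id": int, "city": str}
sales: List[dict] = []
append_rows(sales, sales_schema, [{"id": 1, "city": "Perth"}])
```

Because the batch is validated before anything is written, a batch containing one bad row leaves the table unchanged, which is the behaviour that keeps downstream BI queries from seeing half-written or mistyped data.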
Delta Lake
Key features:
• ACID Transactions
• Scalable Metadata
• Unified Streaming and Batch
• Schema Evolution / Enforcement
• Time Travel
• Upserts and deletes
Delta Lake is an open-source project that enables building a Lakehouse architecture on top of data lakes.
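The way transactions, time travel, upserts and deletes fit together can be sketched with an append-only commit log in plain Python. This is a simplified in-memory model written for illustration, not Delta Lake code: `ToyVersionedTable` and its methods are hypothetical names, and the sketch ignores files, concurrency and metadata handling.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]


@dataclass
class ToyVersionedTable:
    """In-memory stand-in for a table backed by an append-only commit log."""

    key: str
    # One entry per commit; each commit is a list of (operation, payload) actions.
    commits: List[List[Tuple[str, object]]] = field(default_factory=list)

    def upsert(self, rows: List[Row]) -> int:
        """Record one commit that inserts new keys and overwrites existing ones."""
        for row in rows:
            if self.key not in row:
                raise KeyError(f"row is missing key column {self.key!r}")
        self.commits.append([("put", dict(row)) for row in rows])
        return len(self.commits) - 1  # version number of this commit

    def delete(self, key_values: List[object]) -> int:
        """Record one commit that removes the given keys."""
        self.commits.append([("delete", value) for value in key_values])
        return len(self.commits) - 1

    def snapshot(self, version: Optional[int] = None) -> Dict[object, Row]:
        """Replay commits 0..version (latest when None) to rebuild the table."""
        if version is None:
            version = len(self.commits) - 1
        elif not 0 <= version < len(self.commits):
            raise ValueError(f"unknown version {version}")
        state: Dict[object, Row] = {}
        for actions in self.commits[: version + 1]:
            for operation, payload in actions:
                if operation == "put":
                    state[payload[self.key]] = payload
                else:
                    state.pop(payload, None)
        return state


# Hypothetical data, used only for illustration.
table = ToyVersionedTable(key="id")
v0 = table.upsert([{"id": 1, "city": "Perth"}, {"id": 2, "city": "Auckland"}])
v1 = table.upsert([{"id": 2, "city": "Sydney"}, {"id": 3, "city": "Hobart"}])
v2 = table.delete([1])
latest = table.snapshot()      # state after all three commits
as_of_v0 = table.snapshot(v0)  # "time travel" back to the first commit
```

Each write is a single entry in the log, so a reader sees either the whole commit or none of it, and any earlier version can be rebuilt by replaying the log up to that point.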
Demo
Delta Lake
Data Ingestion and Transformation
Power BI
Data Ingestion
Azure Synapse Pipeline or Azure Data Factory
• 90+ data sources including files, databases, SaaS, PaaS and more
• Copy activity: supports the Azure Databricks Delta Lake connector to copy data from any supported source to a Delta Lake table, and from a Delta Lake table to any supported sink data store.
• Mapping Data Flow: supports the generic Delta format on Azure Storage as source and sink to read and write Delta files for code-free ETL, and runs on the managed Azure Integration Runtime.
Databricks
• Data Formats: Delta Lake, Parquet, ORC, JSON, CSV, Avro, Text and Binary
• Data Sources: SQL Server, MariaDB, MySQL, PostgreSQL, Azure Synapse Analytics, Azure Cosmos DB, MongoDB, Cassandra, Couchbase, Elasticsearch, Neo4j, Redis, Snowflake and more.
Other Solutions
• Event Hub
• IoT Hub
• SQL Server BCP (bulk copy program)
• PolyBase
• SAP Data Services
• Informatica
• Striim
• Fivetran
• Qlik
• Confluent
Data Transformation
Databricks
• Spark notebooks using Python, Scala, SQL and R
Synapse Spark
• Spark notebooks using Python, Scala, Spark SQL, C# and R (Preview)
Azure Synapse Pipeline and Azure Data Factory
• Mapping data flows: visually designed data transformations in Azure Data Factory and Azure Synapse Pipeline
• External transformations: Azure Synapse Notebook and Databricks.
Architecture
Lakehouse Architecture - Databricks
Lakehouse Architecture – Azure Synapse
Lakehouse Architecture – Azure Synapse and Databricks
Power BI
Databricks
• Databricks (Beta): connector for Databricks SQL Warehouse running on AWS and using OAuth
• Azure Databricks: connector for Databricks SQL Warehouse in Azure, or on AWS but not using OAuth
• Authentication using Personal Access Token or OAuth
Azure Synapse
• Azure Synapse Analytics SQL: connector for Lake DB (Spark), Serverless DB and Dedicated SQL Pool
• Azure Synapse Analytics workspace (beta): connector for Lake DB (Spark), Serverless DB and Dedicated SQL Pool
• Authentication using Microsoft Account, Windows and Database
Delta Sharing
• Import mode only
• Authentication using token
Delta.io connector (Open Source)
• Reads Delta Lake tables natively in Power BI
• Supports all storage systems that are supported by Power BI
https://github.com/delta-io/connectors/tree/master/powerbi
What next?
• Free training - Databricks Lakehouse Fundamentals: https://www.databricks.com/learn/training/lakehouse-fundamentals
• Free training - Use Delta Lake in Azure Synapse Analytics: https://learn.microsoft.com/en-us/training/modules/use-delta-lake-azure-synapse-analytics/
• Solution Accelerator for Financial Analytics: https://github.com/microsoft/Azure-Databricks-Solution-Accelerator-Financial-Analytics-Customer-Revenue-Growth-Factor
• Open Education Analytics: https://github.com/microsoft/OpenEduAnalytics
• Delta Lake: https://delta.io/
• Dynamics 365 Finance and Operations Apps - Export to data lake: https://github.com/microsoft/Dynamics-365-FastTrack-Implementation-Assets/tree/master/Analytics/ArchitecturePatterns
© Copyright Microsoft Corporation. All rights reserved.
Q&A
Thank you!
Sergio Zenatti Filho - Sr Cloud Solution Architect at Microsoft
Email: zenatti@gmail.com
LinkedIn: https://www.linkedin.com/in/sergiozenatti/

 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 

Lakehouse in Azure

  • 1. Lakehouse in Azure Sergio Zenatti Filho Sr Cloud Solution Architect - Data & Analytics @Microsoft
  • 2. Sergio has over 20 years of experience designing and delivering Data and Analytics Solutions. He has extensive experience in the Microsoft Data and Analytics Platform in the cloud and also on-premises. Sergio is passionate about learning new technology and helping customers to define the best solution for their business. Sergio Zenatti Filho Senior Cloud Solution Architect at Microsoft Connect
  • 3. ©Microsoft Corporation Azure Agenda • Lakehouse • Delta Lake • Ingestion and Transformation • Architecture • Power BI • Next Steps • Q&A
  • 4. Data Warehouse and Data Lake. Data Warehouse: • Have powered BI for over 30 years • Purpose-built for BI and reporting • Limited support for semi-structured and unstructured data • Limited support for streaming. Data Lake: • Powered by technological advances in data storage • Cheap to store any data • Supports machine learning use cases • Poor BI support • Complex to set up • Hard to append data. [Diagram labels: External Data, Operational Data, ETL, Data Warehouses, BI, Reports; Structured, Semi-Structured and Unstructured Data, Data Lake, Data Prep and Validation, Real-Time Database, Data Science, Machine Learning]
  • 5. Lakehouse. The Lakehouse platform combines the best elements of data lakes and data warehouses to deliver the reliability, strong governance and performance of data warehouses with the openness, flexibility and machine learning support of data lakes. Key features: • Transaction support • Schema enforcement and governance • Data reliability and consistency • Low query latency and high reliability for BI and advanced analytics • Optimized for machine learning and data science • Enables end-to-end streaming. [Diagram labels: Data Warehouse, Data Lake, Streaming Analytics, BI, Data Science, Machine Learning; Structured, Semi-Structured and Unstructured Data]
  • 6. Delta Lake. Delta Lake is an open source project that enables building a Lakehouse architecture on top of data lakes. Key features: • ACID transactions • Scalable metadata • Unified streaming and batch • Schema evolution / enforcement • Time travel • Upserts and deletes
  • 7. Demo Delta Lake Data Ingestion and Transformation Power BI
  • 8. Data Ingestion. Azure Synapse Pipeline or Azure Data Factory, Databricks, Other Solutions • 90+ data sources including files, databases, SaaS, PaaS and more • Copy activity: supports the Azure Databricks Delta Lake connector to copy data from any supported source to a Delta Lake table, and from a Delta Lake table to any supported sink data store. • Mapping Data Flow: supports the generic Delta format on Azure Storage as source and sink to read and write Delta files for code-free ETL, and runs on the managed Azure Integration Runtime. • Data Formats: Delta Lake, Parquet, ORC, JSON, CSV, Avro, Text and Binary • Data Sources: SQL Server, MariaDB, MySQL, PostgreSQL, Azure Synapse Analytics, Azure Cosmos DB, MongoDB, Cassandra, Couchbase, Elasticsearch, Neo4j, Redis, Snowflake and more. • Event Hub • IoT Hub • SQL Server BCP (bulk copy program) • PolyBase • SAP Data Services • Informatica • Striim • Fivetran • Qlik • Confluent
  • 9. Data Transformation Databricks Synapse Spark Azure Synapse Pipeline and Azure Data Factory • Spark notebooks using Python, Scala, SQL and R • Spark Notebook using Python, Scala, Spark SQL, C# and R (Preview) • Mapping data flows: visually designed data transformations in Azure Data Factory and Azure Synapse Pipeline • External Transformations: Azure Synapse Notebook and Databricks.
  • 13. Lakehouse Architecture – Azure Synapse and Databricks
  • 14. Power BI. Azure Synapse, Databricks, Delta Sharing • Databricks (Beta): connector for Databricks SQL Warehouse running on AWS and using OAuth • Azure Databricks: for Databricks SQL Warehouse in Azure, or on AWS but not using OAuth • Authentication using Personal Access Token or OAuth • Azure Synapse Analytics SQL: connector for Lake DB (Spark), Serverless DB and Dedicated SQL Pool • Azure Synapse Analytics workspace (beta): connector for Lake DB (Spark), Serverless DB and Dedicated SQL Pool • Authentication using Microsoft Account, Windows and Database • Import mode only • Authentication using Token. Delta.io connector (open source) • Reads Delta Lake tables natively in Power BI • Supports all storage systems that are supported by Power BI: https://github.com/delta-io/connectors/tree/master/powerbi
  • 15. What next? • Free training - Databricks Lakehouse Fundamentals: https://www.databricks.com/learn/training/lakehouse-fundamentals • Free training - Use Delta Lake in Azure Synapse Analytics: https://learn.microsoft.com/en-us/training/modules/use-delta-lake-azure-synapse-analytics/ • Solution Accelerator for Financial Analytics: https://github.com/microsoft/Azure-Databricks-Solution-Accelerator-Financial-Analytics-Customer-Revenue-Growth-Factor • Open Education Analytics: https://github.com/microsoft/OpenEduAnalytics • Delta Lake: https://delta.io/ • Dynamics 365 Finance and Operations Apps - Export to data lake: https://github.com/microsoft/Dynamics-365-FastTrack-Implementation-Assets/tree/master/Analytics/ArchitecturePatterns
  • 16. © Copyright Microsoft Corporation. All rights reserved. Q&A Thank you! Sergio Zenatti Filho - Sr Cloud Solution Architect at Microsoft Email: zenatti@gmail.com LinkedIn: https://www.linkedin.com/in/sergiozenatti/ Connect
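
The Delta Lake features listed on slide 6 (transactions, schema enforcement, time travel, upserts and deletes) can be illustrated with a toy, in-memory model. This is only a sketch of the idea of an append-only version log, written in plain Python; it is not Delta Lake's implementation or API, and the class `ToyVersionedTable` and its methods are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVersionedTable:
    """Toy table backed by an append-only list of snapshots (illustrative only)."""
    key: str                                   # name of the primary-key column
    columns: frozenset                         # enforced schema (column names only)
    _log: list = field(default_factory=list)   # one full snapshot per committed version

    def _check_schema(self, row):
        # Schema enforcement: every row must have exactly the declared columns.
        if set(row) != set(self.columns):
            raise ValueError(f"schema mismatch: got {sorted(row)}")

    def snapshot(self, version=None):
        # Time travel: read the table as of any committed version (latest by default).
        if not self._log:
            return {}
        if version is None:
            version = len(self._log) - 1
        return dict(self._log[version])

    def upsert(self, rows):
        # Validate everything first so a bad batch commits nothing (all-or-nothing).
        for row in rows:
            self._check_schema(row)
        merged = self.snapshot()
        merged.update({row[self.key]: dict(row) for row in rows})
        self._log.append(merged)
        return len(self._log) - 1               # new version number

    def delete(self, key_value):
        # A delete is just another committed version; older versions stay readable.
        remaining = self.snapshot()
        remaining.pop(key_value, None)
        self._log.append(remaining)
        return len(self._log) - 1


table = ToyVersionedTable(key="id", columns=frozenset({"id", "amount"}))
v0 = table.upsert([{"id": 1, "amount": 10}, {"id": 2, "amount": 20}])
v1 = table.upsert([{"id": 2, "amount": 25}, {"id": 3, "amount": 30}])  # update 2, insert 3
v2 = table.delete(1)

print(table.snapshot(v0))  # the table as it was after the first commit
print(table.snapshot())    # the latest version
```

Storing a full snapshot per version keeps the sketch short. A real table format would record only the changes made by each commit and keep the data in files on the data lake.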