Microsoft Fabric
A unified analytics solution for the era of AI
James Serra
Industry Advisor
Microsoft, Federal Civilian
jamesserra3@gmail.com
6/16/23
About Me
 Microsoft, Data & AI Solution Architect in Microsoft Federal Civilian
At Microsoft for most of the last nine years as a Data & AI Architect, with a brief stop at EY
 In IT for 35 years, worked on many BI and DW projects
 Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM
architect, PDW/APS developer
 Been perm employee, contractor, consultant, business owner
 Presenter at PASS Summit, SQLBits, Enterprise Data World conference, Big Data Conference
Europe, SQL Saturdays, Informatica World
 Blog at JamesSerra.com
 Former SQL Server MVP
 Author of book “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse,
Data Fabric, Data Lakehouse, and Data Mesh”
My upcoming book
- Foundation
- Big data
- Types of data architectures
- Architecture Design Session
- Common data architecture concepts
- Relational Data Warehouse
- Data Lake
- Approaches to Data Stores
- Approaches to Design
- Approaches to Data Modeling
- Approaches to Data Ingestion
- Data Architectures
- Modern Data Warehouse (MDW)
- Data Fabric
- Data Lakehouse
- Data Mesh Foundation
- Data Mesh Adoption
- People, Process, and Technology
- People and process
- Technologies
- Data architectures on Microsoft Azure
First two chapters available now:
Deciphering Data Architectures (oreilly.com)
Table of contents
Agenda
 What is Microsoft Fabric?
 Workspaces and capacities
 OneLake
 Lakehouse
 Data Warehouse
 ADF
 Power BI / DirectLake
 Resources
 Not covered:
 Real-time analytics
 Spark
 Data science
 Fabric capacities
 Billing / Pricing
 Reflex / Data Activator
 Git integration
 Admin monitoring
 Purview integration
 Data mesh
 Copilot
Microsoft Fabric does it all—in a unified solution
An end-to-end analytics platform that brings together all the data and analytics tools that
organizations need to go from the data lake to the business user
Data Integration
Data Factory
Data Engineering
Synapse
Data Warehouse
Synapse
Data Science
Synapse
Real Time Analytics
Synapse
Business Intelligence
Power BI
UNIFIED
SaaS product experience
Unified data foundation
OneLake
Observability
Data Activator
Security and governance Compute and storage Business model
Onboarding and trials
Sign-on
Navigation model
UX model
Workspace organization
Collaboration experience
Data Lake
Storage format
Data copy for all engines
Security model
CI/CD
Monitoring hub
Data Hub
Governance & compliance
Single…
The Intelligent data foundation
AI Assisted
Shared Workspaces
Universal Compute Capacities
OneSecurity
OneLake
Data
Factory
Synapse Data
Engineering
Synapse Data
Science
Synapse Data
Warehousing
Synapse Real
Time Analytics Power BI
Data
Activator
Microsoft Fabric
The data platform for the era of AI
SaaS
Frictionless onboarding
Quick results w/ Intuitive UX
Minimal knobs
Auto optimized
Auto Integrated
Tenant-wide governance
Instant Provisioning
5x5
Centralized security
management
Compliance built-in
Centralized
administration
Success
by Default
5 seconds to signup, 5 minutes to wow
Old vs New
Understanding Microsoft Fabric / FAQ
• Think of it as taking the PBI workspace and adding a SaaS version of Synapse to it
• You will wake up one day and PBI workspaces will be automatically migrated to Fabric workspaces: PBI capacities will become Fabric capacities. Your PBI tenant will have the Fabric workloads automatically built-in
• Aligned to a backend Fabric capacity, similar to a Power BI capacity: a specific amount of compute assigned to it. A universal bucket of compute. No more Synapse DWUs, Spark clusters, etc.
• Serverless Pool and Dedicated Pool are combined into one – no more relational storage or dedicated resources. Everything is serverless. It's all about the data lakehouse
• No Azure portal, subscriptions, or creating storage. Users won't even realize they are using Azure
• Fabric has a strong separation between the person who buys and pays the bill and the person who builds solutions. In Azure, the person building the solution also has to have the power to buy
• This is not just for departmental use. It’s not PaaS services (i.e., Synapse) vs Fabric. Fabric is the future.
Fabric is going to run your entire data estate: departmental projects as well as the largest data warehouse,
data lakehouses and data science projects
• One platform for enterprise data professional and citizen developer (next slide)
•Quickly tune a custom model by
integrating a model built and trained in
Azure ML in a Spark notebook
•Work faster with the ability to use your preferred data science frameworks, languages, and tools
•Bypass engineering dependencies with the ability to use your preferred no-code MLOps to deploy and operate models in production
•Tap into proven-at-scale models and services to accelerate your AI differentiation (AOAI, Cognitive Services, ONNX integration, etc.)
•Avoid slow, progress-stagnating
data wrangling by seamlessly triggering
a workflow that can unlock data
engineering tools and capabilities quickly.
•Accelerate your work with visual and
SQL based tools for self-serve data
transformations and modeling as well as
self-serve tools for reporting, dashboards,
and data visualizations
•Turn data into impact with industry-leading BI tools and integration with the apps your people use every day, like Microsoft 365
•Make more data-driven decisions
with actionable insights and intelligence
in your preferred applications
•Maintain access to all the data you need, without being overwhelmed by data ancillary to your role, thanks to fine-grained data access management controls
Data Engineers
•Execute faster with the ability to spin up
a Spark VM cluster in seconds, or
configure with familiar experiences like
Git DevOps pipelines for data
engineering artifacts
•Streamline your work with a single
platform to build and operate real-time
analytics pipelines, data lakes, lake
houses, warehouses, marts, and cubes
using your preferred IDE, plug-ins, and
tools.
•Reduce costly data replication and
movement with the ability to produce
base datasets that can serve data analysts
and data scientists without needing to
build pipelines
Supporting experiences:
Data Scientists
Supporting experiences
Data Analysts
Supporting experiences
Data Citizens
Supporting experiences
Serve data via
warehouse or
lakehouse
Serve
transformed
data
Serve insights
via
embedding
Serve data via warehouse or lakehouse
Data Stewards
•Maintain visibility and control of costs with a unified consumption and cost model that provides always-current visibility into spend across your end-to-end data estate
•Gain full visibility and governance over your entire analytics estate from data sources and connections to your data lake, to users and their insights
Data Factory
Real-time analytics
Data Warehouse
Data Engineering
Data
Warehouse
Power BI
Real-time
analytics
Data Science Azure ML Power BI Microsoft 365
Workspaces and capacities
Company examples
Create fabric capacity
Capacity is a dedicated set of resources reserved for exclusive use. It offers dependable,
consistent performance for your content. Each capacity offers a selection of SKUs, and
each SKU provides different resource tiers for memory and computing power. You pay
for the provisioned capacity whether you use it or not.
A capacity is a quota-based system, and scaling up or down a capacity doesn't involve
provisioning compute or moving data, so it’s instant.
Once the capacity is created, you can see it in the Admin portal, on the Capacity settings pane under the "Fabric Capacity" tab
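The quota-based model described above can be sketched in a few lines of Python. The SKU-to-CU numbers here are illustrative assumptions, not official Fabric SKU or pricing data; the point is that resizing only changes a quota number, with no compute provisioning or data movement:

```python
# Sketch of Fabric's capacity quota model: a capacity is a pool of
# Capacity Units (CUs) shared by all workloads, and scaling up or down
# just changes the quota -- nothing is provisioned and no data moves.
# The SKU-to-CU mapping below is a hypothetical illustration.

FABRIC_SKUS = {"F2": 2, "F4": 4, "F64": 64, "F128": 128}

class FabricCapacity:
    def __init__(self, sku: str):
        self.sku = sku
        self.cu = FABRIC_SKUS[sku]

    def resize(self, new_sku: str) -> None:
        """Scaling is instant: only the quota number changes."""
        self.sku = new_sku
        self.cu = FABRIC_SKUS[new_sku]

cap = FabricCapacity("F2")
cap.resize("F64")
print(cap.cu)  # 64
```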
Turning on Microsoft Fabric
Enable Microsoft Fabric for your organization - Microsoft Fabric | Microsoft Learn
Demo
OneLake
OneLake for all data
“The OneDrive for data”
A single unified logical SaaS data lake for
the whole organization (no silos)
Organize data into domains
Foundation for all Fabric data items
Provides full and open access through
industry standard APIs and formats to any
application (no lock-in)
OneLake
One Copy
One Security
OneLake Data Hub Intelligent data fabric
Data
Factory
Synapse Data
Warehousing
Synapse Data
Engineering
Synapse Data
Science
Synapse Real
Time Analytics
Power BI
Data
Activator
One Copy for all computes
Real separation of compute and storage
No matter which engine or item you use,
everyone contributes to building the same lake.
Engines are being optimized to work with
Delta Parquet as their native format
Compute powers the applications and
experiences in Fabric. The compute is
separate from the storage.
Multiple compute engines are available, and
all engines can access the same data without
needing to import or export it. You are able
to choose the right engine for the right job.
Non-Fabric engines can also read/write
to the same copy of data using the
ADLS APIs or added through shortcuts
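Because OneLake is exposed through ADLS Gen2-compatible endpoints, any engine that speaks the ADLS APIs can address the same copy of data. As a sketch, a small helper that builds the documented abfss-style OneLake path (the workspace and item names below are hypothetical):

```python
# OneLake is addressable with ADLS Gen2-style paths, so non-Fabric
# engines can read/write the same single copy of data. Documented form:
#   abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<ItemType>/<path>
# "Sales" and "Customer360" below are made-up example names.

def onelake_path(workspace: str, item: str, item_type: str, rel_path: str) -> str:
    return (f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
            f"{item}.{item_type}/{rel_path}")

p = onelake_path("Sales", "Customer360", "Lakehouse", "Tables/orders")
print(p)
# abfss://Sales@onelake.dfs.fabric.microsoft.com/Customer360.Lakehouse/Tables/orders
```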
Unified management and governance
Workspace A
Warehouse
Finance
Lakehouse
Customer
360
Workspace B
Lakehouse
Service
telemetry
Warehouse
Business
KPIs
Data
Factory
Synapse Data
Warehousing
Synapse Data
Engineering
Synapse Data
Science
Synapse Real
Time Analytics
Power BI
Data
Activator
T-SQL Spark
Analysis
services
KQL
Shortcuts virtualize data across domains and clouds
No data movements or duplication
A shortcut is a symbolic link which points
from one data location to another
Create a shortcut to make data from a
warehouse part of your lakehouse
Create a shortcut within Fabric to consolidate
data across items or workspaces without
changing the ownership of the data. Data can be
reused multiple times without data duplication.
Existing ADLS gen2 storage accounts and
Amazon S3 buckets can be managed
externally to Fabric and Microsoft while still
being virtualized into OneLake with shortcuts
All data is mapped to a unified namespace
and can be accessed using the same APIs
including the ADLS Gen2 DFS APIs
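Conceptually, a shortcut is just a named pointer that gets resolved to the physical location at access time, whether that location is another workspace, ADLS Gen2, or S3. A minimal pure-Python sketch of that resolution (the workspace, lakehouse, and bucket names are made up for illustration):

```python
# A shortcut behaves like a symbolic link: an entry in one location that
# resolves to data stored elsewhere, without copying or moving it.
# Illustrative sketch only; paths and names are hypothetical.

shortcuts = {
    "WorkspaceA/Lakehouse/Files/telemetry":
        "WorkspaceB/Lakehouse/Files/service_telemetry",   # cross-workspace
    "WorkspaceA/Lakehouse/Files/raw":
        "s3://external-bucket/raw",                       # external (S3)
}

def resolve(path: str) -> str:
    """Return the physical location a logical path points to (one hop)."""
    for link, target in shortcuts.items():
        if path == link or path.startswith(link + "/"):
            return target + path[len(link):]
    return path  # not a shortcut: the path is already physical

print(resolve("WorkspaceA/Lakehouse/Files/raw/2023/06/events.json"))
# s3://external-bucket/raw/2023/06/events.json
```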
Unified management and governance
Workspace A
Warehouse
Finance
Lakehouse
Customer
360
Workspace B
Lakehouse
Service
telemetry
Warehouse
Business
KPIs
Amazon
Azure
Data
Factory
Synapse Data
Warehousing
Synapse Data
Engineering
Synapse Data
Science
Synapse Real
Time Analytics
Power BI
Data
Activator
OneLake Scenarios
OneLake Data Hub
Discover, manage and use data in one place
Central location within Fabric to discover,
manage, and reuse data
Data can be easily discovered by its domain
(e.g. Finance) so users can see what matters
for them
Explorer capability to easily browse and find
data by its folder (workspace) hierarchy
Efficient data discovery using search, filter
and sort
Lakehouse
Lakehouse
Data Source
Shortcut Enabled
Structured /
Unstructured
Ingestion
Shortcuts
Pipelines &
Dataflows
Store
Lakehouse(s)
Transform
Notebooks &
Dataflows
Expose
PBI
Lake Warehouse
Lakehouse – Lakehouse mode
Tables - This is a virtual view of the managed area in your lake. It is the main container for tables of all types (CSV, Parquet, Delta, managed tables, and external tables). All tables, whether automatically or explicitly created, show up as tables under the managed area of the Lakehouse. This area can also include any type of file or folder/subfolder organization.
Files - This is a virtual view of the unmanaged area in your lake. It can contain any files and folder/subfolder structures. The main distinction between the managed and unmanaged areas is the automatic Delta table detection process that runs over any folders created in the managed area. Any Delta-format files (Parquet + transaction log) will be automatically registered as a table and will also be available from the serving layer (T-SQL)
Automatic Table Discovery and Registration
Lakehouse table automatic discovery and registration is a feature of the lakehouse that provides a fully managed file-to-table experience for data engineers and data scientists. Users can drop a file into the managed area of the lakehouse, and the file will be automatically validated for supported structured formats (currently only Delta tables) and registered into the metastore with the necessary metadata such as column names, formats, compression, and more. Users can then reference the file as a table and use Spark SQL syntax to interact with the data.
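The detection rule can be illustrated with a small sketch: a folder in the managed area counts as a Delta table when it contains a Delta transaction log (`_delta_log`). This illustrates the concept only; it is not Fabric's actual implementation:

```python
# Sketch of managed-area table auto-discovery: any folder containing a
# Delta transaction log (_delta_log) is treated as a table and would be
# registered in the metastore. Concept illustration only.
import pathlib
import tempfile

def discover_tables(managed_root: pathlib.Path) -> list[str]:
    """Return names of folders that look like Delta tables."""
    return sorted(p.parent.name
                  for p in managed_root.rglob("_delta_log")
                  if p.is_dir())

root = pathlib.Path(tempfile.mkdtemp())
(root / "orders" / "_delta_log").mkdir(parents=True)  # looks like a Delta table
(root / "scratch_csvs").mkdir()                       # plain folder: ignored
print(discover_tables(root))  # ['orders']
```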
Lakehouse – SQL endpoint mode
Lakehouse – shortcuts (to lakehouse)
Workspaces and capacities accessing OneLake
Each tenant will have only one OneLake, and any tenant can
access files in a OneLake from other tenants via shortcuts
Lakehouse
Sales
Demo
Data Warehouse
Data warehouse
Data Source
Shortcut Enabled
Structured /
Unstructured
Ingestion
Mounts
Pipelines &
Dataflows
Store
Data Warehouse
Transform
Procedures
Expose
PBI
Warehouse
Synapse Data Warehouse
Infinitely scalable and open
Synapse Data Warehouse in Fabric
Infinite serverless compute
Open Storage Format
in customer owned Data Lake
Relational Engine
Data
Warehouse
Data
Warehouse
Data
Warehouse
Data
Warehouse
1 Open standard format in an open
data lake replaces proprietary
formats as the native storage
• First transactional data warehouse natively
embracing an open standard format
• Data is stored in Delta – Parquet with no
vendor lock-in
• Is auto-integrated and auto-optimized with
minimal knobs
• Extends full SQL ecosystem benefits
Infinite serverless compute
Open Storage Format
in customer owned Data Lake
Relational Engine
Data
Warehouse
Data
Warehouse
Data
Warehouse
Data
Warehouse
Synapse Data Warehouse
Infinitely scalable and open
Synapse Data Warehouse in Fabric
2
2 Dedicated clusters are replaced by
serverless compute infrastructure
• Physical compute resources assigned
within milliseconds to jobs
• Infinite scaling with dynamic resource
allocation tailored to data volume and
query complexity
• Instant scaling up/down with no physical
provisioning involved
• Resource pooling providing significant
efficiencies and pricing
Workspaces and capacities accessing OneLake
Each tenant will have only one OneLake, and any tenant can
access files in a OneLake from other tenants via shortcuts
Warehouse
Sales
Data Warehouse
Use this to build a relational layer on top of the physical data
in the Lakehouse and expose it to analysis and reporting
tools using T-SQL/TDS end-point.
This offers a transactional data warehouse with T-SQL DML
support, stored procedures, tables, and views
How can I control “bad actor” queries?
Fabric compute is designed to automatically classify queries
to allocate resources and ensure high priority queries (i.e. ETL,
data preparation, and reporting) are not impacted by
potentially poorly written ad hoc queries.
How is the classification for an incoming query determined?
Queries are intelligently classified by a combination of the
source (i.e., pipeline vs. Power BI) and the query type (i.e.,
INSERT vs. SELECT)
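A sketch of this classification idea, so ad hoc SELECTs cannot starve ETL or reporting workloads. The specific rules, source names, and priority labels below are illustrative assumptions, not Fabric internals:

```python
# Illustrative sketch of query classification by source and query type.
# High-priority work (ETL, data prep, reporting) is kept isolated from
# potentially poorly written ad hoc queries. Rules are hypothetical.

def classify(source: str, sql: str) -> str:
    verb = sql.lstrip().split()[0].upper()
    if source == "pipeline" or verb in ("INSERT", "UPDATE", "DELETE", "COPY"):
        return "high-priority"   # ETL / data preparation
    if source == "powerbi":
        return "high-priority"   # reporting
    return "ad-hoc"              # isolated so it cannot impact the above

print(classify("pipeline", "INSERT INTO dim_date SELECT ..."))  # high-priority
print(classify("ssms", "SELECT * FROM fact_sales"))             # ad-hoc
```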
Where is the physical storage for the Data Warehouse? All
data for Fabric is stored in OneLake in the open Delta format.
A single COPY of the data is therefore exposed to all the
compute engines of Fabric without needing to move or
duplicate data
Access via other tools
Demo
Microsoft Fabric
Advancing Analytics
Why two options?
Delta Lake shortcomings:
- No multi-table transactions
- Lack of full T-SQL support (no
updates, limited reads)
- Performance problems with trickle
transactions
ADF
ADF Review Mapping data flows Wrangling data flows
Data Pipelines Don’t
Exist
Dataflow Gen2
Dataflow Gen1
Data Factory in Fabric
What is Dataflows Gen2?
This is the next generation of Dataflows Gen1. Dataflows provide a low-code
interface for ingesting data from hundreds of data sources, transforming your data
using 300+ data transformations, and loading the results into multiple
destinations such as Azure SQL Database, Lakehouse, and more
We currently have multiple Dataflows experiences: Power BI Dataflows
Gen1, Power Query Dataflows, and ADF data flows. What is the strategy for
these various experiences in Fabric?
Our goal is to evolve over time to a single Dataflow that combines the ease of
use of PBI and Power Query with the scale of ADF
What are Fabric pipelines?
Fabric pipelines enable powerful workflow capabilities at cloud scale. With data
pipelines, you can build complex workflows that refresh your dataflows, move
petabyte-scale data, and define sophisticated control flow. Use data pipelines to
build complex ETL and Data Factory workflows that perform a number of
different tasks at scale. Control flow capabilities are built into pipelines,
letting you build workflow logic with loops and conditionals.
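The control-flow idea (loop and conditional activities wrapped around copy and refresh steps) can be sketched in plain Python; the activity names below are hypothetical:

```python
# Sketch of pipeline control flow: a ForEach-style loop over tables with
# an If-condition choosing the copy mode, followed by a downstream
# dataflow refresh. Activity names are made up for illustration.

def run_pipeline(tables: list[str], full_load: bool) -> list[str]:
    log = []
    for t in tables:                      # ForEach-style loop activity
        if full_load:                     # If-condition activity
            log.append(f"copy-full:{t}")
        else:
            log.append(f"copy-incremental:{t}")
    log.append("refresh-dataflow")        # downstream activity
    return log

print(run_pipeline(["orders", "customers"], full_load=False))
# ['copy-incremental:orders', 'copy-incremental:customers', 'refresh-dataflow']
```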
Power BI / DirectLake
For best performance you should compress the data using
the V-Order compression method (50%–70% better
compression). ADF stores data this way by default
Should I use Fabric now?
 Yes, for prototyping
 Yes, if you won’t be in production for several months
 You have to be OK with bugs, missing features, and possible performance issues
Don't use it if you have hundreds of terabytes
If building in Synapse, how to make transition to Fabric smooth?
 Do not use dedicated pools, unless needed for serving and performance
 Don’t use any stored procedures to modify data in dedicated pools
Use ADF for pipelines and for Power Query, and don't use ADF mapping data flows. Don't use
Synapse pipelines or mapping data flows
 Embrace the data lakehouse architecture
Resources
Microsoft Fabric webinar series: https://aka.ms/fabric-webinar-series
New documentation: https://aka.ms/fabric-docs. Check out the tutorials.
Data Mesh, Data Fabric, Data Lakehouse – (video from Toronto Data Professional Community on 2/15/23)
Build videos:
Build 2-day demos
Microsoft Fabric Synapse data warehouse, Q&A
My intro blog on Microsoft Fabric (with helpful links at the bottom)
Fabric notes
Advancing Analytics videos
Ask me Anything (AMA) about Microsoft Fabric!
Q & A ?
James Serra, Microsoft, Industry Advisor
Email me at: jamesserra3@gmail.com
Follow me at: @JamesSerra
Link to me at: www.linkedin.com/in/JamesSerra
Visit my blog at: JamesSerra.com

Power BI for Big Data and the New Look of Big Data SolutionsPower BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data Solutions
James Serra
 
How to build your career
How to build your careerHow to build your career
How to build your career
James Serra
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
James Serra
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed Instance
James Serra
 
What’s new in SQL Server 2017
What’s new in SQL Server 2017What’s new in SQL Server 2017
What’s new in SQL Server 2017
James Serra
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
James Serra
 
Learning to present and becoming good at it
Learning to present and becoming good at itLearning to present and becoming good at it
Learning to present and becoming good at it
James Serra
 
What's new in SQL Server 2016
What's new in SQL Server 2016What's new in SQL Server 2016
What's new in SQL Server 2016
James Serra
 
Introducing DocumentDB
Introducing DocumentDB Introducing DocumentDB
Introducing DocumentDB
James Serra
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
James Serra
 
Overview on Azure Machine Learning
Overview on Azure Machine LearningOverview on Azure Machine Learning
Overview on Azure Machine Learning
James Serra
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
James Serra
 

More from James Serra (20)

Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Power BI Overview
Power BI OverviewPower BI Overview
Power BI Overview
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
 
Power BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data SolutionsPower BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data Solutions
 
How to build your career
How to build your careerHow to build your career
How to build your career
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed Instance
 
What’s new in SQL Server 2017
What’s new in SQL Server 2017What’s new in SQL Server 2017
What’s new in SQL Server 2017
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Learning to present and becoming good at it
Learning to present and becoming good at itLearning to present and becoming good at it
Learning to present and becoming good at it
 
What's new in SQL Server 2016
What's new in SQL Server 2016What's new in SQL Server 2016
What's new in SQL Server 2016
 
Introducing DocumentDB
Introducing DocumentDB Introducing DocumentDB
Introducing DocumentDB
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
 
Overview on Azure Machine Learning
Overview on Azure Machine LearningOverview on Azure Machine Learning
Overview on Azure Machine Learning
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 

Recently uploaded

De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 

Recently uploaded (20)

De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 

Microsoft Fabric Introduction

  • 1. Microsoft Fabric A unified analytics solution for the era of AI James Serra Industry Advisor Microsoft, Federal Civilian jamesserra3@gmail.com 6/16/23
  • 2. About Me  Microsoft, Data & AI Solution Architect in Microsoft Federal Civilian  At Microsoft for most of the last nine years as a Data & AI Architect, with a brief stop at EY  In IT for 35 years, worked on many BI and DW projects  Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer  Been perm employee, contractor, consultant, business owner  Presenter at PASS Summit, SQLBits, Enterprise Data World conference, Big Data Conference Europe, SQL Saturdays, Informatica World  Blog at JamesSerra.com  Former SQL Server MVP  Author of the book “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh”
  • 3. My upcoming book - Foundation - Big data - Types of data architectures - Architecture Design Session - Common data architecture concepts - Relational Data Warehouse - Data Lake - Approaches to Data Stores - Approaches to Design - Approaches to Data Modeling - Approaches to Data Ingestion - Data Architectures - Modern Data Warehouse (MDW) - Data Fabric - Data Lakehouse - Data Mesh Foundation - Data Mesh Adoption - People, Process, and Technology - People and process - Technologies - Data architectures on Microsoft Azure First two chapters available now: Deciphering Data Architectures (oreilly.com) Table of contents
  • 4. Agenda  What is Microsoft Fabric?  Workspaces and capacities  OneLake  Lakehouse  Data Warehouse  ADF  Power BI / DirectLake  Resources  Not covered:  Real-time analytics  Spark  Data science  Fabric capacities  Billing / Pricing  Reflex / Data Activator  Git integration  Admin monitoring  Purview integration  Data mesh  Copilot
  • 5. Microsoft Fabric does it all—in a unified solution An end-to-end analytics platform that brings together all the data and analytics tools that organizations need to go from the data lake to the business user Data Integration Data Factory Data Engineering Synapse Data Warehouse Synapse Data Science Synapse Real Time Analytics Synapse Business Intelligence Power BI UNIFIED SaaS product experience Unified data foundation OneLake Observability Data Activator Security and governance Compute and storage Business model
  • 6. Onboarding and trials Sign-on Navigation model UX model Workspace organization Collaboration experience Data Lake Storage format Data copy for all engines Security model CI/CD Monitoring hub Data Hub Governance & compliance Single… The Intelligent data foundation AI Assisted Shared Workspaces Universal Compute Capacities OneSecurity OneLake Data Factory Synapse Data Engineering Synapse Data Science Synapse Data Warehousing Synapse Real Time Analytics Power BI Data Activator Microsoft Fabric The data platform for the era of AI
  • 7. SaaS Frictionless onboarding Quick results w/ Intuitive UX Minimal knobs Auto optimized Auto Integrated Tenant-wide governance Instant Provisioning 5x5 Centralized security management Compliance built-in Centralized administration Success by Default 5 seconds to signup, 5 minutes to wow
  • 9. Understanding Microsoft Fabric / FAQ • Think of it as taking the PBI workspace and adding a SaaS version of Synapse to it • You will wake up one day and PBI workspaces will be automatically migrated to Fabric workspaces: PBI capacities will become Fabric capacities. Your PBI tenant will have the Fabric workloads automatically built-in • Aligned to backend Fabric capacity. Similar to Power BI capacity – a specific amount of compute assigned to it. A universal bucket of compute. No more Synapse DWUs, Spark clusters, etc. • Serverless Pool and Dedicated Pool combined into one – no more relational storage or dedicated resources. Everything is serverless. All about the data lakehouse • No Azure portal, subscriptions, creating storage. Users won’t even realize they are using Azure • Fabric has strong separation between the person who buys and pays the bill and the person who builds solutions. In Azure, the person building the solution also has to have the power to buy • This is not just for departmental use. It’s not PaaS services (i.e., Synapse) vs Fabric. Fabric is the future. Fabric is going to run your entire data estate: departmental projects as well as the largest data warehouses, data lakehouses, and data science projects • One platform for the enterprise data professional and the citizen developer (next slide)
  • 10. •Quickly tune a custom model by integrating a model built and trained in Azure ML in a Spark notebook •Work faster with the ability to use your preferred data science frameworks, languages, and tools •Bypass engineering dependencies with the ability to use your preferred no-code ML Ops to deploy and operate models in production •Tap into proven-at-scale models and services to accelerate your AI differentiation (AOAI, Cognitive Services, ONNX integration, etc.) •Avoid slow, progress-stagnating data wrangling by seamlessly triggering a workflow that can unlock data engineering tools and capabilities quickly •Accelerate your work with visual and SQL-based tools for self-serve data transformations and modeling as well as self-serve tools for reporting, dashboards, and data visualizations •Turn data into impact with industry-leading BI tools and integration with the apps your people use every day like Microsoft 365 •Make more data-driven decisions with actionable insights and intelligence in your preferred applications •Maintain access to all the data you need, without being overwhelmed by data ancillary to your role, thanks to fine-grained data access management controls Data Engineers •Execute faster with the ability to spin up a Spark VM cluster in seconds, or configure with familiar experiences like Git DevOps pipelines for data engineering artifacts •Streamline your work with a single platform to build and operate real-time analytics pipelines, data lakes, lakehouses, warehouses, marts, and cubes using your preferred IDE, plug-ins, and tools. 
•Reduce costly data replication and movement with the ability to produce base datasets that can serve data analysts and data scientists without needing to build pipelines Supporting experiences: Data Scientists Supporting experiences Data Analysts Supporting experiences Data Citizens Supporting experiences Serve data via warehouse or lakehouse Serve transformed data Serve insights via embedding Serve data via warehouse or lakehouse Data Stewards •Maintain visibility and control of costs with a unified consumption and cost model that provides evergreen spend optics on your end-to-end data estate •Gain full visibility and governance over your entire analytics estate, from data sources and connections to your data lake, to users and their insights Data Factory Real-time analytics Data Warehouse Data Engineering Data Warehouse Power BI Real-time analytics Data Science Azure ML Power BI Microsoft 365
  • 13. Create Fabric capacity A capacity is a dedicated set of resources reserved for exclusive use. It offers dependable, consistent performance for your content. Each capacity offers a selection of SKUs, and each SKU provides different resource tiers for memory and computing power. You pay for the provisioned capacity whether you use it or not. A capacity is a quota-based system, and scaling a capacity up or down doesn't involve provisioning compute or moving data, so it’s instant.
  • 14. Once the capacity is created, we can see it in the Admin portal's Capacity Settings pane under the "Fabric Capacity" tab Create Fabric capacity
  • 16. Turning on Microsoft Fabric Enable Microsoft Fabric for your organization - Microsoft Fabric | Microsoft Learn
  • 17. Demo
  • 19. OneLake for all data 2 “The OneDrive for data” A single unified logical SaaS data lake for the whole organization (no silos) Organize data into domains Foundation for all Fabric data items Provides full and open access through industry standard APIs and formats to any application (no lock-in) OneLake One Copy One Security OneLake Data Hub Intelligent data fabric Data Factory Synapse Data Warehousing Synapse Data Engineering Synapse Data Science Synapse Real Time Analytics Power BI Data Activator
  • 20. One Copy for all computes 4 Real separation of compute and storage No matter which engine or item you use, everyone contributes to building the same lake. Engines are being optimized to work with Delta Parquet as their native format Compute powers the applications and experiences in Fabric. The compute is separate from the storage. Multiple compute engines are available, and all engines can access the same data without needing to import or export it. You are able to choose the right engine for the right job. Non-Fabric engines can also read/write to the same copy of data using the ADLS APIs or added through shortcuts Unified management and governance Workspace A Warehouse Finance Lakehouse Customer 360 Workspace B Lakehouse Service telemetry Warehouse Business KPIs Data Factory Synapse Data Warehousing Synapse Data Engineering Synapse Data Science Synapse Real Time Analytics Power BI Data Activator T-SQL Spark Analysis services KQL
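Slide 20 notes that non-Fabric engines can read and write the same copy of data through the ADLS APIs. OneLake exposes each workspace as an ADLS Gen2-style container under a single tenant-wide endpoint, so the URL layout can be sketched with a small helper; the workspace and item names below are hypothetical examples, and the path convention follows Microsoft's documented OneLake URI format at the time of writing:

```python
def onelake_dfs_url(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build an ADLS Gen2-style DFS URL for a file or folder in OneLake.

    OneLake surfaces every workspace under one tenant-wide endpoint, so
    existing ADLS Gen2 tools and SDKs can point at it without changes.
    """
    base = "https://onelake.dfs.fabric.microsoft.com"
    url = f"{base}/{workspace}/{item}.{item_type}/Files"
    return f"{url}/{path}" if path else url

# Example: a file in the Files area of a (hypothetical) "Sales" lakehouse
# in a (hypothetical) "Finance" workspace
print(onelake_dfs_url("Finance", "Sales", "Lakehouse", "raw/orders.parquet"))
```

Any tool that already speaks the ADLS Gen2 DFS API can then be pointed at such a URL instead of a storage-account endpoint.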
  • 21. Shortcuts virtualize data across domains and clouds No data movement or duplication A shortcut is a symbolic link that points from one data location to another Create a shortcut to make data from a warehouse part of your lakehouse Create a shortcut within Fabric to consolidate data across items or workspaces without changing the ownership of the data. Data can be reused multiple times without data duplication. Existing ADLS Gen2 storage accounts and Amazon S3 buckets can be managed externally to Fabric and Microsoft while still being virtualized into OneLake with shortcuts All data is mapped to a unified namespace and can be accessed using the same APIs, including the ADLS Gen2 DFS APIs Unified management and governance Workspace A Warehouse Finance Lakehouse Customer 360 Workspace B Lakehouse Service telemetry Warehouse Business KPIs Amazon Azure Data Factory Synapse Data Warehousing Synapse Data Engineering Synapse Data Science Synapse Real Time Analytics Power BI Data Activator
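As a rough sketch of what creating one of these shortcuts looks like programmatically, the snippet below assembles the JSON body for an Amazon S3 shortcut. The field names follow the shape of the Fabric "Create Shortcut" REST API as documented at the time of writing, but may differ in your API version; all names and IDs shown are hypothetical placeholders:

```python
def s3_shortcut_payload(name: str, path: str, bucket_url: str,
                        subpath: str, connection_id: str) -> dict:
    """Assemble the request body for creating an S3 shortcut in a lakehouse.

    Illustrative only: verify field names against the current Fabric REST
    API reference before using. The payload would be POSTed to the
    shortcuts endpoint of the target lakehouse item.
    """
    return {
        "name": name,    # shortcut name as it will appear in the lakehouse
        "path": path,    # where the shortcut lives, e.g. "Files/external"
        "target": {
            "amazonS3": {
                "location": bucket_url,       # e.g. "https://my-bucket.s3.amazonaws.com"
                "subpath": subpath,           # folder within the bucket
                "connectionId": connection_id # a Fabric connection holding S3 credentials
            }
        },
    }

payload = s3_shortcut_payload(
    "orders", "Files/external",
    "https://my-bucket.s3.amazonaws.com",  # hypothetical bucket
    "landing/orders",
    "00000000-0000-0000-0000-000000000000",  # hypothetical connection id
)
```

Once created, the S3 data appears under the lakehouse's namespace and is readable through the same OneLake APIs as native data, without being copied.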
  • 23. OneLake Data Hub Discover, manage and use data in one place Central location within Fabric to discover, manage, and reuse data Data can be easily discovered by its domain (e.g. Finance) so users can see what matters for them Explorer capability to easily browse and find data by its folder (workspace) hierarchy Efficient data discovery using search, filter and sort
  • 25. Lakehouse Data Source Shortcut Enabled Structured / Unstructured Ingestion Shortcuts Pipelines & Dataflows Store Lakehouse(s) Transform Notebooks & Dataflows Expose PBI Lake Warehouse
  • 26. Lakehouse – Lakehouse mode Table - This is a virtual view of the managed area in your lake. This is the main container to host tables of all types (CSV, Parquet, Delta, managed tables, and external tables). All tables, whether automatically or explicitly created, will show up as a table under the managed area of the Lakehouse. This area can also include any type of file or folder/subfolder organization. Files - This is a virtual view of the unmanaged area in your lake. It can contain any files and folder/subfolder structures. The main distinction between the managed area and the unmanaged area is the automatic delta table detection process which runs over any folders created in the managed area. Any Delta-format files (parquet + transaction log) will be automatically registered as a table and will also be available from the serving layer (T-SQL) Automatic Table Discovery and Registration Lakehouse Table Automatic discovery and registration is a feature of the lakehouse that provides a fully managed file-to-table experience for data engineers and data scientists. Users can drop a file into the managed area of the lakehouse and the file will be automatically validated for supported structured formats, which is currently only Delta tables, and registered into the metastore with the necessary metadata such as column names, formats, compression, and more. Users can then reference the file as a table and use SparkSQL syntax to interact with the data.
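The detection rule slide 26 describes can be illustrated with a small sketch: a Delta table is a folder of parquet files plus a `_delta_log` transaction-log directory, so the managed area can be scanned for exactly that marker. This is a simplified illustration of the idea only; the real lakehouse registration also validates the table and records schema metadata in the metastore:

```python
import os

def discover_delta_tables(managed_root: str) -> list:
    """Mimic the lakehouse auto-discovery rule: any folder in the managed
    (Tables) area that contains a `_delta_log` directory is treated as a
    Delta table and would be registered under that folder's name.
    """
    tables = []
    for entry in sorted(os.listdir(managed_root)):
        folder = os.path.join(managed_root, entry)
        # The `_delta_log` directory is the defining marker of a Delta table
        if os.path.isdir(folder) and os.path.isdir(os.path.join(folder, "_delta_log")):
            tables.append(entry)
    return tables
```

A folder such as `Tables/orders/_delta_log/...` would be picked up as the table `orders`; a plain folder of loose CSVs would not.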
  • 27. Lakehouse – SQL endpoint mode
  • 28. Lakehouse – shortcuts (to lakehouse)
  • 29. Workspaces and capacities accessing OneLake Each tenant will have only one OneLake, and any tenant can access files in a OneLake from other tenants via shortcuts Lakehouse Sales
  • 30. Demo
  • 32. Data warehouse Data Source Shortcut Enabled Structured / Unstructured Ingestion Mounts Pipelines & Dataflows Store Data Warehouse Transform Procedures Expose PBI Warehouse
  • 33. Synapse Data Warehouse Infinitely scalable and open Synapse Data Warehouse in Fabric Infinite serverless compute Open Storage Format in customer owned Data Lake Relational Engine Data Warehouse Data Warehouse Data Warehouse Data Warehouse 1 1 Open standard format in an open data lake replaces proprietary formats as the native storage • First transactional data warehouse natively embracing an open standard format • Data is stored in Delta Parquet with no vendor lock-in • Is auto-integrated and auto-optimized with minimal knobs • Extends full SQL ecosystem benefits
  • 34. Infinite serverless compute Open Storage Format in customer owned Data Lake Relational Engine Data Warehouse Data Warehouse Data Warehouse Data Warehouse Synapse Data Warehouse Infinitely scalable and open Synapse Data Warehouse in Fabric 2 2 Dedicated clusters are replaced by serverless compute infrastructure 1 • Physical compute resources assigned within milliseconds to jobs • Infinite scaling with dynamic resource allocation tailored to data volume and query complexity • Instant scaling up/down with no physical provisioning involved • Resource pooling providing significant efficiencies and pricing
  • 35.
  • 36. Workspaces and capacities accessing OneLake Each tenant will have only one OneLake, and any tenant can access files in a OneLake from other tenants via shortcuts Warehouse Sales
  • 37. Data Warehouse Use this to build a relational layer on top of the physical data in the Lakehouse and expose it to analysis and reporting tools using a T-SQL/TDS endpoint. This offers a transactional data warehouse with T-SQL DML support, stored procedures, tables, and views How can I control “bad actor” queries? Fabric compute is designed to automatically classify queries to allocate resources and ensure high-priority queries (i.e., ETL, data preparation, and reporting) are not impacted by potentially poorly written ad hoc queries. How is the classification for an incoming query determined? Queries are intelligently classified by a combination of the source (i.e., pipeline vs. Power BI) and the query type (i.e., INSERT vs. SELECT) Where is the physical storage for the Data Warehouse? All data for Fabric is stored in OneLake in the open Delta format. A single COPY of the data is therefore exposed to all the compute engines of Fabric without needing to move or duplicate data
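The classification idea on this slide — combining the request source with the statement type to protect high-priority work from ad hoc queries — can be sketched as a small routing function. The bucket names and rules below are invented for illustration; the actual Fabric engine's classification logic is internal and not published:

```python
def classify_query(source: str, sql: str) -> str:
    """Illustrative sketch of workload classification: route a query to a
    bucket based on where it came from and what kind of statement it is.

    Buckets ("etl", "reporting", "adhoc") are hypothetical labels, not
    Fabric terminology.
    """
    stmt = sql.lstrip().split(None, 1)[0].upper()  # first SQL keyword
    if source == "pipeline" and stmt in {"INSERT", "UPDATE", "DELETE", "COPY"}:
        return "etl"        # protected: data preparation workloads
    if source == "powerbi" and stmt == "SELECT":
        return "reporting"  # protected: report queries
    return "adhoc"          # best-effort: everything else

print(classify_query("pipeline", "INSERT INTO dim_date SELECT ..."))  # etl
```

The point of the design is that a poorly written ad hoc SELECT lands in its own resource bucket and cannot starve the ETL or reporting buckets.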
  • 39. Demo
  • 41. Why two options? Delta Lake shortcomings: - No multi-table transactions - Lack of full T-SQL support (no updates, limited reads) - Performance problems with trickle transactions
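The first gap above, multi-table transactions, is exactly what the Warehouse engine layers on top of Delta. A hedged T-SQL sketch with hypothetical table names:

```sql
-- Both inserts commit or roll back together; plain Delta Lake cannot
-- guarantee atomicity across two separate tables.
BEGIN TRANSACTION;
    INSERT INTO dbo.Orders (OrderId, Total) VALUES (1, 100);
    INSERT INTO dbo.OrderAudit (OrderId, Action) VALUES (1, 'created');
COMMIT TRANSACTION;
```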
  • 44. ADF
  • 45. ADF Review: Mapping data flows; Wrangling data flows
  • 46. ADF Review: Pipelines -> Data Pipelines; Mapping data flows -> don't exist in Fabric; Wrangling data flows -> Dataflow Gen2 (Dataflow Gen1 remains in Power BI)
  • 47. Data Factory in Fabric What is Dataflows Gen2? This is the next generation of Dataflows, succeeding Gen1. Dataflows provide a low-code interface for ingesting data from hundreds of data sources, transforming your data using 300+ data transformations, and loading the resulting data into multiple destinations such as Azure SQL Database, Lakehouse, and more. We currently have multiple Dataflows experiences with Power BI Dataflows Gen1, Power Query Dataflows, and ADF Data flows. What is the strategy with Fabric for these various experiences? Our goal is to evolve over time to a single Dataflow that combines the ease of use of PBI and Power Query with the scale of ADF. What are Fabric pipelines? Fabric pipelines enable powerful workflow capabilities at cloud scale. With data pipelines, you can build complex workflows that can refresh your dataflow, move PB-size data, and define sophisticated control flow pipelines. Use data pipelines to build complex ETL and Data Factory workflows that can perform a number of different tasks at scale. Control flow capabilities built into pipelines allow you to build workflow logic with loops and conditionals.
  • 48. Power BI / DirectLake
  • 49.
  • 50. For best performance, compress the data using the V-Order (VORDER) compression method (50%-70% more compression). Data loaded by ADF is stored this way by default
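In a Fabric Spark session, V-Order writing is controlled through Spark and Delta table properties. The property names below are my reading of the Fabric documentation at the time of writing; verify them against the current docs before relying on them:

```
# Session level: write all new parquet files with V-Order
spark.sql.parquet.vorder.enabled = true

# Per table (set via TBLPROPERTIES): keep V-Order on for this table's writes
delta.parquet.vorder.enabled = true
```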
  • 51. Should I use Fabric now?  Yes, for prototyping  Yes, if you won’t be in production for several months  You have to be OK with bugs, missing features, and possible performance issues  Don’t use it if you have hundreds of terabytes
  • 52. If building in Synapse, how do you make the transition to Fabric smooth?  Don’t use dedicated pools, unless needed for serving and performance  Don’t use any stored procedures to modify data in dedicated pools  Use ADF for pipelines and for Power Query, and don’t use ADF mapping data flows. Don’t use Synapse pipelines or mapping data flows  Embrace the data lakehouse architecture
  • 53. Resources Microsoft Fabric webinar series: https://aka.ms/fabric-webinar-series New documentation: https://aka.ms/fabric-docs. Check out the tutorials. Data Mesh, Data Fabric, Data Lakehouse – (video from Toronto Data Professional Community on 2/15/23) Build videos: Build 2-day demos Microsoft Fabric Synapse data warehouse, Q&A My intro blog on Microsoft Fabric (with helpful links at the bottom) Fabric notes Advancing Analytics videos Ask me Anything (AMA) about Microsoft Fabric!
  • 54. Q & A ? James Serra, Microsoft, Industry Advisor Email me at: jamesserra3@gmail.com Follow me at: @JamesSerra Link to me at: www.linkedin.com/in/JamesSerra Visit my blog at: JamesSerra.com

Editor's Notes

  1. Abstract Microsoft Fabric Microsoft Fabric is the next version of Azure Data Factory, Azure Data Explorer, Azure Synapse Analytics, and Power BI. It brings all of these capabilities together into a single unified analytics platform that goes from the data lake to the business user in a SaaS-like environment. Therefore, the vision of Fabric is to be a one-stop shop for all the analytical needs of every enterprise. Fabric will cover the complete spectrum of services including data movement, data lake, data engineering, data integration and data science, observational analytics, and business intelligence. With Fabric, there is no need to stitch together different services from multiple vendors. Instead, the customer enjoys an end-to-end, highly integrated, single offering that is easy to understand, onboard, create, and operate. This is a hugely important new product that I have spent a ton of hours understanding and simplifying into a deck and demo that I will present to you. This will shortcut your time to upskill on it so you are prepared to answer customer questions. My presentation comes from the angle that you are in the field and are familiar with Azure Synapse and want to know how this differs. ----------------------- May public preview, Microsoft Build GA by end of year MVP for GA, incremental updates, release features over time TODO: can you have multiple workspaces per capacity?
Show diagram with workspaces all pointing to the same warehouse/dataflow. How do we do Snowflake cloning? How will CDC work with no data flows? How do we talk to customers about waiting for GA for Fabric when they need to do something now? Go the Lakehouse route in Synapse. Slide on mounting a dedicated pool. Architecture diagrams on how things are done in Fabric / use cases. This is giving new power to PBI users. Enterprise solution vs department-wide solution slide: https://www.jamesserra.com/archive/2022/06/power-bi-as-a-enterprise-solution/ Slide with Synapse missing items. Will Copilot be added? Highlight no more dedicated pools - use serverless. PBI capacities - how to delegate. Get email about the behind-the-scenes warehouse. PBI desktop vs Fabric – what features are not yet in Fabric, as modeling is there now. Schema drift. Failover. TODO specialist: all or nothing with users - who talks about it? Segment out Synapse - I want to build something now, what product do I use? Don't use dedicated pools, but there will be a mounting option. Use ADF instead of Synapse pipelines. Missing features in Synapse and what is better. Snowflake compete slides. Link to S3. Pricing. Position Synapse today to move seamlessly into Fabric - migration path. Purview not in here. What about Synapse database templates? Demo built? Ability to request a demo/presentation from PMs/GBBs who have been doing it
  2. Fluff, but point is I bring real work experience to the session
  3. Microsoft Fabric combines Data Factory, Synapse Analytics, Data Explorer, and Power BI into a single, unified experience on the cloud. The open and governed data lakehouse foundation is a cost-effective and performance-optimized fabric for business intelligence, machine learning, and AI workloads at any scale. It is the foundation for migrating and modernizing existing analytics solutions, whether these be data appliances or traditional data warehouses.   Talk Track for Greenfield/Growing Analytics Customers – Microsoft Fabric’s SaaS environment makes it easier to deploy an entire end-to-end analytics engine from the ground up at an accelerated pace. With the solution’s built-in security and governance capabilities, you can rest assured your data and insights are protected.   Talk Track for Existing Synapse Customers – Microsoft Fabric is an evolution of Azure Synapse. You will still be able to enjoy the benefits and limitless scale of Synapse in an easier-to-use SaaS solution while adopting new capabilities that enhance your entire analytics approach. And with the addition of Power BI, you can help democratize the ability to uncover insights and create interactive reports across the organization, helping everyone make more data-driven decisions in their everyday work.   Talk Track for Existing Power BI Customers – With Microsoft Fabric, you’ll be able to access new and powerful data tools and services like Azure Synapse within the same user experience you already enjoy with Power BI. You can unify these tools with your disparate data sources in the same environment to establish a single source of truth for all data, driving the ability for everyone to uncover more accurate and consistent insights than before. And instead of having to worry about the security concerns of a patchwork analytics estate, you can rest assured your data is protected with the built-in security and governance capabilities.
  4. 5x5 = 5 seconds to signup, 5 minutes to wow
  5. https://microsoft-my.sharepoint.com/:v:/p/jamesserra/EQegpHEGCrlMkohlSrJbdjQBpfhlYpfEjL515flGy5UW7w?e=50ksgm
  6. It is not possible to share capacities across tenants
  7. https://learn.microsoft.com/en-us/fabric/enterprise/licenses PBI v-cores will evolve to Compute Units (group of 8 v-cores)
  8. Public preview available March 23rd. There is a switch in the admin console to turn the functionality on/off completely. It will be off by default until July 1st, when it will be switched on unless they go into the admin console and say “No, I don’t want to have this switched on starting July 1st”. It can be controlled at the tenant and capacity levels. The Microsoft Fabric (Preview) trial includes access to the Fabric product experiences and the resources to create and host Fabric items. The Fabric (Preview) trial lasts until Fabric General Availability (GA), unless canceled. After GA, the Fabric (Preview) trial converts to the GA version and is extended for 60 days.
  9. https://microsoft-my.sharepoint.com/:v:/p/jamesserra/EQegpHEGCrlMkohlSrJbdjQBpfhlYpfEjL515flGy5UW7w?e=50ksgm
  10. Starting with OneLake itself. OneLake provides you a single data lake for your entire organization. Users cannot create OneLake storage. OneLake storage (ADLS Gen2) managed by the OneLake API is attached to the Fabric tenant. When a workspace is created, a folder is created in OneLake storage (ADLS Gen2 behind the scenes) in the customer tenant.
  11. Everyone is able to contribute to the same lake no matter which engine you use. We are doing a lot of work to optimize our engines to work directly with Delta Parquet as their native format for tabular data, as you can see with T-SQL for data warehousing and DirectLake mode in Analysis Services for BI.
  12. Think of OneLake as an abstraction layer. You can mount existing ADLS Gen2 to it. Virtualization across many storage accounts. It maintains a single namespace. A shortcut is nothing more than a symbolic link which points from one data location to another. Just like with shortcuts in Windows or Linux, the data will appear in the shortcut location as if it were physically there. Today, if you have tables in a data warehouse which you want to make available alongside other tables or files in a lakehouse, you need to copy that data out of the warehouse. With OneLake, you simply create a shortcut in the lakehouse pointing to the warehouse. The data will appear in your lakehouse as if you had physically copied it. Since you didn’t copy it, when data is updated in the warehouse, changes are automatically reflected in the lakehouse. You can also use shortcuts to consolidate data across workspaces and domains without changing the ownership of the data. In this example, workspace B still owns the data. They still have ultimate control over who can access it and how it stays up to date. Many of you already have existing data lakes stored in ADLS Gen2 or in Amazon S3 buckets. These lakes can continue to exist and be managed externally to Fabric. We have extended shortcuts to include lakes outside of OneLake and even outside of Azure, so that you can virtualize your existing ADLS Gen2 accounts or Amazon S3 buckets into OneLake. All data is mapped to the same unified namespace and can be accessed using the same ADLS Gen2 APIs even when it is coming from S3.
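The symbolic-link analogy above can be demonstrated with ordinary filesystem links. This is only an analogy for how a OneLake shortcut behaves (no copy, source changes visible immediately), not OneLake code:

```python
# Analogy only: a OneLake shortcut behaves like a filesystem symlink.
# The data appears at the shortcut location without being copied, and
# changes at the source are visible through the shortcut immediately.
import os
import tempfile

root = tempfile.mkdtemp()
source = os.path.join(root, "warehouse_table")       # "workspace B" owns this
shortcut = os.path.join(root, "lakehouse_shortcut")  # symbolic link to it

with open(source, "w") as f:
    f.write("v1")
os.symlink(source, shortcut)

seen_before = open(shortcut).read()   # reads the source data, no copy made

with open(source, "w") as f:          # source is updated in place...
    f.write("v2")
seen_after = open(shortcut).read()    # ...and the shortcut sees the change

print(seen_before, seen_after)
```

Deleting the shortcut would not delete the source, which mirrors how removing a OneLake shortcut leaves the underlying data untouched.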
  13. https://blog.fabric.microsoft.com/en-us/blog/using-azure-databricks-with-microsoft-fabric-and-onelake/
  14. https://microsoft-my.sharepoint.com/:v:/p/jamesserra/EQegpHEGCrlMkohlSrJbdjQBpfhlYpfEjL515flGy5UW7w?e=50ksgm
  15. The Microsoft Fabric Lakehouse analytics scenario makes it so that data can be ingested into OneLake with shortcuts to other clouds repositories, pipelines, and dataflows in order to allow end-users to leverage other data.   Once that data has been pulled into Microsoft Fabric, users can leverage notebooks to transform that data in OneLake and then store them in Lakehouses with medallion structure.   From there, users can begin to analyze and visualize that data with Power BI using the see-through mode or SQL endpoints.
  16. If you don’t see a Lakehouse table in the warehouse (default), check the data format. Only the tables in Delta Lake format are available in the warehouse (default). Parquet, CSV, and other formats cannot be queried using the warehouse (default)
  17. Warehouse mode in the Lakehouse allows a user to transition from the “Lake” view of the Lakehouse (which supports data engineering and Apache Spark) to the “SQL” experiences that a data warehouse would provide, supporting T-SQL. In warehouse mode the user has a subset of SQL commands that can define and query data objects but not manipulate the data. You can perform the following actions in your warehouse (default): • Query the tables that reference data in your Delta Lake folders in the lake. • Create views, inline TVFs, and procedures to encapsulate your semantics and business logic in T-SQL. • Manage permissions on the objects. Warehouse mode is primarily oriented towards designing your warehouse, meeting BI needs, and serving data.
  18. https://microsoft-my.sharepoint.com/:v:/p/jamesserra/EQegpHEGCrlMkohlSrJbdjQBpfhlYpfEjL515flGy5UW7w?e=50ksgm
  19. The Data Warehouse analytics scenario takes existing sources that are mounted, while pipelines and dataflows can bring in all other data that is needed.   IT teams can then define and store procedures to transform the data, which is stored as Parquet/Delta Lake files in OneLake.   From there, business users can analyze and visualize data with Power BI, again using the see-through mode or SQL endpoints.
  20. Can more than one capacity be connected to a Data Warehouse, for instance, one to handle data writes and one to handle data reads? Currently, a capacity is assigned at the workspace level and a Data Warehouse is associated with a single workspace. This means all artifacts in the workspace will share the same capacity and all read/write operations will use the same capacity. Does the Fabric Data Warehouse support fine-grained access control like row-level security, column-level security, and dynamic data masking? These security constructs are not available yet but are planned for the Fabric Data Warehouse and will integrate with Fabric’s universal security model. We already support stats; it’s in the docs. Automatic stats on load and on metadata discovery should land soon. Query plans, indexes, SQL RLS, etc. will also land incrementally
  21. Python, R, Scala
  22. https://learn.microsoft.com/en-us/fabric/get-started/decision-guide-warehouse-lakehouse Lakehouse: call it the delta lake, owned and managed by Spark; the customer can update files - it's user owned. Use if the customer likes Spark and is using files. Warehouse: well structured, a SQL front door, transactional guarantees, multi-table transactions. Nobody except the SQL engine can update the files. Use if the customer is comfortable with SQL and comes from a relational database world. LDF and MDF files are still used behind the scenes. You can query the warehouse from the lakehouse, but can’t do the opposite. Data is synced into OneLake from the LDF and MDF files (only INSERT works for now). Can't support everything SQL supports (multi-table transactions, indexing) with the current open-source Delta format, so the SQL engine is needed. We want to get to a point where we don't use LDF/MDF files but use Delta underneath. There is talk about using Iceberg way down the road. Other knobs to turn? What should we expose? Hide things (DMVs, explain plan) because you can't tune. Performance DBAs - getting rid of another part of your job
  23. Fabric: Can land in the bronze zone in the warehouse and just use that for all layers, using T-SQL to write. Compared to Synapse: no longer having relational storage and dedicated compute - the idea is it’s all done within the lake. 3 deployment models. How to organize workspaces – dev/test/prod, by org, by cost. High concurrency clusters, Spark monitoring
  24. https://microsoft-my.sharepoint.com/:v:/p/jamesserra/EQegpHEGCrlMkohlSrJbdjQBpfhlYpfEjL515flGy5UW7w?e=50ksgm
  25. Mapping data flows = ADF Data flows Wrangling data flows = Power Query
  26. Mapping data flows = ADF Data flows Wrangling data flows = Power Query
  27. Just in the Fabric context, I would say it this way: Fabric Dataflows are the PQ UI with the scale of ADF. The Gen1 vs. Gen2 distinction is just what is in PBI and Excel today vs. what is in Fabric. What will be the response to customers who ask how they move ADF Data flows to Fabric? We are looking at ways to convert to Fabric data flows and working with partners who can help with the conversions. Since ADF PQ already had cloud scale, why not just move that into Fabric Dataflow Gen2? Fabric Dataflow Gen2 will eventually be an improvement over ADF PQ
  28. https://microsoft-my.sharepoint.com/:v:/p/jamesserra/EQegpHEGCrlMkohlSrJbdjQBpfhlYpfEjL515flGy5UW7w?e=50ksgm
  29. DirectLake mode is a groundbreaking new engine capability to analyze very large datasets in Power BI. The technology is based on the idea of loading parquet-formatted files directly from a data lake without having to query a Data Warehouse or Lakehouse endpoint and without having to import or duplicate data into a Power BI dataset. DirectLake is a fast path to load the data from the lake straight into the Power BI engine, ready for analysis. It loads the data directly from the files into memory at runtime. Because there is no explicit import process, it is possible to pick up any changes at the source as they occur, thus combining the advantages of DirectQuery and import mode while avoiding their disadvantages. DirectLake can read parquet-formatted Delta files, but for best performance you should compress the data using the V-Order (VORDER) compression method (50%-70% more compression).
  30. https://supportability.visualstudio.com/Power%20BI/_wiki/wikis/PBICSSWIKI/786636/Readiness-Direct-Lake-(SeeThru)-mode-in-PBI-reports
  31. The Polaris engine is the base of Microsoft Fabric
  32. The Polaris engine is the base of Microsoft Fabric