Data Governance
with Databricks
Unity Catalog
Presenter Name
Kundan Kumar
Lack of etiquette and manners is a huge turn-off.
KnolX Etiquettes
▪ Punctuality
Join the session 5 minutes prior to the session start time. We start on
time and conclude on time!
▪ Feedback
Make sure to submit constructive feedback for all sessions, as it is very
helpful for the presenter.
▪ Silent Mode
Keep your mobile devices in silent mode; feel free to step out of the session
if you need to attend an urgent call.
▪ Avoid Disturbance
Avoid unwanted chit-chat during the session.
1. Introduction to Data Governance
2. Challenges in Data Governance
3. Introduction to Databricks Unity Catalog
4. Key Components of Databricks Unity Catalog
5. Databricks Unity Catalog: Features and Capabilities
6. Benefits of Using Databricks Unity Catalog
Introduction to Data Governance
▪ Data governance is the process of managing the availability,
usability, integrity and security of the data in enterprise
systems.
▪ Data governance is a set of processes, policies, and
standards that organizations use to manage their data assets
effectively.
▪ The goal of data governance is to ensure that data is of high
quality, accessible, secure, and compliant with regulations
and standards.
▪ Data governance is a holistic approach to data management
that encompasses people, processes, policies, and
technology.
Challenges in Data Governance
Introduction to Databricks and Lakehouse
▪ Databricks is a unified, open analytics platform for building,
deploying, sharing, and maintaining enterprise-grade data, analytics,
and AI solutions at scale.
▪ Databricks provides tools that help you connect your sources of data
to one platform to process, store, share, analyze, model, and
monetize datasets with solutions from BI to generative AI.
▪ Databricks can be used as a powerful component within a Data
Lakehouse architecture to streamline data processing, analytics, and
machine learning tasks.
Databricks Unity Catalog
▪ Databricks Unity Catalog offers a unified governance layer for data and AI within the Databricks Data Intelligence
Platform.
▪ With Unity Catalog, organizations can seamlessly govern their structured and unstructured data, machine learning
models, notebooks, dashboards and files on any cloud or platform.
▪ Data scientists, analysts and engineers can use Unity Catalog to securely discover, access and collaborate on
trusted data and AI assets, leveraging AI to boost productivity and unlock the full potential of the Lakehouse
architecture.
▪ This unified approach to governance accelerates data and AI initiatives while simplifying regulatory compliance.
▪ Unity Catalog provides centralized access control, auditing, lineage, and data discovery capabilities across
Databricks workspaces.
Databricks Unity Catalog's Architecture
[Figure: Unity Catalog's architecture]
Key Components of Databricks Unity Catalog
The Databricks Lakehouse architecture combines data stored
with the Delta Lake protocol in cloud object storage with
metadata registered to a metastore.
There are five primary objects in the Databricks Lakehouse:
▪ Catalog: a grouping of databases.
▪ Database or schema: a grouping of objects in a catalog.
Databases contain tables, views, and functions.
▪ Table: a collection of rows and columns stored
as data files in object storage.
▪ View: a saved query typically against one or
more tables or data sources.
▪ Function: saved logic that returns a scalar value or set
of rows.
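The object hierarchy above can be sketched in plain Python. This is only an illustrative model, not Databricks code: the class names and the example objects (`main`, `sales`, `orders`, `daily_revenue`, `to_usd`) are hypothetical. It shows how objects are addressed with a three-level `catalog.schema.object` name.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """A schema (database): groups tables, views, and functions."""
    name: str
    tables: list[str] = field(default_factory=list)
    views: list[str] = field(default_factory=list)
    functions: list[str] = field(default_factory=list)


@dataclass
class Catalog:
    """A catalog: the top-level grouping of schemas."""
    name: str
    schemas: dict[str, Schema] = field(default_factory=dict)

    def qualified_names(self) -> list[str]:
        """Return catalog.schema.object names for every object in the catalog."""
        names = []
        for schema in self.schemas.values():
            for obj in schema.tables + schema.views + schema.functions:
                names.append(f"{self.name}.{schema.name}.{obj}")
        return names


# Hypothetical example objects, used only for illustration.
sales = Schema("sales", tables=["orders"], views=["daily_revenue"], functions=["to_usd"])
main = Catalog("main", schemas={"sales": sales})
print(main.qualified_names())
```

In a real workspace these objects live in the metastore and are created with SQL or the UI; the sketch only mirrors the naming structure.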
Databricks Unity Catalog: Features and capabilities
The Unity Catalog metastore combines several data catalog features, each designed to simplify data
management.
Data Discovery and Exploration: Unified visibility into data and AI
▪ Discover and classify structured and unstructured data, ML models, notebooks, dashboards and arbitrary files
on any cloud.
▪ Users can discover and explore available datasets and data assets through Unity Catalog's search and browsing
interface.
▪ It supports searching, filtering, and browsing metadata based on attributes such as dataset name, description,
tags, schema, and lineage.
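As a rough illustration of metadata-based discovery, the sketch below filters an in-memory list of asset records by keyword and tags. It does not call Unity Catalog; the `AssetMetadata` class, the `search_assets` function, and the sample assets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AssetMetadata:
    """Minimal metadata record for a data asset (hypothetical shape)."""
    name: str
    description: str
    tags: frozenset = field(default_factory=frozenset)


def search_assets(assets, keyword="", required_tags=()):
    """Return assets whose name or description contains `keyword`
    (case-insensitive) and that carry every tag in `required_tags`."""
    keyword = keyword.lower()
    required = set(required_tags)
    return [
        a for a in assets
        if (keyword in a.name.lower() or keyword in a.description.lower())
        and required <= a.tags
    ]


# Hypothetical sample metadata.
ASSETS = [
    AssetMetadata("main.sales.orders", "Customer orders", frozenset({"pii", "sales"})),
    AssetMetadata("main.sales.daily_revenue", "Revenue per day", frozenset({"sales"})),
    AssetMetadata("main.hr.employees", "Employee records", frozenset({"pii"})),
]

for asset in search_assets(ASSETS, required_tags=["pii"]):
    print(asset.name)
```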
Data Lineage: Know your data's journey
▪ Unity Catalog offers data lineage capabilities that enable users to track the flow of data from its
source to consumption.
▪ It provides visibility into data transformations, ETL processes, and data dependencies, helping users understand
data provenance and perform impact analysis.
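Lineage can be thought of as a directed graph of datasets. The sketch below is a generic graph traversal, not Unity Catalog's lineage API, and the dataset names are hypothetical. Walking the graph downstream answers the impact-analysis question ("what breaks if this table changes?"), and walking it upstream answers the provenance question ("where did this data come from?").

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed as its value.
LINEAGE = {
    "raw.orders": ["clean.orders"],
    "raw.customers": ["clean.customers"],
    "clean.orders": ["gold.daily_revenue", "gold.customer_360"],
    "clean.customers": ["gold.customer_360"],
}


def downstream(source, edges):
    """Breadth-first walk returning every dataset that depends on `source`."""
    seen = set()
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def upstream(target, edges):
    """Return every dataset that `target` is derived from, by reversing the edges."""
    reverse = {}
    for parent, children in edges.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(target, reverse)


print(downstream("raw.orders", LINEAGE))        # impact analysis
print(upstream("gold.customer_360", LINEAGE))   # provenance
```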
Access Control: Single permission model for data and AI
▪ Simplify access management with a unified interface to define access policies on data and AI assets and
consistently apply and audit these policies on any cloud or data platform.
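To make the permission model concrete: in Unity Catalog, reading a table requires `SELECT` on the table together with `USE SCHEMA` on its schema and `USE CATALOG` on its catalog. The sketch below models only that rule in plain Python. It is a simplification that ignores privilege inheritance, ownership, and group membership, and the principals and object names are hypothetical.

```python
# Hypothetical grants, stored as (principal, privilege, securable) tuples.
GRANTS = {
    ("analysts", "USE CATALOG", "main"),
    ("analysts", "USE SCHEMA", "main.sales"),
    ("analysts", "SELECT", "main.sales.orders"),
    ("interns", "SELECT", "main.sales.orders"),  # missing USE CATALOG / USE SCHEMA
}


def can_select(principal, table, grants):
    """Return True if `principal` holds every privilege needed to read `table`.

    `table` must be a three-level name: catalog.schema.table.
    """
    catalog, schema, _ = table.split(".")
    needed = {
        (principal, "USE CATALOG", catalog),
        (principal, "USE SCHEMA", f"{catalog}.{schema}"),
        (principal, "SELECT", table),
    }
    return needed <= grants


print(can_select("analysts", "main.sales.orders", GRANTS))
print(can_select("interns", "main.sales.orders", GRANTS))
```

In a workspace, grants like these are issued with SQL statements such as ``GRANT SELECT ON TABLE main.sales.orders TO `analysts` `` rather than stored in application code.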
Data Sharing: Open data sharing
▪ With Unity Catalog, we can share data and AI assets across clouds, regions and platforms using open-source
Delta Sharing, which is natively integrated with Unity Catalog.
▪ Securely collaborate with anyone, anywhere to unlock new revenue streams and drive business value, without
relying on proprietary formats, complex ETL processes or costly data replication.
AI-Powered Monitoring and Observability
▪ Harness the power of AI to automate monitoring, diagnose errors and uphold data and ML model quality.
▪ Benefit from proactive alerts that automatically detect personally identifiable information (PII) data, track model
drift, and effectively resolve issues within your data and AI pipelines to maintain accuracy and integrity.
Benefits of Using Databricks Unity Catalog
▪ Enhanced Data Visibility and Transparency: A centralized metadata repository provides a single
source of truth for all data assets.
▪ Improved Data Quality and Consistency: Data lineage tracking helps identify data quality issues
and ensure consistency.
▪ Accelerated Data Discovery and Analysis: The data catalog simplifies data discovery, leading to
faster insights and analysis.
▪ Simplified Compliance and Regulatory Reporting: Policy management supports adherence to
regulatory requirements and simplifies compliance reporting.
Q/A
References
https://www.databricks.com/
https://www.databricks.com/product/unity-catalog
https://learn.microsoft.com/en-IN/azure/databricks/
https://delta.io/