#DenodoDataFest
Logical Data Fabric: Architectural Components
Executive VP & CTO, Denodo
ALBERTO PAN
Agenda
1. Why a Data Fabric?
2. Components of Data Fabric Architectures
3. Denodo Logical Data Fabric
The core of the matter is being able to consolidate many diverse data sources in an
efficient manner by allowing trusted data to be delivered from all relevant data
sources to all relevant data consumers through one common layer.
Source: Demystifying the Data Fabric, Gartner, September 2020
The Data fabric focuses on automating the process integration, transformation,
preparation, curation, security, governance, and orchestration to enable analytics
and insights quickly for business success.
Source: Enterprise Data Fabric Wave, Forrester, June 2020
Data Fabric: Supported by the Major Analysts
Source: Enterprise Data Fabric Wave, Forrester, June 2020
Source: Demystifying the Data Fabric, Gartner, September 2020
Data Fabric: Data Integration and Delivery
Data Virtualization: Logical Data Delivery for the Business
[Diagram: layered data virtualization architecture]
Consumer access: JDBC / ODBC / ADO.NET, REST / GraphQL / OData
View layers, top to bottom:
• Application Layer: LoB View, Mart View
• Business Layer: Unified Views
• Transformation & Cleansing: Derived Views
• Data Source Layer: Base Views (source abstraction)
View nodes in the diagram carry operator icons (U, J, A, S).
Platform components: Development Lifecycle, Monitoring & Audit, Governance, Security, Development Tools / SDK, Scheduler, Cache, Optimiser
Query Execution
[Diagram labels: Source Abstraction, Virtual Modelling, Business Delivery, Query Optimizer, Security & Governance, Query Engine]
Delegate processing to data sources
▪ Transparently switch workloads according to cost or performance
Most advanced execution engine for distributed scenarios
▪ Unique techniques automatically rewrite user queries to maximize pushdown
▪ Leverage MPP capabilities of data sources to deal with large data volumes
Advanced Caching / Acceleration Mechanisms
▪ Selectively materialize subsets of the data to protect data sources and accelerate queries
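To make the pushdown idea concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for a remote data source; the `sales` table, its rows, and both function names are assumptions made for illustration and are not part of Denodo's engine or API. It only contrasts pulling every row into the virtual layer with delegating the aggregation to the source.

```python
import sqlite3
from collections import defaultdict


def make_source():
    """Build a hypothetical data source with a small sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (prod TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("apple", 1.0), ("apple", 2.0), ("pear", 5.0), ("pear", 0.5), ("plum", 3.0)],
    )
    return conn


def totals_without_pushdown(source):
    """Fetch every row and aggregate in the integration layer."""
    rows = source.execute("SELECT prod, price FROM sales").fetchall()
    totals = defaultdict(float)
    for prod, price in rows:
        totals[prod] += price
    return dict(totals), len(rows)  # result, rows transferred


def totals_with_pushdown(source):
    """Delegate the GROUP BY to the source so only aggregates travel."""
    rows = source.execute(
        "SELECT prod, SUM(price) FROM sales GROUP BY prod"
    ).fetchall()
    return dict(rows), len(rows)  # result, rows transferred
```

Both functions return the same totals, but the pushdown variant transfers one row per product instead of one row per sale. With large sources, that difference is what the optimizer is trying to exploit.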
Integrated ETL / ELT Support: Remote Tables
Create table in any location
Load with data from any other data source
Examples:
• Data Lake management: replicate data in lake when needed
• Data Science: move data to Spark after initial analysis
• Cloud Migrations: replicate and update data to cloud system
Often triggered from Scheduler
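The remote table pattern (create a table in one system, load it with the result of a query against another) can be sketched as follows. This is an illustration only: two in-memory SQLite databases stand in for the source and the target, and `create_remote_table`, `demo`, the `orders` table, and the `customer_totals` table are hypothetical names, not Denodo commands.

```python
import sqlite3


def create_remote_table(source, source_query, target, target_table, column_defs):
    """Create target_table in the target system and load it from a source query.

    column_defs is a list such as ["cust TEXT", "total REAL"]. Table and
    column names are trusted here; a real tool would validate them.
    """
    rows = source.execute(source_query).fetchall()
    placeholders = ", ".join("?" for _ in column_defs)
    target.execute(f"DROP TABLE IF EXISTS {target_table}")
    target.execute(f"CREATE TABLE {target_table} ({', '.join(column_defs)})")
    target.executemany(f"INSERT INTO {target_table} VALUES ({placeholders})", rows)
    target.commit()
    return len(rows)


def demo():
    """Replicate an aggregated result from one system into another."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (cust TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("ann", 10.0), ("ann", 20.0), ("bob", 5.0)],
    )
    target = sqlite3.connect(":memory:")
    create_remote_table(
        source,
        "SELECT cust, SUM(amount) FROM orders GROUP BY cust ORDER BY cust",
        target,
        "customer_totals",
        ["cust TEXT", "total REAL"],
    )
    return target.execute(
        "SELECT cust, total FROM customer_totals ORDER BY cust"
    ).fetchall()
```

A scheduler job would typically call something like `create_remote_table` on a timer to refresh the copy, which matches the slide's note about triggering from the Scheduler.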
Data Fabric: Graph-Based Semantics and Governance
Source: Enterprise Data Fabric Wave, Forrester, June 2020
Source: Demystifying the Data Fabric, Gartner, September 2020
Data Virtualization for Data Governance
• Single entry point for enforcing security and governance policies
• Single source of truth / canonical views
• Who is doing / accessing what, when and how
• Fewer copies of personal data. Lineage of copies is available.
• Data on-premises and off, combined through the same governed virtual layer
Conformance with Semantic Models
Create or Import Semantic Models:
• RDF/OWL, Power Designer, ER Studio, Erwin, IBM Data Architect…
“Contract” between modelers and developers
• Ensures data complies with standard data models and governance rules
Allows parallel development of consuming applications and virtual models
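One way to picture the "contract" is as a check that a view exposes the fields and types the semantic model declares. The sketch below is a simplified assumption, not how Denodo implements conformance: the model and the view schema are plain dictionaries mapping field names to type names, and `check_conformance` is a hypothetical helper.

```python
def check_conformance(model, view_schema):
    """Return a list of ways in which view_schema violates the model contract.

    model and view_schema map field names to type names, e.g. {"name": "text"}.
    Extra fields in the view are allowed; missing or mistyped fields are not.
    """
    problems = []
    for field, expected_type in model.items():
        actual_type = view_schema.get(field)
        if actual_type is None:
            problems.append(f"missing field: {field}")
        elif actual_type != expected_type:
            problems.append(
                f"type mismatch for {field}: expected {expected_type}, found {actual_type}"
            )
    return problems
```

Because both sides only depend on the model, consuming applications can be built against it while the virtual views are still being developed, and the check flags any drift between the two.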
Impact Analysis: Leverage Graph Relationships and Dependencies
Example: adding a new field to a data source
1. Views affected by the change
2. Web Services affected by the change
3. Option to propagate new field individually per view
4. Preview of the Tree view of the affected views
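Impact analysis of this kind amounts to a traversal of the dependency graph. The following sketch assumes a small hypothetical graph (the names `src_orders`, `bv_orders`, `dv_orders_clean`, `uv_sales`, `ws_orders_api`, and `ws_sales_api` are invented for illustration) and uses a breadth-first search to list everything downstream of a changed data source.

```python
from collections import deque

# Hypothetical dependency graph: each element maps to the elements built on it.
DEPENDENTS = {
    "src_orders": ["bv_orders"],
    "bv_orders": ["dv_orders_clean"],
    "dv_orders_clean": ["uv_sales", "ws_orders_api"],
    "uv_sales": ["ws_sales_api"],
}


def affected_by(changed, dependents):
    """Return every element downstream of `changed`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:  # guard against visiting a shared view twice
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order
```

Calling `affected_by("src_orders", DEPENDENTS)` lists the views first and the web services built on them afterwards, which is the information needed to decide, view by view, whether to propagate the new field.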
Data Fabric: Active / Augmented Data Catalog
Source: Enterprise Data Fabric Wave, Forrester, June 2020
Source: Demystifying the Data Fabric, Gartner, September 2020
MY RECOMMENDATIONS
Data Marketplace for the Business:
▪ Discover and contextualize interesting datasets
▪ Search, query and prepare data
▪ Consume with any visualization / reporting tool
▪ Personalized recommendations and shortcuts to most used datasets. Think Netflix, but for your data
Denodo Data Catalog: Datasets Usage
Data Fabric: ML for Task Automation
Source: Enterprise Data Fabric Wave, Forrester, June 2020
Source: Demystifying the Data Fabric, Gartner, September 2020
Automatically Recommend Caching / DM Strategies for Enhanced Performance
Denodo 8: ML for Smart Query Acceleration
Previous Queries:
SELECT PROD, SUM(PRICE) FROM…
SELECT CUST, COUNT(PRICE) FROM…
SELECT PROD, MIN(SALE_DATE) FROM…
SELECT PROD, SHOP, SUM(PRICE) …
…
Caching Expressions (Intermediate Aggregates):
SELECT PROD, CUST, SUM(PRICE), COUNT(PRICE) …
SELECT PROD, SHOP, MIN(SALE_DATE), SUM(PRICE) …
…
[Diagram labels: Cache Database (or Data Source), Data Sources]
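The reason an intermediate aggregate helps is that coarser queries can be answered from it without touching the detailed data. The sketch below illustrates this with SQLite; it is not Denodo's recommendation algorithm, and the `sales` rows, the `cache_prod_cust` table, and the query constants are assumptions for the example. The cached table groups by PROD and CUST, so a query grouped by PROD alone is a sum of cached sums, and a count grouped by CUST alone is a sum of cached counts.

```python
import sqlite3


def build():
    """Create a hypothetical sales table plus one cached intermediate aggregate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (prod TEXT, cust TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("apple", "ann", 1.0),
            ("apple", "ann", 2.0),
            ("apple", "bob", 4.0),
            ("pear", "bob", 5.0),
        ],
    )
    # Intermediate aggregate covering both PROD-level and CUST-level queries.
    conn.execute(
        "CREATE TABLE cache_prod_cust AS "
        "SELECT prod, cust, SUM(price) AS sum_price, COUNT(price) AS cnt_price "
        "FROM sales GROUP BY prod, cust"
    )
    return conn


# Original queries against the detailed data.
DIRECT_SUM_BY_PROD = "SELECT prod, SUM(price) FROM sales GROUP BY prod ORDER BY prod"
DIRECT_COUNT_BY_CUST = "SELECT cust, COUNT(price) FROM sales GROUP BY cust ORDER BY cust"

# Equivalent rewrites that read only the cached aggregate.
CACHED_SUM_BY_PROD = (
    "SELECT prod, SUM(sum_price) FROM cache_prod_cust GROUP BY prod ORDER BY prod"
)
CACHED_COUNT_BY_CUST = (
    "SELECT cust, SUM(cnt_price) FROM cache_prod_cust GROUP BY cust ORDER BY cust"
)


def run(conn, query):
    """Execute a query and return all rows."""
    return conn.execute(query).fetchall()
```

This rollup works for SUM, COUNT, MIN, and MAX because they can be recombined from partial results. An average would need both the cached sum and the cached count.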
Key Takeaways
▪ The Data Fabric plays a crucial role in addressing today's most pressing data management challenges, enabling agile delivery of trusted and governed data to any consumer.
▪ Major market analysts have identified the main components/pillars the Fabric needs to support.
▪ Denodo’s Logical Data Fabric makes it possible to create a Data Fabric today, accelerating the delivery of trusted data from any location and to any consumer by up to 80%.
© Copyright Denodo Technologies. All rights reserved.
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
