Watch full webinar here: https://bit.ly/3fBpO2M
Data fabric has been a hot topic lately, and Gartner has named it one of the top strategic technology trends for 2022. Noticeably, many mid-to-large organizations are starting to adopt the logical data fabric architecture, while others are still curious about how it works.
With a better understanding of data fabric, you will be able to architect a logical data fabric to enable agile data solutions that honor enterprise governance and security, support operations with automated recommendations, and ultimately, reduce the cost of maintaining hybrid environments.
In this on-demand session, you will learn:
- What is a data fabric?
- How is a physical data fabric different from a logical data fabric?
- Which one should you use and when?
- What’s the underlying technology that makes up the data fabric?
- Which companies are successfully using it and for what use case?
- How can I get started and what are the best practices to avoid pitfalls?
3. Agenda
1. What is a Data Fabric?
2. Physical vs. Logical Data Fabric
3. Which one to use and when?
4. Underlying technology of a Data Fabric
5. Successful Customer Use Cases
6. Q&A
7. Next Steps
Data Fabric Definition

A data fabric is an architecture pattern that informs and automates the design, integration, and deployment of data objects regardless of deployment platforms and architectural approaches. It utilizes continuous analytics and AI/ML over all metadata assets to provide actionable insights and recommendations on data management and integration design and deployment patterns. This results in faster, informed and, in some cases, completely automated data access and sharing.
Pictorial View of a Data Fabric – from Gartner

[Diagram: a "data fabric net" spanning business entities (compounds, customers, products, claims) across RDBMS/OLTP systems, traditional analytics/BI (data warehouse, marts, ETL), data lakes, cloud data stores, apps and document repositories (XML, JSON, PDF, DOC, web), flat files, third-party, and legacy sources.]
What is a Data Fabric? – In Layman's Terms
1. “Integrate data” from disparate data sources
2. Securely deliver an “integrated view” of the different data objects
3. Consume the “integrated data” for analytics and operational purposes
4. Automate the entire process using AI/ML
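The first three steps above can be sketched as a toy federated-query flow. This is only an illustration of the pattern, not any vendor's implementation; all source names, fields, and roles are hypothetical, and step 4 (AI/ML-driven automation) is out of scope for a snippet this size.

```python
# Toy sketch of the "integrate, secure, consume" flow of a data fabric.
# All sources, fields, and roles here are hypothetical.

# 1. "Integrate data" from disparate data sources
crm_source = [{"customer_id": 1, "name": "Acme"}]        # e.g. a CRM system
billing_source = [{"customer_id": 1, "balance": 250.0}]  # e.g. a billing database

def integrated_view():
    """Build an integrated view by joining the two sources on customer_id."""
    billing_by_id = {row["customer_id"]: row for row in billing_source}
    return [
        {**c, "balance": billing_by_id[c["customer_id"]]["balance"]}
        for c in crm_source
        if c["customer_id"] in billing_by_id
    ]

def secure_query(user_role):
    """2. Securely deliver the integrated view only to authorized consumers."""
    if user_role not in {"analyst", "admin"}:
        raise PermissionError("role not allowed to read customer data")
    return integrated_view()

# 3. Consume the integrated data for analytics
rows = secure_query("analyst")
total_balance = sum(r["balance"] for r in rows)
```

The join happens at query time against the live sources, which is the behavior the "integrated view" step describes.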
Gartner: Data Fabric Benefits

By 2023, organizations utilizing data fabrics to dynamically connect, optimize, and automate data management processes will:
• Reduce time to data delivery by 30%
• Automate manual transformations by 45%

By 2023, data-fabric-enabled automation in data management and integration will:
• Reduce dependency on IT specialists by 20%
• Reduce integration costs by 45%

Sources: Gartner – 2019 Magic Quadrant for Data Integration Tools; Gartner – 2020 Magic Quadrant for Data and Analytics Service Providers
Gartner: Data Fabric Architecture

[Diagram: data sources feed a data fabric composed of an augmented data catalog, a knowledge graph enriched with semantics, active metadata, an insights and automation layer, and a final data integration and orchestration layer, which together serve data consumers.]
Data Virtualization: Logical Data Fabric

A logical data layer – a "data fabric" – that provides high-performance, real-time, and secure access to integrated business views of disparate data across the enterprise.

• Data Abstraction: decoupling applications/data usage from data sources
• Data Integration without replication or relocation of physical data
• Easy Access to Any Data: high-performance and real-time/right-time
• Data Catalog for self-service data services and easy discovery
• Unified Metadata, Security & Governance across all data assets
• Data Delivery in any format, with intelligent query optimization that leverages new and existing physical data platforms
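The "integration without replication" idea above can be illustrated with a non-materialized view that translates a consumer query into live source fetches at request time. This is a minimal sketch, not the virtualization engine's actual mechanics; the adapters and schemas are invented for the example.

```python
# Minimal sketch of a virtual (non-materialized) view: data stays in the
# sources and is fetched and combined only when a consumer runs a query.
# The adapters and schemas here are hypothetical.

class VirtualView:
    def __init__(self, adapters):
        # adapters: dict of source name -> zero-arg callable returning rows
        self.adapters = adapters

    def query(self, predicate=lambda row: True):
        """Fetch rows live from every source at request time; nothing is replicated."""
        for name, fetch in self.adapters.items():
            for row in fetch():
                if predicate(row):
                    yield {"source": name, **row}

# Two "sources" standing in for, say, an RDBMS and a SaaS API:
orders_db = lambda: [{"order_id": 10, "amount": 99.0}]
orders_api = lambda: [{"order_id": 11, "amount": 42.0}]

view = VirtualView({"db": orders_db, "api": orders_api})
big_orders = [r for r in view.query(lambda r: r["amount"] > 50)]
```

Because the view holds no data of its own, consumers always see the sources' current state – the "fresh from the source" property claimed for the logical approach.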
Which One to Use – Physical vs Logical Data Fabric?
Based on Use Case Requirements

Physical Data Fabric
• When analytical needs demand storing data in a different schema
• When operational needs demand storing data for data management purposes, e.g., MDM
• When NOT to use: not as an enterprise data layer

Logical Data Fabric
• When there's a need to unify all enterprise data
• When regulations prevent replicating data into a separate repository, e.g., government, pharma
• When NOT to use: when data needs heavy transformation and subsequent persistence
Which One to Use – Physical vs Logical Data Fabric?
Benefits vs Costs

Physical Data Fabric
• Benefits
  • Data is readily available to use
• Costs
  • Replication takes time and increases storage costs
  • Data can get out of sync between the sources and the physical data fabric

Logical Data Fabric
• Benefits
  • Data delivery is faster – no need to replicate the data
  • Data is delivered in real time, fresh from the source
• Costs
  • Repeated transformations take time
  • Myth: Logical Data Fabric is slow!
Data Fabrics Will Combine Both Logical and Physical Data

[Diagram repeated from the earlier "Pictorial View of a Data Fabric – from Gartner" slide.]
Denodo Data Fabric Architecture in Azure

DATA FLOW
• Static data residing on-premises in Microsoft data sources (MS SQL databases, Excel spreadsheets, etc.) is extracted, transformed, and moved to Azure cloud data repositories (Azure Synapse Analytics, Azure Cosmos DB, etc.) by Azure Data Factory.
• Real-time data (e.g., web logs) is loaded into Azure Databricks with Azure HDInsight.
• Third-party on-premises (SAP, Oracle) and cloud data (Salesforce, etc.) are connected to the Denodo Data Virtualization Platform.
• Azure-based data sources (Azure Synapse Analytics, Azure Data Lake Storage, Azure Databricks) are connected to the Denodo Data Virtualization Platform, providing a unified hybrid abstraction layer.
• Data from Kafka topics can be virtualized in the Denodo Platform for real-time alerting and dashboarding.
• All connected data sources are combined, secured, and exposed as data services over SQL (6) and API (7) interfaces.
• Exposed virtual datasets are consumed by Power BI, analytical applications, data science tools, enterprise marketplace portals, and real-time and mobile applications.

[Architecture diagram with numbered callouts 1–7 corresponding to the steps above.]
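The last step of the flow – exposing combined data as services over SQL and API interfaces – can be sketched as a tiny JSON endpoint in front of an integrated view. This is a stand-in illustration, not the platform's actual API layer; the endpoint path and data are hypothetical.

```python
# Minimal sketch of exposing an integrated view as a JSON "data service".
# A real deployment would use the virtualization layer's own SQL/REST
# interfaces; the /customers endpoint and its data are hypothetical.
import json

INTEGRATED_VIEW = [{"customer_id": 1, "balance": 250.0}]  # pretend virtual view

def data_service_app(environ, start_response):
    """WSGI app serving the integrated view at /customers."""
    if environ.get("PATH_INFO") == "/customers":
        body = json.dumps(INTEGRATED_VIEW).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

Any WSGI server (e.g. the standard library's `wsgiref.simple_server`) could host this app, and BI tools or mobile applications would consume the JSON just as the slide describes for the API interface.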
Denodo Data Fabric Architecture in AWS

DATA FLOW
• Data from cloud and on-premises sources is loaded into AWS relational and Hadoop-based stores for analytical and operational processing.
• On-premises data from applications, databases, files, and other sources is virtualized by the Denodo engine, providing a unified and secure access point.
• Structured, semi-structured, and unstructured data residing in AWS stores is combined with data coming from cloud applications and on-premises data sources, delivering a real-time gateway for end-user consumption. Required virtual data marts are built inside the Denodo Platform for AWS.
• Data is consumed by Amazon QuickSight or any other BI and analytical tools through SQL-based interfaces, or by applications and other tools through REST and OData APIs.

[Architecture diagram with numbered callouts 1–4 corresponding to the steps above.]
Ultra Mobile – U.S.-based MVNO

UVNV is the parent company of:

Ultra Mobile – Launched in 2012 with a focus on expatriates living in the US who needed inexpensive mobile service. Ultra was honored as the fastest-growing company in the US!

Mint Mobile – Launched in 2016 as the first completely online sales model for wireless in the US. It has grown incredibly fast and continues to build momentum.

Plum Mobile – Launched in late 2020 and established as a wholesaler to capture business moving off Sprint, AT&T, and Verizon.
Ultra Mobile – Logical Data Fabric

Requirements
• Expectations that customers moving to PLUM from other MVNOs have as-is requirements differentiated from our other brands (RESTful APIs)
• Absolute requirement to keep customer data and reporting separate from 'House' brands (Data Security and Governance)
• Similar to doing many M&As a month, with the pace increasing every month (Data Virtualization to speed onboarding)
• Need to provide consistent integration with multiple contributing back-end systems (Data Virtualization for consistent onboarding)

Example Use Cases
• Customer Information Portal (individual and aggregated)
• Disaster Management Services
• 'Port Out' Security Risk Management
Total EP – Oil & Gas, Exploration and Production

Challenges
• Point-to-point application data exchange
• A growing application portfolio made this too inflexible and slow
• Different teams creating their own P2P integrations
• No overall architecture or governance

Solution
• Implemented a data fabric based on data virtualization
• Provided a common view of data entities (Standardized Data)
• Single layer to monitor data quality and usage (Improved Data Governance)
• Accelerated time-to-value from data – project teams no longer build their own solutions
Challenges
• Maintaining separate systems for functions such as back-office operations, data warehousing, and loan origination
• A series of mergers and acquisitions was adding to the complexity
• Ad hoc, manual reporting processes
Solution: Business Gains
• Enable Deposit and Loan Operations to make timely, accurate decisions
• Meet the operational and analytical needs of multiple business units within the organization
• Reduce reporting time from up to 3 days for static reports to as little as 2 hours
• Perform critical business operations, such as loan processing, in real time
Best Practices to Get Started with Data Fabric

1. Determine your business requirements
   • Do the business users need data fast?
   • Do they need up-to-date data from the sources?
2. Decide which data fabric is the best option – balance benefits vs costs
   • Physical data fabric when heavy transformations need to be persisted
   • Logical data fabric when integrating all enterprise data – faster and fresher
   • Best option: a logical data fabric with a cache
3. Start small with data fabric and grow big
   • Don't boil the ocean!
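The "logical data fabric with a cache" recommendation can be illustrated with a simple time-bounded cache in front of a live source query: within the TTL, repeated requests are served from the cache; after it expires, the source is queried again. This is a toy sketch of the idea only; the source function, fields, and TTL are hypothetical.

```python
# Toy TTL cache in front of a live source query: the fabric stays "logical"
# (no bulk replication), while repeated transformations within the TTL are
# avoided. All names and values here are hypothetical.
import time

class CachedQuery:
    def __init__(self, fetch, ttl_seconds=60.0):
        self.fetch = fetch          # callable that queries the live source
        self.ttl = ttl_seconds
        self._value = None
        self._fetched_at = None

    def get(self):
        now = time.monotonic()
        if self._fetched_at is None or now - self._fetched_at > self.ttl:
            self._value = self.fetch()   # cache miss (or expired): hit the source
            self._fetched_at = now
        return self._value               # cache hit: result is still fresh enough

calls = 0
def expensive_source_query():
    global calls
    calls += 1
    return [{"region": "EMEA", "revenue": 1200}]

cached = CachedQuery(expensive_source_query, ttl_seconds=60.0)
cached.get()
cached.get()   # second call is served from the cache; the source is hit only once
```

Choosing the TTL is the balancing act from step 1: a short TTL favors up-to-date data, a long one favors speed and lower source load.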