DENODO LUNCH AND LEARN ASEAN
Data Lakes: A Logical Approach
for Faster Unified Insights
Speakers
Elaine Chan
Regional Vice President,
ASEAN & Korea
Chris Day
Director,
APAC Sales Engineering
Agenda
1. What is a Data Lake?
2. Why Do They Exist?
3. Some of the Challenges of Data Lakes
4. The Benefits of a Logical Approach to Data Lakes
5. Customer Case Study
6. Demo
7. Conclusion
8. Q&A
9. Next Steps
A Brief History
Data Lake
Etymology of “Data Lake”
Pentaho’s CTO James Dixon is credited with coining the
term "data lake". He described it in his blog in 2010:
"If you think of a data mart as a store of bottled water – cleansed
and packaged and structured for easy consumption – the data
lake is a large body of water in a more natural state. The contents
of the data lake stream in from a source to fill the lake, and
various users of the lake can come to examine, dive in, or take
samples."
https://jamesdixon.wordpress.com/2010/10/14/pentaho-hadoop-and-data-lakes/
Data lakes were born to efficiently address
the challenge of cost reduction.
Data lakes allow for cheap, efficient
storage of very large amounts of data.
Cloud implementation simplified the
complexity of managing a large data lake.
The Data Lake – Architecture I
Distributed File System
Cheap storage for large data volumes
• Support for multiple file formats (Parquet, CSV,
JSON, etc.)
• Examples:
• On-prem: HDFS
• Cloud native: AWS S3, Azure ADLS, Google GCS
The Data Lake – Architecture II
Distributed File System
Execution Engine
Massively parallel & scalable
execution engine
• Cheaper execution than traditional EDW
architectures
• Decoupled from storage
• Doesn’t require specialized HW
• Examples:
• SQL-on-Hadoop engines: Spark, Hive, Impala,
Drill, Dremio, Presto, etc.
• Cloud native: AWS Redshift, Snowflake, AWS
Athena, Delta Lake, GCP BigQuery
The Data Lake – Architecture III
Adoption of new transformation
techniques
• Data ingested is normally raw and unusable by end
users
• Data is transformed and moved to different “zones”
with different levels of curation
• End users only access the refined zone
• Use of ELT as a cheaper transformation technique
than ETL
• Use of the engine and storage of the lake for data
transformation instead of external ETL flows
• Removes the need for additional staging HW
[Diagram: data moves from the Raw zone to the Trusted zone to the Refined zone, on top of the Distributed File System and the Execution Engine]
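The zone-based ELT flow on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: an in-memory SQLite database stands in for the lake's storage and execution engine, and the table names (raw_orders, trusted_orders, refined_sales_by_country) and sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the lake's storage and execution engine.
conn = sqlite3.connect(":memory:")

# Raw zone: data is loaded as-is (untyped strings, duplicates, bad values).
conn.execute("CREATE TABLE raw_orders (order_id TEXT, country TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [
        ("1", "sg", "100.50"),
        ("2", "SG", "20"),
        ("2", "SG", "20"),            # duplicate record
        ("3", "my", "not-a-number"),  # fails the data-quality rule below
        ("4", "MY", "75"),
    ],
)

# Trusted zone: the transformation runs inside the engine, after loading
# (ELT), instead of in an external ETL flow.
conn.execute(
    """
    CREATE TABLE trusted_orders AS
    SELECT DISTINCT
        CAST(order_id AS INTEGER) AS order_id,
        UPPER(country)            AS country,
        CAST(amount AS REAL)      AS amount
    FROM raw_orders
    WHERE amount NOT GLOB '*[^0-9.]*'  -- keep only numeric-looking amounts
    """
)

# Refined zone: a curated, consumer-facing aggregate.
conn.execute(
    """
    CREATE TABLE refined_sales_by_country AS
    SELECT country, SUM(amount) AS total_amount
    FROM trusted_orders
    GROUP BY country
    """
)

trusted_count = conn.execute("SELECT COUNT(*) FROM trusted_orders").fetchone()[0]
refined = dict(
    conn.execute("SELECT country, total_amount FROM refined_sales_by_country")
)
print(trusted_count, refined)
```

End users would query only refined_sales_by_country; the raw and trusted tables remain available for reprocessing when rules change.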
Data Lake Example – AWS
• Data ingested using AWS Glue (or other ETL tools)
• Raw data stored in S3 object store
• Maintain fidelity and structure of data
• Metadata extracted/enriched using Glue Data
Catalog
• Business rules/DQ rules applied to S3 data as it is
copied to Trusted Zone data stores
• Trusted Zone contains more than one data store –
select best data store for data and data processing
• Refined Zone contains data for consumers – curated
data sets (data marts?)
• Refined Zone data stores differ – Redshift, Athena,
Snowflake, …
[Diagram: Data Sources (internal & external) → INGESTION (AWS Glue) → RAW ZONE (S3 for raw data) → TRUSTED ZONE → REFINED ZONE → Consumers (data portals, BI – visualization, analytic workbench, mobile apps, etc.)]
Hadoop-Based Data Lakes – A Data Scientist’s Playground
• The early data scientists saw Hadoop as their
personal supercomputer.
• Hadoop-based Data Lakes helped democratize
access to state-of-the-art supercomputing with
off-the-shelf HW (and later cloud)
• The industry push for BI made Hadoop-based
solutions the standard to bring modern analytics to
any corporation.
Hadoop-based Data Lakes became
“data science silos”
Can data lakes also address
the other data management
challenges?
Can they provide fast
decision making with proper
governance and security?
Changing the Data Lake Goals
“The popular view is that a
data lake will be the one
destination for all the data
in their enterprise and the
optimal platform for all
their analytics.”
Nick Heudecker, Gartner
Rick Van der Lans, R20 Consultancy
Multi-purpose data lakes are data delivery environments
developed to support a broad range of users, from traditional
self-service BI users (e.g. finance, marketing, human resource,
transport) to sophisticated data scientists.
Multi-purpose data lakes allow a broader and deeper use of the
data lake investment without minimizing the potential value for
data science and without making it an inflexible environment.
The Data Lake as the Repository of All Data
Is that realistic? Even if it is possible, it comes with multiple trade-offs:
COST
• Huge up-front investment
Creating ingestion pipelines for all company datasets into the lake is costly.
• Large recurrent maintenance costs
Those pipelines need to be constantly modified as data structures change in the sources.
GOVERNANCE
• Risk of inconsistencies
Data needs to be frequently synchronized to avoid stale datasets.
• Loss of capabilities
Data lake capabilities may differ from those of the original sources, e.g. quick access by ID in an
operational RDBMS.
Using the data lake efficiently to accelerate insights comes at the expense of cost, time-to-market and governance.
Purpose-specific Data Lakes
Restricting the use of the data lake to a specific use case (e.g. data science)
TTM / SECURITY
• Higher complexity
End users need to find where data is and how to use it.
• Risk of inconsistencies
Data may be in multiple places, in different formats and calculated at different times.
• Loss of security
Frustrations increase the use of shadow IT, “personal” extracts, uncontrolled data prep
flows, etc.
An environment with multiple purpose-specific systems slows down TTM and jeopardizes security and governance.
Data Lakes in the ‘Pit of Despair’
Data Lakes are 2-5 years
from Plateau of Productivity
and are deep in the Trough
of Disillusionment
Gartner – Hype Cycle Data Management July 2021
Gartner – The Evolution of Analytical Environments
This is a Second Major Cycle of Analytical Consolidation
[Diagram: timeline of analytical environments; components shown include operational applications, cubes, data warehouses, data lake, marts, ODS, staging/ingest, IoT data and other new data]
1980s: Pre EDW
Fragmented/nonexistent analysis
› Multiple sources
› Multiple structured sources
1990s: EDW
Unified analysis
› Consolidated data
› "Collect the data"
› Single server, multiple nodes
› More analysis than any one server can provide
2000s: Post EDW
Fragmented analysis
› "Collect the data" (into different repositories)
› New data types, processing, requirements
› Uncoordinated views
2010s: LDW
Unified analysis
› Logically consolidated view of all data
› "Connect and collect"
› Multiple servers, of multiple nodes
› More analysis than any one system can provide
©2018 Gartner, Inc. ID: 342254
“Adopt the Logical Data Warehouse Architecture to Meet Your Modern Analytical Needs”. Henry Cook, Gartner April 2018
Gartner – Logical Data Warehouse
“Adopt the Logical Data Warehouse Architecture to Meet Your Modern Analytical Needs”. Henry Cook, Gartner April 2018
DATA VIRTUALIZATION
“…Data lakes lack semantic consistency and governed
metadata. Meeting the needs of wider audiences requires
curated repositories with governance, semantic
consistency and access controls.”
How can a logical data
fabric approach help?
Faster Time-to-Market for Data Projects
Why?
• The Data Virtualization Platform allows you to connect directly to all kinds of data sources (EDW, application
databases, SaaS applications, etc.)
• Thus not all data needs to be replicated to the data lake for consumers to access it from a single (virtual)
repository.
• In some cases it makes sense to replicate data in the lake; in others it doesn’t. Data Virtualization opens that door.
Capabilities
• Data can be accessed immediately, easily improving TTM and ROI of the lake
• If data is not useful, time was not lost preparing pipelines and copying data
• Can ingest and synchronize data into the lake efficiently when needed
• Denodo can load and update data into the data lake natively, using Parquet and parallel
loads
• Execution is pushed down to original sources, taking advantage of their capabilities
• Especially significant in the case of EDW with strong processing capabilities
TTM
COST
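The point about pushing execution down to the original sources can be illustrated with a small sketch. It is not Denodo's optimizer: an in-memory SQLite database plays the role of a source system such as an EDW, and the sales table, its rows and both function names are hypothetical. The sketch only compares how many rows leave the source in each strategy.

```python
import sqlite3

# Assumption: SQLite stands in for a source system with its own query engine.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("APAC", 10.0), ("APAC", 5.0), ("EMEA", 7.0), ("EMEA", 1.0), ("AMER", 4.0)],
)


def totals_without_pushdown(conn):
    """Fetch every row, then aggregate outside the source."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals, len(rows)


def totals_with_pushdown(conn):
    """Delegate the GROUP BY to the source; only aggregated rows travel."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


naive_totals, naive_rows_moved = totals_without_pushdown(source)
pushed_totals, pushed_rows_moved = totals_with_pushdown(source)
print(naive_rows_moved, pushed_rows_moved)
```

Both strategies return the same totals, but the pushed-down query moves one row per region instead of one row per sale, which is where a source with strong processing capabilities pays off.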
Easier Self-Service through a Single Data Delivery Layer
Why?
• From an end-user perspective, all data is accessed through a single layer, regardless of its format or
actual physical location.
• A single delivery layer also allows you to enforce security and governance policies
• The virtual layer becomes the “delivery zone” of the data lake, offering modeling and caching capabilities,
documentation and output in multiple formats
Capabilities
• Built-in rich modeling capabilities to tailor data models to end users
• Integrated catalog, search and documentation capabilities
• Access via SQL, REST, OData and GraphQL with no additional coding
• Advanced security controls, SSO, workload management, monitoring, etc.
GOVERNANCE
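A single delivery layer that also enforces policy could look roughly like the sketch below. Everything in it is an assumption made for illustration: the VirtualView and DeliveryLayer classes, the role names and the masking rule are hypothetical and do not represent the Denodo API. The idea shown is only that consumers ask one layer for a view by name, and the layer applies the same rule whatever the underlying source is.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set

Row = Dict[str, object]


@dataclass
class VirtualView:
    """A named view backed by any callable that returns rows from a source."""
    fetch: Callable[[], Iterable[Row]]
    masked_columns: Set[str] = field(default_factory=set)


class DeliveryLayer:
    """Hypothetical single access point for all registered views."""

    def __init__(self) -> None:
        self._views: Dict[str, VirtualView] = {}

    def register(self, name: str, view: VirtualView) -> None:
        self._views[name] = view

    def query(self, name: str, role: str) -> List[Row]:
        view = self._views[name]
        rows = [dict(row) for row in view.fetch()]  # copy; sources stay untouched
        if role != "admin":
            # The masking policy is applied here, once, for every source.
            for row in rows:
                for column in view.masked_columns:
                    if column in row:
                        row[column] = "***"
        return rows


# Two stand-in sources: a list (e.g. a file in the lake) and a function
# (e.g. a call to an operational system).
lake_customers = [{"id": 1, "email": "a@example.com"}]


def crm_orders() -> List[Row]:
    return [{"order_id": 10, "customer_id": 1, "total": 99.0}]


layer = DeliveryLayer()
layer.register(
    "customers",
    VirtualView(fetch=lambda: lake_customers, masked_columns={"email"}),
)
layer.register("orders", VirtualView(fetch=crm_orders))

analyst_view = layer.query("customers", role="analyst")
admin_view = layer.query("customers", role="admin")
orders_view = layer.query("orders", role="analyst")
print(analyst_view, admin_view, orders_view)
```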
Accelerates Query Execution
Why?
Controlling data delivery separately from storage allows a virtual layer to accelerate query execution,
providing faster response than the sources alone.
Capabilities
• Aggregate-aware capabilities to accelerate execution of analytical queries
• Flexible caching options to materialize frequently used data:
• Full datasets
• Partial results
• Hybrid (cached content + updates from source in real time)
• Powerful optimization capabilities for multi-source federated queries
PERFORMANCE
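One possible reading of the hybrid caching option (cached content plus updates from the source in real time) is sketched below. It is an illustration under assumptions, not Denodo's cache implementation: the Source and HybridCache classes are hypothetical, and rows are assumed to carry a monotonically increasing id so that new rows can be found with a simple "greater than the last cached id" filter.

```python
class Source:
    """Stand-in for a source system; counts how many rows it has served."""

    def __init__(self):
        self.rows = []
        self.rows_served = 0

    def append(self, row):
        self.rows.append(row)

    def fetch_since(self, last_id):
        # In a real source this filter would be a pushed-down predicate.
        result = [row for row in self.rows if row["id"] > last_id]
        self.rows_served += len(result)
        return result


class HybridCache:
    """Serves cached rows plus whatever arrived at the source afterwards."""

    def __init__(self, source):
        self._source = source
        self._cached = []
        self._watermark = 0  # highest id currently held in the cache
        self.refresh()

    def refresh(self):
        """Materialize everything newer than the watermark into the cache."""
        self._cached.extend(self._source.fetch_since(self._watermark))
        self._watermark = max((row["id"] for row in self._cached), default=0)

    def query(self):
        """Cached content plus only the rows added since the last refresh."""
        fresh = self._source.fetch_since(self._watermark)
        return self._cached + fresh


source = Source()
for i in (1, 2, 3):
    source.append({"id": i, "value": i * 10})

cache = HybridCache(source)              # initial load reads 3 rows
source.append({"id": 4, "value": 40})    # arrives after the cache was built

result_ids = [row["id"] for row in cache.query()]
print(result_ids, source.rows_served)
```

The query returns all four rows while the source serves only the one new row at query time; a full-dataset cache would skip the source entirely, at the price of possibly stale results.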
Denodo’s Logical Data Lake
[Diagram: IT Storage and Processing (ETL, Data Warehouse, Kafka, Physical Data Lake, Files) → Logical Data Lake / Delivery Zone (Query Engine, Source Abstraction, Business Delivery, Business Catalog, Security and Governance) → Consuming Tools (BI & Reporting, Mobile Applications, Predictive Analytics, AI/ML, Real time dashboards)]
Case Study
Leading Construction Manufacturer Improves
Service Delivery and Revenue
In business for over 90 years, the company is the world’s leading manufacturer of construction
and mining equipment, diesel and natural gas engines, industrial gas turbines and
diesel-electric locomotives.
Problem
• Competitive pressure from low-cost
Chinese manufacturers
• Needed a proactive approach to
customer service to differentiate
• Sought to improve equipment and
services delivery through predictive
maintenance
Solution
• Telemetry (IoT) data from sensors
embedded in the equipment is stored in
Hadoop to perform predictive analytics
• Denodo integrates analytics data with
parts, maintenance, and dealer
information stored in traditional systems
• It then feeds the predictive maintenance
information to a customer dashboard
Results
• Phased rollout systematically improved
asset performance and proactive
maintenance
• Increased revenue from sale of services
and parts
• Reduced warranty costs of parts failure
• Future – optimize pricing for services
and parts among global service providers
Architectural Diagram
Demo
Conclusion
Key Takeaways
1. In most cases, not all the data is going to be in the
data lake
2. Large data lake projects are complex environments
that will benefit from a virtual ‘consumption’ layer
3. Data virtualization provides the governance and
management infrastructure required for a successful
data lake implementation
4. Data virtualization is more than just a data access or
services layer; it is a key component of a data lake
Q&A
Next Steps
Get Started Today
Try Denodo for a Test Drive with a 30-day
free trial in the cloud marketplaces
CHOICE
Under your cloud account
SUPPORT
Community forum AND remote sales
engineer
OPPORTUNITY
30-minute free consultation with a
Denodo Cloud specialist
denodo.link/drive22
Logical Data Fabric to the Rescue:
Integrating Data Warehouses,
Data Lakes, and Data Hubs
ACCESS YOUR REPORT
denodo.link/LDF21
Data Democratization with
a Logical Data Fabric
REGISTER NOW
denodo.link/FD22
APAC | April 27 | 9:30 am SGT
Thanks!
www.denodo.com info@denodo.com
© Copyright Denodo Technologies. All rights reserved
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm,
without the prior written authorization from Denodo Technologies.

More Related Content

Similar to Data Lakes: A Logical Approach for Faster Unified Insights (ASEAN)

Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationDenodo
 
Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...
Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...
Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...Denodo
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Denodo
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?Denodo
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationDenodo
 
Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItDenodo
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...DATAVERSITY
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Denodo
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesDATAVERSITY
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 
Logical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business OutcomesLogical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business OutcomesDenodo
 
What is Data Lake and its Benefits?
What is Data Lake and its Benefits?What is Data Lake and its Benefits?
What is Data Lake and its Benefits?V2Soft
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?DATAVERSITY
 
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureShaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureDenodo
 
An Overview of Data Lake
An Overview of Data LakeAn Overview of Data Lake
An Overview of Data LakeIRJET Journal
 
Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...
Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...
Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...Denodo
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Denodo
 

Similar to Data Lakes: A Logical Approach for Faster Unified Insights (ASEAN) (20)

Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
 
Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...
Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...
Myth Busters: I’m Building a Data Lake, So I Don’t Need Data Virtualization (...
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal Modernization
 
Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need It
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 
Logical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business OutcomesLogical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business Outcomes
 
What is Data Lake and its Benefits?
What is Data Lake and its Benefits?What is Data Lake and its Benefits?
What is Data Lake and its Benefits?
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureShaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
 
Benefits of a data lake
Benefits of a data lake Benefits of a data lake
Benefits of a data lake
 
An Overview of Data Lake
An Overview of Data LakeAn Overview of Data Lake
An Overview of Data Lake
 
Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...
Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...
Why a Data Services Marketplace is Critical for a Successful Data-Driven Ente...
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 

More from Denodo

Enterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoEnterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoDenodo
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachDenodo
 
Achieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerAchieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerDenodo
 
What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?Denodo
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeDenodo
 
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Denodo
 
Drive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDrive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDenodo
 
Знакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхЗнакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхDenodo
 
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationData Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationDenodo
 
Denodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo
 
Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Denodo
 
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardIt’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardDenodo
 
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Denodo
 
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Denodo
 
How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?Denodo
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsDenodo
 
Enabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityEnabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityDenodo
 
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo
 
GenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesGenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesDenodo
 

More from Denodo (20)

Enterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoEnterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in Denodo
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
 
Achieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerAchieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services Layer
 
What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business Landscape
 
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Data Lakes: A Logical Approach for Faster Unified Insights (ASEAN)

  • 1. DENODO LUNCH AND LEARN ASEAN Data Lakes: A Logical Approach for Faster Unified Insights
  • 2. Speakers Elaine Chan Regional Vice President, ASEAN & Korea Chris Day Director, APAC Sales Engineering
  • 3. Agenda: 1. What is a Data Lake? 2. Why Do They Exist? 3. Some of the Challenges of Data Lakes 4. The Benefits of a Logical Approach to Data Lakes 5. Customer Case Study 6. Demo 7. Conclusion 8. Q&A 9. Next Steps
  • 4. Data Lake: A Brief History
  • 5. Etymology of “Data Lake”: Pentaho’s CTO James Dixon is credited with coining the term "data lake". He described it in his blog in 2010: "If you think of a data mart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples." https://jamesdixon.wordpress.com/2010/10/14/pentaho-hadoop-and-data-lakes/
  • 6. Data lakes were born to address the challenge of cost reduction: they allow cheap, efficient storage of very large amounts of data. Cloud implementations reduced the complexity of managing a large data lake.
  • 7. The Data Lake – Architecture I: Distributed File System
      • Cheap storage for large data volumes
      • Support for multiple file formats (Parquet, CSV, JSON, etc.)
      • Examples: on-prem: HDFS; cloud native: AWS S3, Azure ADLS, Google GCS
  • 8. The Data Lake – Architecture II: Execution Engine (on top of the Distributed File System)
      • Massively parallel & scalable execution engine
      • Cheaper execution than traditional EDW architectures
      • Decoupled from storage
      • Doesn’t require specialized HW
      • Examples: SQL-on-Hadoop engines: Spark, Hive, Impala, Drill, Dremio, Presto, etc.; cloud native: AWS Redshift, Snowflake, AWS Athena, Delta Lake, GCP BigQuery
  • 9. The Data Lake – Architecture III: Adoption of new transformation techniques
      • Data ingested is normally raw and unusable by end users
      • Data is transformed and moved to different “zones” with different levels of curation
      • End users only access the refined zone
      • Use of ELT as a cheaper transformation technique than ETL
      • Use of the engine and storage of the lake for data transformation instead of external ETL flows
      • Removes the need for additional staging HW
      • Diagram: Raw zone, Trusted zone and Refined zone, built on the Distributed File System and Execution Engine
  • 10. Data Lake Example – AWS
      • Data ingested using AWS Glue (or other ETL tools)
      • Raw data stored in S3 object store
      • Maintain fidelity and structure of data
      • Metadata extracted/enriched using Glue Data Catalog
      • Business rules/DQ rules applied to S3 data as copied to Trusted Zone data stores
      • Trusted Zone contains more than one data store: select the best data store for the data and data processing
      • Refined Zone contains data for consumers: curated data sets (data marts?)
      • Refined Zone data stores differ: Redshift, Athena, Snowflake, …
      • Diagram: Data Sources (internal & external), Ingestion (AWS Glue), Raw Zone (S3 for raw data), Trusted Zone, Refined Zone, Consumers (data portals, BI and visualization, analytic workbench, mobile apps, etc.)
  • 11. Hadoop-Based Data Lakes – A Data Scientist’s Playground
      • The early data scientists saw Hadoop as their personal supercomputer.
      • Hadoop-based Data Lakes helped democratize access to state-of-the-art supercomputing with off-the-shelf HW (and later cloud)
      • The industry push for BI made Hadoop-based solutions the standard to bring modern analytics to any corporation.
      • Hadoop-based Data Lakes became “data science silos”
  • 12. Can data lakes also address the other data management challenges? Can they provide fast decision making with proper governance and security?
  • 13. Changing the Data Lake Goals: “The popular view is that a data lake will be the one destination for all the data in their enterprise and the optimal platform for all their analytics.” Nick Heudecker, Gartner
  • 14. Multi-purpose data lakes are data delivery environments developed to support a broad range of users, from traditional self-service BI users (e.g. finance, marketing, human resource, transport) to sophisticated data scientists. Multi-purpose data lakes allow a broader and deeper use of the data lake investment without minimizing the potential value for data science and without making it an inflexible environment. Rick Van der Lans, R20 Consultancy
  • 15. The Data Lake as the Repository of All Data: Is that realistic? Even if it is possible, it comes with multiple trade-offs (COST, GOVERNANCE):
      • Huge up-front investment: Creating ingestion pipelines for all company datasets into the lake is costly.
      • Large recurrent maintenance costs: Those pipelines need to be constantly modified as data structures change in the sources.
      • Risk of inconsistencies: Data needs to be frequently synchronized to avoid stale datasets.
      • Loss of capabilities: Data lake capabilities may differ from those of original sources, e.g. quick access by ID in operational RDBMS.
      • Summary: Efficient use of the data lake to accelerate insights comes at the cost of price, time-to-market and governance.
  • 16. Purpose-specific Data Lakes: Restricting the use of the data lake to a specific use case (e.g. Data Science) (TTM, SECURITY):
      • Higher Complexity: End users need to find where data is and how to use it.
      • Risk of Inconsistencies: Data may be in multiple places, in different formats and calculated at different times.
      • Loss of Security: Frustrations increase the use of Shadow IT, “personal” extracts, uncontrolled data prep flows, etc.
      • Summary: An environment with multiple purpose-specific systems slows down TTM and jeopardizes security and governance.
  • 17. Data Lakes in the ‘Pit of Despair’: Data Lakes are 2-5 years from the Plateau of Productivity and are deep in the Trough of Disillusionment. Source: Gartner, Hype Cycle Data Management, July 2021
  • 18. Gartner – The Evolution of Analytical Environments: This is a Second Major Cycle of Analytical Consolidation
      • 1980s, Pre EDW: Fragmented/nonexistent analysis; multiple sources; multiple structured sources
      • 1990s, EDW: Unified analysis; consolidated data; "Collect the data"; single server, multiple nodes; more analysis than any one server can provide
      • 2000s, Post EDW: Fragmented analysis; "Collect the data" (into different repositories); new data types, processing, requirements; uncoordinated views
      • 2010s, LDW: Unified analysis; logically consolidated view of all data; "Connect and collect"; multiple servers, of multiple nodes; more analysis than any one system can provide
      • Source: “Adopt the Logical Data Warehouse Architecture to Meet Your Modern Analytical Needs”. Henry Cook, Gartner April 2018. ©2018 Gartner, Inc. ID: 342254
  • 19. Gartner – Logical Data Warehouse (diagram, annotated with DATA VIRTUALIZATION). Source: “Adopt the Logical Data Warehouse Architecture to Meet Your Modern Analytical Needs”. Henry Cook, Gartner April 2018
  • 20. “…Data lakes lack semantic consistency and governed metadata. Meeting the needs of wider audiences require curated repositories with governance, semantic consistency and access controls.”
  • 21. How can a logical data fabric approach help?
  • 22. Faster Time-to-Market for Data Projects (TTM, COST)
      Why?
      • The Data Virtualization Platform allows you to connect directly to all kinds of data sources (EDW, application databases, SaaS applications, etc.)
      • Thus not all data needs to be replicated to the data lake for consumers to access it from a single (virtual) repository.
      • In some cases it makes sense to replicate in the lake, in others it doesn’t. Data Virtualization opens that door.
      Capabilities
      • Data can be accessed immediately, easily improving TTM and ROI of the lake
      • If data is not useful, time was not lost preparing pipelines and copying data
      • Can ingest and synchronize data into the lake efficiently when needed
      • Denodo can load and update data into the data lake natively, using Parquet and parallel loads
      • Execution is pushed down to original sources, taking advantage of their capabilities
      • Especially significant in the case of EDW with strong processing capabilities
  • 23. Easier Self-Service through a Single Data Delivery Layer (GOVERNANCE)
      Why?
      • From an end user perspective, access to all data is done through a single layer, regardless of data formats and its actual physical location.
      • A single delivery layer also allows you to enforce security and governance policies
      • The virtual layer becomes the “delivery zone” of the data lake, offering modeling and caching capabilities, documentation and output in multiple formats
      Capabilities
      • Built-in rich modeling capabilities to tailor data models to end users
      • Integrated catalog, search and documentation capabilities
      • Access via SQL, REST, OData and GraphQL with no additional coding
      • Advanced security controls, SSO, workload management, monitoring, etc.
  • 24. Accelerates Query Execution (PERFORMANCE)
      Why?
      • Controlling data delivery separately from storage allows a virtual layer to accelerate query execution, providing faster response than the sources alone.
      Capabilities
      • Aggregate-aware capabilities to accelerate execution of analytical queries
      • Flexible caching options to materialize frequently used data: full datasets; partial results; hybrid (cached content + updates from source in real time)
      • Powerful optimization capabilities for multi-source federated queries
  • 25. Denodo’s Logical Data Lake (diagram)
      • IT Storage and Processing: ETL, Data Warehouse, Kafka, Physical Data Lake, Files
      • Logical Data Lake (Delivery Zone): Query Engine, Business Delivery, Source Abstraction, Business Catalog, Security and Governance
      • Consuming Tools: BI & Reporting, Mobile Applications, Predictive Analytics, AI/ML, Real time dashboards
  • 26. Case Study
  • 27. Case Study: Leading Construction Manufacturer Improves Service Delivery and Revenue
      • Company: In business for over 90 years, it is the world’s leading manufacturer of construction and mining equipment, diesel and natural gas engines, industrial gas turbines and diesel-electric locomotives.
      Problem
      • Competitive pressure from low-cost Chinese manufacturers
      • Needed a proactive approach to customer service to differentiate
      • Sought to improve equipment and services delivery through predictive maintenance
      Solution
      • Telemetry (IoT) data from sensors embedded in the equipment is stored in Hadoop to perform predictive analytics
      • Denodo integrates analytics data with parts, maintenance, and dealer information stored in traditional systems
      • It then feeds the predictive maintenance information to a customer dashboard
      Results
      • Phased rollout systematically improved asset performance and proactive maintenance
      • Increased revenue from sale of services and parts
      • Reduced warranty costs of parts failure
      • Future: optimize pricing for services and parts among global service providers
  • 28. Architectural Diagram
  • 29. Demo
  • 30. Conclusion
  • 31. Key Takeaways
      1. In most cases, not all the data is going to be in the data lake
      2. Large data lake projects are complex environments that will benefit from a virtual ‘consumption’ layer
      3. Data virtualization provides the governance and management infrastructure required for a successful data lake implementation
      4. Data Virtualization is more than just a data access or services layer; it is a key component of a Data Lake
  • 32. Q&A
  • 33. Next Steps
  • 34. Get Started Today: Try Denodo for a Test Drive with a 30-day free trial in the cloud marketplaces (denodo.link/drive22)
      • CHOICE: Under your cloud account
      • SUPPORT: Community forum AND remote sales engineer
      • OPPORTUNITY: 30 minutes free consultation with Denodo Cloud specialist
  • 35. Logical Data Fabric to the Rescue: Integrating Data Warehouses, Data Lakes, and Data Hubs. ACCESS YOUR REPORT: denodo.link/LDF21
  • 36. Data Democratization with a Logical Data Fabric. REGISTER NOW: denodo.link/FD22. APAC | April 27 | 9:30 am SGT
  • 37. Thanks! www.denodo.com info@denodo.com © Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
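
The "hybrid" caching option mentioned on slide 24 (cached content plus updates from the source in real time) can be illustrated with a minimal Python sketch. This is not Denodo's implementation or API: the class name `HybridCachedView`, the two callables, and the `version` column used to detect changed rows are all hypothetical, chosen only to show the idea of merging a materialized copy with rows that changed at the source after the cache was loaded.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class HybridCachedView:
    """Hypothetical view: cached rows plus rows changed at the source since the cache load."""
    fetch_all: Callable[[], List[Row]]        # full read from the source (assumed)
    fetch_since: Callable[[int], List[Row]]   # rows whose version is greater than the argument (assumed)
    key: str = "id"
    _cache: Dict[object, Row] = field(default_factory=dict)
    _cached_version: int = 0

    def refresh(self) -> None:
        # Materialize the full dataset and remember the highest version seen.
        rows = self.fetch_all()
        self._cache = {r[self.key]: r for r in rows}
        self._cached_version = max((r["version"] for r in rows), default=0)

    def query(self) -> List[Row]:
        # Start from the cached copy, then overlay rows changed at the source.
        merged = dict(self._cache)
        for r in self.fetch_since(self._cached_version):
            merged[r[self.key]] = r
        return sorted(merged.values(), key=lambda r: r[self.key])


# Usage with an in-memory list standing in for an operational source.
source = [
    {"id": 1, "status": "ok", "version": 1},
    {"id": 2, "status": "ok", "version": 2},
]
view = HybridCachedView(
    fetch_all=lambda: list(source),
    fetch_since=lambda v: [r for r in source if r["version"] > v],
)
view.refresh()

# The source changes after the cache was loaded: one new row, one updated row.
source.append({"id": 3, "status": "new", "version": 3})
source[0] = {"id": 1, "status": "failed", "version": 4}

result = view.query()
for row in result:
    print(row)
```

The design choice here is that only the changed rows are read from the source at query time, so the bulk of the answer comes from the cache while the result still reflects recent updates. A real platform would also need to handle deletes and cache expiry, which this sketch leaves out.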