Performance in Denodo 6.0
Pablo Alvarez, Principal Technical Account Manager
Agenda
1. Debunking the myths of virtual performance
2. Query Optimizer
3. Cache
4. Resource Management
5. Further Reading
It is a common assumption that a virtualized solution will be
much slower than a persisted approach via ETL:
1. There is a large amount of data moved through the
network for each query
2. Network transfer is slow
But is this really true?
Debunking the myths of virtual performance
1. Complex queries can be solved by transferring moderate data volumes when the right techniques are applied
   • Operational queries
      • Predicate delegation produces small result sets
   • Logical Data Warehouse and Big Data
      • Denodo uses the characteristics of underlying star schemas to apply query rewriting rules that maximize delegation to specialized sources (especially for heavy GROUP BY operations) and minimize data movement
2. Current networks are almost as fast as reading from disk
   • 10 Gb and 100 Gb Ethernet are a commodity
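As a rough back-of-the-envelope check of the second point, the sketch below estimates how long a result set takes to cross a link of a given nominal speed. The row width (100 bytes), the row counts, and the assumption that the link runs at its full nominal rate are illustrative assumptions, not figures from this presentation.

```python
def transfer_seconds(rows: int, bytes_per_row: int, link_gbps: float) -> float:
    """Ideal time to move a result set over a link, ignoring latency and protocol overhead."""
    bits = rows * bytes_per_row * 8
    return bits / (link_gbps * 1e9)

# Hypothetical row width of 100 bytes; 10 Gb Ethernet at its nominal rate.
small_result = transfer_seconds(rows=10_000, bytes_per_row=100, link_gbps=10)
full_fact_table = transfer_seconds(rows=1_000_000_000, bytes_per_row=100, link_gbps=10)

print(f"10,000 rows: {small_result * 1000:.1f} ms")
print(f"1,000,000,000 rows: {full_fact_table:.0f} s")
```

Under these assumptions the link speed matters far less than how many rows the query plan decides to move, which is the argument of the first point.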
Performance Comparison
Logical Data Warehouse vs. Physical Data Warehouse
• Denodo has done extensive testing using queries from the standard benchmarking test TPC-DS* and the following scenario:
   • Customer Dimension: 2 M rows
   • Sales Facts: 290 M rows
   • Items Dimension: 400 K rows
• The baseline was set using the same queries with all data in a Netezza appliance
* TPC-DS is the de facto industry standard benchmark for measuring the performance of decision support solutions including, but not limited to, Big Data systems.
Performance Comparison
Logical Data Warehouse vs. Physical Data Warehouse
Query Description | Returned Rows | Time Netezza | Time Denodo (Federating Oracle, Netezza & SQL Server) | Optimization Technique (automatically selected)
Total sales by customer | 1.99 M | 20.9 sec. | 21.4 sec. | Full aggregation push-down
Total sales by customer and year between 2000 and 2004 | 5.51 M | 52.3 sec. | 59.0 sec. | Full aggregation push-down
Total sales by item brand | 31.35 K | 4.7 sec. | 5.0 sec. | Partial aggregation push-down
Total sales by item where sale price less than current list price | 17.05 K | 3.5 sec. | 5.2 sec. | On the fly data movement
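To make the comparison easier to read, the following sketch computes the extra time of the federated run relative to the Netezza baseline for each query. The timings are copied from the table above; the helper name overhead_percent is just an illustrative choice.

```python
# (query, Netezza seconds, Denodo seconds) copied from the table above
results = [
    ("Total sales by customer", 20.9, 21.4),
    ("Total sales by customer and year between 2000 and 2004", 52.3, 59.0),
    ("Total sales by item brand", 4.7, 5.0),
    ("Total sales by item where sale price less than current list price", 3.5, 5.2),
]

def overhead_percent(physical_s: float, logical_s: float) -> float:
    """Extra time of the federated run relative to the single-appliance baseline."""
    return (logical_s - physical_s) / physical_s * 100

for query, netezza_s, denodo_s in results:
    extra = denodo_s - netezza_s
    print(f"{query}: +{extra:.1f} s ({overhead_percent(netezza_s, denodo_s):.1f}%)")
```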
Performance and optimizations in Denodo 6.0
Focused on 3 core concepts
Dynamic Multi-Source Query Execution Plans
   • Leverages processing power & architecture of data sources
   • Dynamic to support ad hoc queries
   • Uses statistics for cost-based query plans
Selective Materialization
   • Intelligent caching of only the most relevant and often used information
Optimized Resource Management
   • Smart allocation of resources to handle high concurrency
   • Throttling to control and mitigate source impact
   • Resource plans based on rules
Performance and optimizations in Denodo 6.0
Comparing optimizations in DV vs ETL
Although Data Virtualization is a data integration platform, architecturally speaking it is more similar to an RDBMS:
   • Uses relational logic
   • Metadata is equivalent to that of a database
   • Enables ad hoc querying
Key difference between ETL engines and DV:
   • ETL engines are optimized for static bulk movements
      • Fixed data flows
   • Data virtualization is optimized for queries
      • Dynamic execution plan per query
Therefore, the performance architecture presented here resembles that of an RDBMS
Query Optimizer
How Dynamic Query Optimizer Works
Step by Step
Metadata Query Tree
   • Maps query entities (tables, fields) to actual metadata
   • Retrieves execution capabilities and restrictions for views involved in the query
Static Optimizer
   • Query delegation
   • SQL rewriting rules (removal of redundant filters, tree pruning, join reordering, transformation push-up, star-schema rewritings, etc.)
   • Data movement query plans
Cost Based Optimizer
   • Picks optimal JOIN methods and orders based on data distribution statistics, indexes, transfer rates, etc.
Physical Execution Plan
   • Creates the calls to the underlying systems in their corresponding protocols and dialects (SQL, MDX, WS calls, etc.)
How Dynamic Query Optimizer Works
Example: Logical Data Warehouse
Total sales by retailer and product during the last month for the brand ACME
[Diagram: Fact table (sales) with Time, Product, and Retailer dimensions; sources labeled EDW and MDM]
SELECT retailer.name,
       product.name,
       SUM(sales.amount)
FROM sales
  JOIN retailer ON sales.retailer_fk = retailer.id
  JOIN product ON sales.product_fk = product.id
  JOIN time ON sales.time_fk = time.id
WHERE time.date < ADDMONTH(NOW(), -1)
  AND product.brand = 'ACME'
GROUP BY product.name, retailer.name
How Dynamic Query Optimizer Works
Example: Non-optimized
GROUP BY product.name, retailer.name
   Rows after the JOINs: 10,000,000
JOIN, JOIN, JOIN
Queries sent to the sources:
   sales (1,000,000,000 rows):
      SELECT sales.retailer_fk, sales.product_fk, sales.time_fk, sales.amount
      FROM sales
   retailer (100 rows):
      SELECT retailer.name, retailer.id
      FROM retailer
   product (10 rows):
      SELECT product.name, product.id
      FROM product
      WHERE product.brand = 'ACME'
   time (30 rows):
      SELECT time.date, time.id
      FROM time
      WHERE time.date < add_months(CURRENT_TIMESTAMP, -1)
How Dynamic Query Optimizer Works
Step 1: Applies JOIN reordering to maximize delegation
GROUP BY product.name, retailer.name
   Rows after the JOINs: 10,000,000
JOIN, JOIN
Queries sent to the sources:
   sales JOIN time (100,000,000 rows):
      SELECT sales.retailer_fk, sales.product_fk, sales.amount
      FROM sales JOIN time ON sales.time_fk = time.id
      WHERE time.date < add_months(CURRENT_TIMESTAMP, -1)
   retailer (100 rows):
      SELECT retailer.name, retailer.id
      FROM retailer
   product (10 rows):
      SELECT product.name, product.id
      FROM product
      WHERE product.brand = 'ACME'
How Dynamic Query Optimizer Works
Step 2
Since the JOIN is on foreign keys (1-to-many), and the GROUP BY is on attributes from the dimensions, it applies the partial aggregation push-down optimization
GROUP BY product.name, retailer.name
   Rows after the JOINs: 1,000
JOIN, JOIN
Queries sent to the sources:
   sales JOIN time, aggregated by foreign key (10,000 rows):
      SELECT sales.retailer_fk, sales.product_fk, SUM(sales.amount)
      FROM sales JOIN time ON sales.time_fk = time.id
      WHERE time.date < add_months(CURRENT_TIMESTAMP, -1)
      GROUP BY sales.retailer_fk, sales.product_fk
   retailer (100 rows):
      SELECT retailer.name, retailer.id
      FROM retailer
   product (10 rows):
      SELECT product.name, product.id
      FROM product
      WHERE product.brand = 'ACME'
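The idea behind partial aggregation push-down can be sketched with a toy example. This is a minimal illustration, not Denodo's implementation: an in-memory SQLite database stands in for the data warehouse, the dimensions are plain dictionaries, the time dimension is left out for brevity, and all table contents are made up.

```python
import sqlite3
from collections import defaultdict

# Toy stand-in for the data warehouse (sales fact table only).
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE sales (retailer_fk INTEGER, product_fk INTEGER, amount REAL)")
dw.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 10, 5.0), (1, 10, 7.0), (1, 11, 3.0), (2, 10, 4.0), (2, 10, 6.0), (2, 11, 1.0)],
)

# Dimensions as they would arrive from other sources (id -> name).
retailers = {1: "North", 2: "South"}
products = {10: "Widget", 11: "Gadget"}

def aggregate_in_dv_layer():
    """Non-optimized: pull every fact row, then join and aggregate locally."""
    rows = dw.execute("SELECT retailer_fk, product_fk, amount FROM sales").fetchall()
    totals = defaultdict(float)
    for retailer_fk, product_fk, amount in rows:
        totals[(products[product_fk], retailers[retailer_fk])] += amount
    return dict(totals), len(rows)

def partial_aggregation_push_down():
    """Optimized: the source groups by the foreign keys; only partial sums are joined locally."""
    rows = dw.execute(
        "SELECT retailer_fk, product_fk, SUM(amount) FROM sales "
        "GROUP BY retailer_fk, product_fk"
    ).fetchall()
    totals = defaultdict(float)
    for retailer_fk, product_fk, partial_sum in rows:
        # Final aggregation by name still runs locally, in case two ids share a name.
        totals[(products[product_fk], retailers[retailer_fk])] += partial_sum
    return dict(totals), len(rows)

naive_totals, naive_rows = aggregate_in_dv_layer()
pushed_totals, pushed_rows = partial_aggregation_push_down()
print(naive_rows, pushed_rows)
```

Both functions return the same totals, but the second one transfers one row per (retailer, product) pair instead of one row per sale.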
How Dynamic Query Optimizer Works
Step 3
Selects the right JOIN strategy based on costs for data volume estimations
GROUP BY product.name, retailer.name
   Rows after the JOINs: 1,000
NESTED JOIN, HASH JOIN
Queries sent to the sources:
   sales JOIN time, aggregated by foreign key and filtered by the nested JOIN (1,000 rows):
      SELECT sales.retailer_fk, sales.product_fk, SUM(sales.amount)
      FROM sales JOIN time ON sales.time_fk = time.id
      WHERE time.date < add_months(CURRENT_TIMESTAMP, -1)
        AND sales.product_fk IN (1,2,…)
      GROUP BY sales.retailer_fk, sales.product_fk
   retailer (100 rows):
      SELECT retailer.name, retailer.id
      FROM retailer
   product (10 rows):
      SELECT product.name, product.id
      FROM product
      WHERE product.brand = 'ACME'
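The nested JOIN in this step can be sketched as follows: run the small, selective branch first, then send its keys to the large branch as an IN filter. This is a hedged illustration only; two in-memory SQLite databases stand in for two independent sources, and the table contents and the function name nested_join_total_by_product are made up for the example.

```python
import sqlite3

# Two separate in-memory databases stand in for two independent sources.
product_source = sqlite3.connect(":memory:")
product_source.execute("CREATE TABLE product (id INTEGER, name TEXT, brand TEXT)")
product_source.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Anvil", "ACME"), (2, "Rocket", "ACME"), (3, "Kettle", "Other")],
)

sales_source = sqlite3.connect(":memory:")
sales_source.execute("CREATE TABLE sales (product_fk INTEGER, amount REAL)")
sales_source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 2.0), (3, 100.0), (3, 50.0)],
)

def nested_join_total_by_product(brand: str) -> dict:
    # 1. Run the small, selective branch first.
    names = dict(
        product_source.execute("SELECT id, name FROM product WHERE brand = ?", (brand,))
    )
    if not names:
        return {}
    # 2. Send the retrieved keys to the large branch as an IN filter,
    #    so only matching fact rows are aggregated and returned.
    placeholders = ", ".join("?" for _ in names)
    rows = sales_source.execute(
        "SELECT product_fk, SUM(amount) FROM sales "
        f"WHERE product_fk IN ({placeholders}) GROUP BY product_fk",
        list(names),
    ).fetchall()
    # 3. Combine both sides locally.
    return {names[product_fk]: total for product_fk, total in rows}

acme_totals = nested_join_total_by_product("ACME")
print(acme_totals)
```

This strategy pays off when the driving branch returns few rows, which is why the choice is left to cost estimates.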
How Dynamic Query Optimizer Works
Summary
• Automatic JOIN reordering groups branches that go to the same source to maximize query delegation and reduce processing in the DV layer
   • End users don’t need to worry about the optimal “pairing” of the tables
• The Partial Aggregation push-down optimization is key in those scenarios. Based on PK-FK restrictions, it pushes the aggregation (for the PKs) to the DW
   • Leverages the processing power of the DW, optimized for these aggregations
   • Significantly reduces the data transferred through the network (from 1 billion to 10,000 rows)
• The Cost-based Optimizer picks the right JOIN strategies based on estimations of data volumes, existence of indexes, transfer rates, etc.
   • Denodo estimates costs differently for parallel databases (Vertica, Netezza, Teradata) than for regular databases, to take into account the different way those systems operate (distributed data, parallel processing, different aggregation techniques, etc.)
How Dynamic Query Optimizer Works
Other relevant optimization techniques for LDW and Big Data
• Pruning of unnecessary JOIN branches (based on 1-to-many associations) when the attributes of the 1-side are not projected
   • Relevant for vertical partitioning and “fat” semantic models when queries do not need attributes from all the tables
   • Unnecessary tables are removed from the query (even for single-source models)
• Pruning of UNION branches based on incompatible filters
   • Enables detection of unnecessary UNION branches in horizontal partitioning scenarios
• Automatic data movement
   • Creation of temp tables in one of the systems to enable complete delegation of a federated branch
   • The target source needs to have the “data movement” option enabled for this option to be taken into account
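Pruning of UNION branches can be illustrated with a small sketch. It assumes each branch of the UNION declares the range of years it holds, so a branch whose range cannot overlap the query filter is never queried. The class, the source names, and the year ranges are hypothetical and only serve the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UnionBranch:
    source: str      # hypothetical source name
    min_year: int    # the branch is known to hold only rows with
    max_year: int    # min_year <= year <= max_year

def prune_union_branches(branches, year_from, year_to):
    """Keep only branches whose year range can overlap the query filter."""
    return [b for b in branches if b.max_year >= year_from and b.min_year <= year_to]

branches = [
    UnionBranch("historical_store", 2000, 2013),
    UnionBranch("data_warehouse", 2014, 2016),
]
kept = prune_union_branches(branches, year_from=2015, year_to=2016)
print([b.source for b in kept])
```

With a filter on 2015 to 2016, only the second branch remains, so the first source receives no query at all.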
Caching
Caching
Real time vs. caching
Sometimes, real-time access and federation are not a good fit:
   • Sources are slow (e.g. text files, cloud apps like Salesforce.com)
   • A lot of data processing is needed (e.g. complex combinations, transformations, matching, cleansing, etc.)
   • Access is limited, or the impact on the sources has to be mitigated
For these scenarios, Denodo can replicate just the relevant data in the cache
Caching
Overview
Denodo’s cache system is based on an external relational database
   • Traditional (Oracle, SQL Server, DB2, MySQL, etc.)
   • MPP (Teradata, Netezza, Vertica, Redshift, etc.)
   • In-memory storage (Oracle TimesTen, SAP HANA)
Works at view level
   • Allows hybrid access (real-time / cached) of an execution tree
Cache Control (population / maintenance)
   • Manually: user initiated at any time
   • Time based: using the TTL or the Denodo Scheduler
   • Event based: e.g. using JMS messages triggered in the DB
Caching
Caching options
Denodo offers two different types of cache
   • Partial:
      • Query-by-query cache
      • Useful for caching only the most commonly requested data
      • More adequate to represent the capabilities of non-relational sources, like web services or APIs with input parameters
   • Full:
      • Similar to the concept of materialized view
      • Incrementally updateable at row level to avoid unnecessary full refresh loads
      • Offers full push-down capabilities to the source, including group by and join operations
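The query-by-query behavior of a partial cache can be sketched as a store with one entry per distinct set of input parameters, each expiring after a TTL. This is an illustration under stated assumptions: a Python dictionary stands in for the external relational database that the cache actually uses, and the class, function, and parameter names are hypothetical.

```python
import time

class PartialCache:
    """Query-by-query cache: one entry per distinct set of input parameters."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # function that queries the real source
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # params -> (expiry time, result)

    def query(self, **params):
        key = tuple(sorted(params.items()))
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                       # cache hit
        result = self._fetch(**params)            # miss or expired: go to the source
        self._entries[key] = (self._clock() + self._ttl, result)
        return result

source_calls = []

def slow_web_service(customer_id):
    """Hypothetical parameterized source; records each real call."""
    source_calls.append(customer_id)
    return {"customer_id": customer_id, "status": "active"}

cache = PartialCache(slow_web_service, ttl_seconds=60)
cache.query(customer_id=1)
cache.query(customer_id=1)   # served from the cache
cache.query(customer_id=2)   # different parameters, new source call
print(source_calls)
```

Only the parameter combinations that are actually requested get stored, which is what makes this mode a natural fit for sources with input parameters.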
Hybrid Performance for SaaS sources
Incremental Queries (Available July 2016)
Merge cached data and fresh data to provide fully up-to-date results with minimum latency
[Diagram: "Get Leads changed / added since 1:00AM" from the source is combined with "CACHE: Leads updated at 1:00AM" to produce "Up-to-date Leads data"]
1. Salesforce ‘Leads’ data cached in VDP at 1:00 AM
2. Query needing Leads data arrives at 11:00 AM
3. Only new/changed leads are retrieved through the WAN
4. Response is up-to-date but query is much faster
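The merge step of an incremental query can be sketched as overlaying the rows changed since the last cache load on top of the cached rows. This is a minimal sketch with made-up sample data; it assumes each row carries a unique key (here "id") and it does not handle rows deleted at the source.

```python
def merge_incremental(cached_rows, changed_rows, key="id"):
    """Overlay rows changed since the last cache load on top of the cached rows."""
    merged = {row[key]: row for row in cached_rows}
    for row in changed_rows:
        merged[row[key]] = row      # updated rows replace the cached version, new rows are added
    return list(merged.values())

# Cached at 1:00 AM (hypothetical sample data).
cached_leads = [
    {"id": 1, "name": "Ann", "status": "new"},
    {"id": 2, "name": "Bob", "status": "new"},
]
# Only the leads changed or added since 1:00 AM are fetched from the remote source.
changed_leads = [
    {"id": 2, "name": "Bob", "status": "contacted"},
    {"id": 3, "name": "Cy", "status": "new"},
]
up_to_date = merge_incremental(cached_leads, changed_leads)
print(up_to_date)
```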
Resource Management
Resource Management
Advanced Memory Management
   • Dynamic data buffers to control federation of sources with different data retrieval speeds, which guarantees a low memory footprint
   • All operations are memory-constrained to prevent monopolization of resources by a single query. The constraints are adjustable.
   • Swapping data to disk to handle large data sets without overloading memory
   • On-the-fly modification of execution plans to prevent exceeding memory thresholds
Server Throttling Mechanisms
   • Control settings to limit concurrency (max. queries, max. threads…)
   • Waiting queues for inbound connections
   • Connection pools for data sources
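A concurrency limit with a waiting queue can be sketched with a semaphore: at most a fixed number of queries run at once and additional callers block until a slot frees up. This is a generic illustration of the throttling idea, not Denodo's mechanism; the class name, the limit of 2, and the fake query are assumptions made for the example.

```python
import threading
import time

class QueryThrottle:
    """Runs at most max_concurrent queries at once; extra callers wait their turn."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._running = 0
        self.peak_running = 0      # highest concurrency observed, for inspection

    def run(self, query_fn):
        with self._slots:                      # blocks while all slots are taken
            with self._lock:
                self._running += 1
                self.peak_running = max(self.peak_running, self._running)
            try:
                return query_fn()
            finally:
                with self._lock:
                    self._running -= 1

throttle = QueryThrottle(max_concurrent=2)
results = []

def fake_query():
    time.sleep(0.05)                            # stands in for real query work
    return "done"

threads = [threading.Thread(target=lambda: results.append(throttle.run(fake_query)))
           for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results), throttle.peak_running)
```

All six queries complete, but never more than two of them are running at the same moment.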
Resource Management
Enterprise Resource Manager
   • Apply resource restrictions based on a set of rules
   • Rules classify sessions into groups (e.g. by user, role, application, source IP…)
      • E.g. sessions from the application ‘single customer view’ are assigned to a group called ‘high priority transactional’
   • Apply restrictions for each group
      • Change priority, change concurrency settings, change max. timeouts, etc.
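The rule-based flow described above (classify the session into a group, then apply that group's restrictions) can be sketched as follows. All group names other than the one in the slide, all limits, and the "first matching rule wins" policy are assumptions made for illustration, not Denodo configuration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class Session:
    user: str
    application: str

@dataclass(frozen=True)
class Restrictions:
    priority: str
    max_concurrent_queries: int
    timeout_seconds: int

@dataclass(frozen=True)
class Rule:
    group: str
    matches: Callable[[Session], bool]

# All names and limits below are hypothetical.
RULES = [
    Rule("high priority transactional", lambda s: s.application == "single customer view"),
    Rule("reporting", lambda s: s.user.startswith("report_")),
]
RESTRICTIONS = {
    "high priority transactional": Restrictions("high", 50, 30),
    "reporting": Restrictions("low", 5, 600),
}
DEFAULT = Restrictions("normal", 20, 120)

def classify(session: Session) -> Optional[str]:
    """Return the group of the first matching rule, or None if no rule matches."""
    for rule in RULES:
        if rule.matches(session):
            return rule.group
    return None

def restrictions_for(session: Session) -> Restrictions:
    """Look up the restrictions of the session's group, falling back to the default."""
    return RESTRICTIONS.get(classify(session), DEFAULT)

print(restrictions_for(Session(user="ann", application="single customer view")))
```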
Further Reading
Further Reading
See also the following articles by our CTO, Alberto Pan, on our blog:
• Myths in data virtualization performance
   • http://www.datavirtualizationblog.com/myths-in-data-virtualization-performance/
• Performance of Data Virtualization in Logical Data Warehouse scenarios
   • http://www.datavirtualizationblog.com/performance-data-virtualization-logical-data-warehouse-scenarios/
• Physical vs Logical Data Warehouse: the numbers
   • http://www.datavirtualizationblog.com/physical-logical-data-warehouse-performance-numbers/
Thanks!
www.denodo.com info@denodo.com
© Copyright Denodo Technologies. All rights reserved
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.

 
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Denodo
 
Drive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDrive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDenodo
 
Знакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхЗнакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхDenodo
 
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationData Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationDenodo
 
Denodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo
 
Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Denodo
 
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardIt’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardDenodo
 
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Denodo
 
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Denodo
 
How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?Denodo
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsDenodo
 
Enabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityEnabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityDenodo
 
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo
 
GenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesGenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesDenodo
 

More from Denodo (20)

Enterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoEnterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in Denodo
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
 
Achieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerAchieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services Layer
 
What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business Landscape
 
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
 
Drive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDrive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory Compliance
 
Знакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхЗнакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данных
 
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationData Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
 
Denodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me Anything
 
Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!
 
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardIt’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
 
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
 
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
 
How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
 
Enabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityEnabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usability
 
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
 
GenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesGenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidades
 

Recently uploaded

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 

Recently uploaded (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and Operational Scenarios

  • 1. Performance in Denodo 6.0
      Pablo Alvarez, Principal Technical Account Manager
  • 2. Agenda
      1. Debunking the myths of virtual performance
      2. Query Optimizer
      3. Cache
      4. Resource Management
      5. Further Reading
  • 3. It is a common assumption that a virtualized solution will be much slower than a persisted approach via ETL:
      1. A large amount of data is moved through the network for each query
      2. Network transfer is slow
      But is this really true?
  • 4. Debunking the myths of virtual performance
      1. Complex queries can be solved by transferring moderate data volumes when the right techniques are applied
          • Operational queries
              • Predicate delegation produces small result sets
          • Logical Data Warehouse and Big Data
              • Denodo uses characteristics of underlying star schemas to apply query rewriting rules that maximize delegation to specialized sources (especially heavy GROUP BY) and minimize data movement
      2. Current networks are almost as fast as reading from disk
          • 10 Gb and 100 Gb Ethernet are a commodity
  • 5. Performance Comparison: Logical Data Warehouse vs. Physical Data Warehouse
      • Denodo has done extensive testing using queries from the standard benchmarking test TPC-DS* and the following scenario:
          • Customer Dimension: 2 M rows
          • Sales Facts: 290 M rows
          • Items Dimension: 400 K rows
      • The baseline was set using the same queries with all data in a Netezza appliance
      * TPC-DS is the de facto industry standard benchmark for measuring the performance of decision support solutions including, but not limited to, Big Data systems.
  • 6. Performance Comparison: Logical Data Warehouse vs. Physical Data Warehouse
      Each entry lists: query description; returned rows; time in Netezza; time in Denodo (federating Oracle, Netezza & SQL Server); optimization technique (automatically selected).
      • Total sales by customer; 1.99 M rows; 20.9 sec.; 21.4 sec.; full aggregation push-down
      • Total sales by customer and year between 2000 and 2004; 5.51 M rows; 52.3 sec.; 59.0 sec.; full aggregation push-down
      • Total sales by item brand; 31.35 K rows; 4.7 sec.; 5.0 sec.; partial aggregation push-down
      • Total sales by item where sale price less than current list price; 17.05 K rows; 3.5 sec.; 5.2 sec.; on-the-fly data movement
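To put the timings above in perspective, the relative overhead of the federated run can be worked out directly from the published numbers. The short Python sketch below only does that arithmetic; the query labels are abbreviated and the function name `overhead_pct` is an illustrative choice, not something from the deck.

```python
# Timings in seconds, copied from the comparison above: (Netezza, Denodo).
timings = {
    "Total sales by customer": (20.9, 21.4),
    "Total sales by customer and year (2000-2004)": (52.3, 59.0),
    "Total sales by item brand": (4.7, 5.0),
    "Total sales by item, sale price < list price": (3.5, 5.2),
}


def overhead_pct(physical: float, logical: float) -> float:
    """Relative slowdown of the federated run versus the single-appliance run."""
    return (logical - physical) / physical * 100.0


overheads = {query: overhead_pct(p, l) for query, (p, l) in timings.items()}

for query, pct in overheads.items():
    print(f"{query}: +{pct:.1f}%")
```

The three push-down queries stay within a few percent to low double digits of the physical baseline; the data-movement query carries the largest relative overhead, although in absolute terms the difference is under two seconds.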
  • 7. Performance and optimizations in Denodo 6.0: focused on 3 core concepts
      • Dynamic Multi-Source Query Execution Plans
          • Leverages processing power & architecture of data sources
          • Dynamic to support ad hoc queries
          • Uses statistics for cost-based query plans
      • Selective Materialization
          • Intelligent caching of only the most relevant and often used information
      • Optimized Resource Management
          • Smart allocation of resources to handle high concurrency
          • Throttling to control and mitigate source impact
          • Resource plans based on rules
  • 8. Performance and optimizations in Denodo 6.0: comparing optimizations in DV vs ETL
      Although Data Virtualization is a data integration platform, architecturally speaking it is more similar to an RDBMS:
          • Uses relational logic
          • Metadata is equivalent to that of a database
          • Enables ad hoc querying
      Key difference between ETL engines and DV:
          • ETL engines are optimized for static bulk movements (fixed data flows)
          • Data virtualization is optimized for queries (dynamic execution plan per query)
      Therefore, the performance architecture presented here resembles that of an RDBMS
  • 10. How Dynamic Query Optimizer Works: Step by Step
      • Metadata Query Tree
          • Maps query entities (tables, fields) to actual metadata
          • Retrieves execution capabilities and restrictions for views involved in the query
      • Static Optimizer
          • Query delegation
          • SQL rewriting rules (removal of redundant filters, tree pruning, join reordering, transformation push-up, star-schema rewritings, etc.)
          • Data movement query plans
      • Cost Based Optimizer
          • Picks optimal JOIN methods and orders based on data distribution statistics, indexes, transfer rates, etc.
      • Physical Execution Plan
          • Creates the calls to the underlying systems in their corresponding protocols and dialects (SQL, MDX, WS calls, etc.)
  • 11. How Dynamic Query Optimizer Works. Example: Logical Data Warehouse
      Total sales by retailer and product during the last month for the brand ACME
      Diagram: Time Dimension, Fact table (sales), Product Dimension, Retailer Dimension; sources: EDW, MDM

          SELECT retailer.name, product.name, SUM(sales.amount)
          FROM sales
            JOIN retailer ON sales.retailer_fk = retailer.id
            JOIN product ON sales.product_fk = product.id
            JOIN time ON sales.time_fk = time.id
          WHERE time.date > ADDMONTH(NOW(), -1)
            AND product.brand = 'ACME'
          GROUP BY product.name, retailer.name
  • 12. How Dynamic Query Optimizer Works. Example: Non-optimized
      Plan: three JOINs followed by GROUP BY product.name, retailer.name, all executed in the DV layer; the joins produce 10,000,000 rows before the GROUP BY.
      Queries sent to the sources:

          Sales (1,000,000,000 rows):
          SELECT sales.retailer_fk, sales.product_fk, sales.time_fk, sales.amount
          FROM sales

          Retailer (100 rows):
          SELECT retailer.name, retailer.id
          FROM retailer

          Product (10 rows):
          SELECT product.name, product.id
          FROM product
          WHERE product.brand = 'ACME'

          Time (30 rows):
          SELECT time.date, time.id
          FROM time
          WHERE time.date > add_months(CURRENT_TIMESTAMP, -1)
  • 13. How Dynamic Query Optimizer Works. Step 1: Applies JOIN reordering to maximize delegation
      Plan: two JOINs followed by GROUP BY product.name, retailer.name in the DV layer; the joins produce 10,000,000 rows before the GROUP BY.
      Queries sent to the sources:

          Sales joined with time (100,000,000 rows):
          SELECT sales.retailer_fk, sales.product_fk, sales.amount
          FROM sales
            JOIN time ON sales.time_fk = time.id
          WHERE time.date > add_months(CURRENT_TIMESTAMP, -1)

          Retailer (100 rows):
          SELECT retailer.name, retailer.id
          FROM retailer

          Product (10 rows):
          SELECT product.name, product.id
          FROM product
          WHERE product.brand = 'ACME'
  • 14. How Dynamic Query Optimizer Works. Step 2
      Since the JOIN is on foreign keys (1-to-many), and the GROUP BY is on attributes from the dimensions, it applies the partial aggregation push-down optimization.
      Plan: two JOINs followed by GROUP BY product.name, retailer.name in the DV layer; the joins produce 1,000 rows before the GROUP BY.
      Queries sent to the sources:

          Sales joined with time, pre-aggregated (10,000 rows):
          SELECT sales.retailer_fk, sales.product_fk, SUM(sales.amount)
          FROM sales
            JOIN time ON sales.time_fk = time.id
          WHERE time.date > add_months(CURRENT_TIMESTAMP, -1)
          GROUP BY sales.retailer_fk, sales.product_fk

          Retailer (100 rows):
          SELECT retailer.name, retailer.id
          FROM retailer

          Product (10 rows):
          SELECT product.name, product.id
          FROM product
          WHERE product.brand = 'ACME'
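The reason the partial aggregation push-down is safe can be checked on a toy example: summing by the foreign keys first and then re-aggregating by the dimension attributes gives the same totals as joining every fact row first. The Python sketch below is only an illustration of that equivalence; the rows, names, and function names are hypothetical, and it says nothing about how Denodo implements the rewrite internally.

```python
from collections import defaultdict

# Hypothetical toy rows: (retailer_fk, product_fk, amount).
sales = [
    (1, 10, 5.0), (1, 10, 7.0), (2, 10, 3.0),
    (2, 11, 4.0), (1, 11, 1.0), (1, 10, 2.0),
]
retailer = {1: "North", 2: "South"}   # id -> name
product = {10: "Anvil", 11: "Rocket"}  # id -> name


def join_then_aggregate(sales, retailer, product):
    """Non-optimized plan: every fact row travels to the DV layer."""
    totals = defaultdict(float)
    for r_fk, p_fk, amount in sales:
        totals[(product[p_fk], retailer[r_fk])] += amount
    return dict(totals), len(sales)


def partial_aggregation_push_down(sales, retailer, product):
    """Optimized plan: the source groups by the FKs, the DV layer finishes."""
    # Step done at the source: GROUP BY retailer_fk, product_fk.
    partial = defaultdict(float)
    for r_fk, p_fk, amount in sales:
        partial[(r_fk, p_fk)] += amount
    # Step done in the DV layer: join to the dimensions and re-aggregate.
    totals = defaultdict(float)
    for (r_fk, p_fk), subtotal in partial.items():
        totals[(product[p_fk], retailer[r_fk])] += subtotal
    return dict(totals), len(partial)


naive_totals, naive_rows = join_then_aggregate(sales, retailer, product)
pushed_totals, pushed_rows = partial_aggregation_push_down(sales, retailer, product)
print(naive_rows, "rows transferred without push-down,", pushed_rows, "with it")
```

The second aggregation in the DV layer is still needed because several keys can map to the same dimension attribute value; it simply runs over far fewer rows.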
  • 15. How Dynamic Query Optimizer Works. Step 3
      Selects the right JOIN strategy based on costs for data volume estimations.
      Plan: a NESTED JOIN and a HASH JOIN followed by GROUP BY product.name, retailer.name in the DV layer; the joins produce 1,000 rows before the GROUP BY. The nested join adds the filter WHERE product.id IN (1,2,…) to the sales branch.
      Queries sent to the sources:

          Sales joined with time, pre-aggregated and filtered by the nested join (1,000 rows):
          SELECT sales.retailer_fk, sales.product_fk, SUM(sales.amount)
          FROM sales
            JOIN time ON sales.time_fk = time.id
          WHERE time.date > add_months(CURRENT_TIMESTAMP, -1)
          GROUP BY sales.retailer_fk, sales.product_fk

          Retailer (100 rows):
          SELECT retailer.name, retailer.id
          FROM retailer

          Product (10 rows):
          SELECT product.name, product.id
          FROM product
          WHERE product.brand = 'ACME'
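The difference between the two JOIN strategies in this step can be sketched in a few lines of Python. In the hash join both branches are fetched in full and matched in memory; in the nested join the keys of the small branch are read first and sent to the other source as an IN filter, so fewer rows are transferred. All names and rows below are hypothetical, and `fetch_sales` merely simulates a source that accepts such a filter.

```python
def hash_join(left_rows, right_rows, left_key, right_key):
    """Build an in-memory index on the right side, then probe it with the left."""
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)
    return [{**l, **r} for l in left_rows for r in index.get(l[left_key], [])]


def nested_join(small_rows, small_key, fetch_big, big_key):
    """Read the small side first and push its keys to the big side as a filter."""
    keys = {row[small_key] for row in small_rows}
    fetched = fetch_big(keys)  # simulates WHERE <fk> IN (keys) at the source
    return hash_join(fetched, small_rows, big_key, small_key), len(fetched)


# Hypothetical data: pre-aggregated sales and the ACME products.
sales_source = [
    {"product_fk": 1, "total": 10.0},
    {"product_fk": 2, "total": 5.0},
    {"product_fk": 3, "total": 99.0},  # not an ACME product
]
products = [
    {"product_id": 1, "product_name": "Anvil"},
    {"product_id": 2, "product_name": "Rocket"},
]


def fetch_sales(product_ids):
    return [row for row in sales_source if row["product_fk"] in product_ids]


hash_result = hash_join(sales_source, products, "product_fk", "product_id")
nested_result, rows_transferred = nested_join(
    products, "product_id", fetch_sales, "product_fk"
)
print(len(sales_source), "rows fetched by the hash join,", rows_transferred, "by the nested join")
```

Both strategies return the same rows; which one is cheaper depends on how selective the small branch is and on the cost of the extra round trip, which is the kind of trade-off a cost-based optimizer weighs.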
  • 16. How Dynamic Query Optimizer Works: Summary
      • Automatic JOIN reordering groups branches that go to the same source to maximize query delegation and reduce processing in the DV layer
          • End users don’t need to worry about the optimal “pairing” of the tables
      • The Partial Aggregation push-down optimization is key in those scenarios. Based on PK-FK restrictions, it pushes the aggregation (for the PKs) to the DW
          • Leverages the processing power of the DW, optimized for these aggregations
          • Significantly reduces the data transferred through the network (from 1 billion to 10 thousand rows)
      • The Cost-based Optimizer picks the right JOIN strategies based on estimations on data volumes, existence of indexes, transfer rates, etc.
          • Denodo estimates costs in a different way for parallel databases (Vertica, Netezza, Teradata) than for regular databases to take into consideration the different way those systems operate (distributed data, parallel processing, different aggregation techniques, etc.)
  • 17. How Dynamic Query Optimizer Works: Other relevant optimization techniques for LDW and Big Data
      • Pruning of unnecessary JOIN branches (based on 1-to-many associations) when the attributes of the 1-side are not projected
          • Relevant for vertical partitioning and “fat” semantic models when queries do not need attributes for all the tables
          • Unnecessary tables are removed from the query (even for single-source models)
      • Pruning of UNION branches based on incompatible filters
          • Enables detection of unnecessary UNION branches in horizontal partitioning scenarios
      • Automatic data movement
          • Creation of temp tables in one of the systems to enable complete delegation of a federated branch
          • The target source needs to have the “data movement” option enabled for this option to be taken into account
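Pruning UNION branches on incompatible filters can be illustrated with a small Python sketch. It assumes a UNION view whose branches each cover a known, inclusive range of years; a query filter on the year is compared with each range, and branches that cannot contribute rows are dropped before any source is contacted. The `Branch` class, the source names, and the ranges are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Branch:
    source: str
    min_year: int  # inclusive
    max_year: int  # inclusive


def prune_union_branches(branches: List[Branch], year_from: int, year_to: int) -> List[Branch]:
    """Keep only branches whose year range overlaps the query filter."""
    return [
        b for b in branches
        if b.max_year >= year_from and b.min_year <= year_to
    ]


# Hypothetical partitioned view: older data archived, recent data in the warehouse.
union_view = [
    Branch("archive", 2000, 2009),
    Branch("warehouse", 2010, 2019),
    Branch("live", 2020, 2024),
]

# Query filter: WHERE year BETWEEN 2021 AND 2022
needed = prune_union_branches(union_view, 2021, 2022)
print([b.source for b in needed])
```

With this filter only one of the three sources has to be queried; a filter spanning 2008 to 2011 would keep the first two branches instead.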
  • 19. Caching: Real time vs. caching
      Sometimes, real-time access & federation are not a good fit:
          • Sources are slow (e.g. text files, cloud apps like Salesforce.com)
          • A lot of data processing is needed (e.g. complex combinations, transformations, matching, cleansing, etc.)
          • Access is limited, or the impact on the sources has to be mitigated
      For these scenarios, Denodo can replicate just the relevant data in the cache
  • 20. Caching: Overview
      • Denodo’s cache system is based on an external relational database
          • Traditional (Oracle, SQL Server, DB2, MySQL, etc.)
          • MPP (Teradata, Netezza, Vertica, Redshift, etc.)
          • In-memory storage (Oracle TimesTen, SAP HANA)
      • Works at view level
          • Allows hybrid access (real-time / cached) of an execution tree
      • Cache Control (population / maintenance)
          • Manually: user initiated at any time
          • Time based: using the TTL or the Denodo Scheduler
          • Event based: e.g. using JMS messages triggered in the DB
  • 21. Caching: Caching options
      Denodo offers two different types of cache:
      • Partial
          • Query-by-query cache
          • Useful for caching only the most commonly requested data
          • More adequate to represent the capabilities of non-relational sources, like web services or APIs with input parameters
      • Full
          • Similar to the concept of materialized view
          • Incrementally updateable at row level to avoid unnecessary full refresh loads
          • Offers full push-down capabilities to the source, including group by and join operations
  • 22. Hybrid Performance for SaaS sources: Incremental Queries (Available July 2016)
      Merge cached data and fresh data to provide fully up-to-date results with minimum latency
      Diagram: “Get Leads changed / added since 1:00 AM” (source) + “CACHE: Leads updated at 1:00 AM” = “Up-to-date Leads data”
      1. Salesforce ‘Leads’ data cached in VDP at 1:00 AM
      2. Query needing Leads data arrives at 11:00 AM
      3. Only new/changed leads are retrieved through the WAN
      4. Response is up-to-date but query is much faster
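The four steps above amount to merging a cached snapshot with a small delta fetched from the source. A minimal Python sketch of that merge follows; it assumes each row has a unique `id` and that the source can return the rows modified after a given time. The field names, the hour-based timestamps, and `fetch_changed_since` are illustrative assumptions, and deletions at the source are not handled.

```python
def incremental_query(cached_rows, fetch_changed_since, cached_at, key="id"):
    """Overlay rows changed since the cache load on top of the cached rows."""
    merged = {row[key]: row for row in cached_rows}
    delta = fetch_changed_since(cached_at)  # the only rows crossing the WAN
    for row in delta:
        merged[row[key]] = row  # new rows are added, changed rows replaced
    return sorted(merged.values(), key=lambda row: row[key]), len(delta)


# Hypothetical Leads data; "modified" is the hour of the last change.
source_leads = [
    {"id": 1, "status": "open", "modified": 0},
    {"id": 2, "status": "won", "modified": 9},    # changed after the cache load
    {"id": 3, "status": "open", "modified": 10},  # added after the cache load
]
cache = [  # snapshot taken at 1:00 AM
    {"id": 1, "status": "open", "modified": 0},
    {"id": 2, "status": "open", "modified": 0},
]


def fetch_changed_since(hour):
    return [row for row in source_leads if row["modified"] > hour]


leads, rows_transferred = incremental_query(cache, fetch_changed_since, cached_at=1)
print(rows_transferred, "of", len(source_leads), "rows fetched from the source")
```

The result matches what a full read of the source would return, while only the changed and added rows are transferred.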
  • 24. Resource Management
      • Advanced Memory Management
          • Dynamic data buffers to control source federation with different data retrieval speeds, which guarantees a low memory footprint
          • All operations are memory-constrained to prevent monopolization of resources by a single query. The constraints are adjustable.
          • Swapping data to disk to handle large data sets so as not to overload the memory
          • On-the-fly modification of execution plans to prevent exceeding memory thresholds
      • Server Throttling Mechanisms
          • Control settings to limit concurrency (max. queries, max. threads…)
          • Waiting queues for inbound connections
          • Connection pools for data sources
  • 25. Resource Management: Enterprise Resource Manager
      • Apply resource restrictions based on a set of rules
      • Rules classify sessions into groups (e.g. by user, role, application, source IP…)
          • E.g. sessions from application ‘single customer view’ are assigned to a group called ‘high priority transactional’
      • Apply restrictions for each group
          • Change priority, change concurrency settings, change max timeouts, etc.
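The rule-based classification described above can be sketched as an ordered list of predicates that map a session to a group, plus a table of restrictions per group. The Python below is a hypothetical illustration only: the `Session` and `Restrictions` fields, the second rule, the default group, and every numeric limit are assumptions made for the example, not Denodo configuration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Session:
    user: str
    role: str
    application: str
    source_ip: str


@dataclass(frozen=True)
class Restrictions:
    priority: str
    max_concurrent_queries: int
    timeout_seconds: int


# Ordered rules: the first predicate that matches decides the group.
RULES: List[Tuple[Callable[[Session], bool], str]] = [
    (lambda s: s.application == "single customer view", "high priority transactional"),
    (lambda s: s.role == "analyst", "reporting"),
]

# Restrictions per group (illustrative values).
PLANS = {
    "high priority transactional": Restrictions("high", 50, 30),
    "reporting": Restrictions("low", 5, 600),
    "default": Restrictions("normal", 10, 120),
}


def classify(session: Session) -> str:
    for predicate, group in RULES:
        if predicate(session):
            return group
    return "default"


def restrictions_for(session: Session) -> Restrictions:
    return PLANS[classify(session)]


scv = Session("svc_scv", "service", "single customer view", "10.0.0.5")
print(classify(scv), restrictions_for(scv))
```

Keeping classification separate from the restrictions means a group's limits can be changed without touching the rules that assign sessions to it.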
  • 27. Further Reading
      Check also the following articles written by our CTO Alberto Pan in our blog:
      • Myths in data virtualization performance
          • http://www.datavirtualizationblog.com/myths-in-data-virtualization-performance/
      • Performance of Data Virtualization in Logical Data Warehouse scenarios
          • http://www.datavirtualizationblog.com/performance-data-virtualization-logical-data-warehouse-scenarios/
      • Physical vs Logical Data Warehouse: the numbers
          • http://www.datavirtualizationblog.com/physical-logical-data-warehouse-performance-numbers/
  • 28. Thanks!
      www.denodo.com
      info@denodo.com
      © Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.