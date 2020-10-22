
Data Virtualization for Data Architects (New Zealand)

Watch full webinar here: https://bit.ly/3ogCJKC

Success or failure in the digital age will be determined by how effectively organisations manage their data. The speed, diversity and volume of data present today can overwhelm older data architectures, leaving business leaders lacking the insight and operational agility needed to respond to market opportunity or competitive challenges.

With the pace of today’s business, modernisation of a data architecture must be seamless and, ideally, built on existing capabilities. This webinar explores how data virtualization can help an existing data architecture evolve seamlessly, without business disruption.

You will discover:
- How to modernise your data architecture without disturbing the existing analytical workload
- How to extend your data architecture to exploit existing and new sources of data more quickly
- How to enable your data architecture to present more low-latency data

  1. Data Virtualization for Data Architects, 21 October 2020. Allen Keyte, Director, Mero; Chris Day, Director Sales Engineering, Denodo
  2. Data Virtualization for Data Architects. This Webinar, agenda: Mero & Denodo; Extending your Data Architecture; Questions; Next Steps
  3. Data Virtualization for Data Architects. New Zealand partnership: leader in data virtualization; combine disparate sources; consume with a data catalog; data engineering & analytics consulting; over 100 active clients; modern data platforms
  4. Denodo Data Virtualization
  5. Gartner: The Rise of Logical Architectures. This is a Second Major Cycle of Analytical Consolidation (©2018 Gartner, Inc., ID: 342254). [Figure: timeline of analytical architectures, from operational applications, cubes, IoT data and other new data through to the logical data warehouse.]
     - 1980s, Pre EDW: fragmented/nonexistent analysis; multiple sources; multiple structured sources
     - 1990s, EDW: unified analysis; consolidated data; "Collect the data"; single server, multiple nodes; more analysis than any one server can provide
     - 2000s, Post EDW: fragmented analysis; "Collect the data" (into different repositories); new data types, processing, requirements; uncoordinated views
     - 2010s, LDW: unified analysis; logically consolidated view of all data; "Connect and collect"; multiple servers, of multiple nodes; more analysis than any one system can provide
     - Logical Data Warehouse components: Data Warehouse, Data Lake, Marts, ODS, Staging/Ingest
  6. Gartner: The Rise of Logical Architectures (same figure as slide 5), with Data Virtualization highlighted:
     - Improved time to market by 50 to 90%
     - Improved report consistency
     - Reduced duplication of data
     - Improved transparency
     - Reduced development cost
     - Future-proofs the architecture against technology changes
  7. Data Virtualization as a Data Access Layer
     - 1 Connect: disparate data sources
     - 2 Combine: data virtualization layer (Execution Engine & Optimizer, Metadata Repository, Relational Cache, MPP Processing, Corporate Security, Monitoring & Auditing)
     - 3 Consume: data consumers via SQL Queries (JDBC, ODBC, ADO.NET), Web Services (SOAP, REST, OData), Web-based catalog & search, Secure delivery (SSL/TLS)
  8. Data Virtualization in Action
     - 1 Connect: disparate data sources (labelled Less Structured and Operational)
     - 2 Combine: Base/Raw views; Standardized views (Customer, Product, Order); Business views (Finance, Operations, Sales)
     - 3 Consume: data consumers via SQL Queries (JDBC, ODBC, ADO.NET), Web Services (SOAP, REST, OData), Web-based catalog & search, Secure delivery (SSL/TLS)
     - Each layer of views provides more refined single views of truth
  9. Platform Demonstration
  10. Demo Scenario: How effective are our marketing campaigns?
     - Sales: historical sales data offloaded to Hadoop cluster for cheaper storage
     - Campaign: marketing campaigns managed in an external cloud app (SaaS solution)
     - Customer: country is part of the customer details table, stored in the DW
     - Flow: Sources; Base View (Source Abstraction); Combine, Transform & Integrate (join, join, group by state); Consume
  11. Personas: Denodo Developer; Business User & BI Analyst; Data Scientist; Application-to-Application; Administration & Operations
  12. Unified Web Administration: Central Web Portal. Entry point for all users to all Denodo environments. SSO to all tools with Kerberos, SAML or OAuth
  13. Key Takeaways. Data Virtualization:
     - Enables data re-use, reducing costs & increasing collaboration
     - Unifies disparate data sources in real time
     - Supports self-service & data discovery
     - Centralises governance & security of enterprise data assets
  14. Data Virtualization for Data Architects: Questions
  15. Data Virtualization for Data Architects: Next Steps
     - Webinar series continues: Wed Nov 11 | Data Virtualization for Business Consumption Workshop | Hands-on virtual workshops - greg.laws@mero.co.nz | +64 21 875 875
     - Test Drive | Try it out on mero.co.nz/denodo/
  16. What is the optimizer doing? Customer and Sales are in different sources. What is the best execution plan?
     SELECT c.state, AVG(s.amount) FROM customer c JOIN sales s ON c.id = s.customer_id GROUP BY c.state
     - Option 1, Naïve Strategy: Sales (300 M) and Customer (2 M) are both transferred, then join and group by. ‘Cost’ ~302 M
     - Option 2, Temporary Data Movement: create temp table (Temp_Customer) beside Sales, then join and group by. Transfers of 2 M and 50 M. ‘Cost’ ~52 M
     - Option 3, Partial Aggregation Pushdown: Sales is grouped by ID at the source, joined with Customer, then grouped by state. Transfers of 2 M and 2 M. ‘Cost’ ~4 M
  17. Why is this so important? How Denodo works compared with other federation engines
     SELECT c.state, AVG(s.amount) FROM customer c JOIN sales s ON c.id = s.customer_id GROUP BY c.state
     - Denodo: execution time 9 sec.; data transferred 4 M; optimization technique: aggregation push-down
     - Others: execution time 125 sec.; data transferred 302 M; optimization technique: none (full scan)
     - Without push-down, Sales (300 M) and Customer (2 M) are transferred in full before the join and group by. With push-down, Sales is grouped by ID at the source (2 M), joined with Customer (2 M), then grouped by state
     - To maximize push-down to the EDW, the aggregation is split into 2 steps: 1st by customer ID, 2nd by state. This significantly reduces network traffic and processing in Denodo
  18. Denodo Performance Strategies
     - Post-processing and Federation in the DV engine
     - Delegation: process as much as possible in the data sources
     - Temporary Tables: automatically move data to the biggest data source to optimize the execution
     - Summaries: based on the query, the Denodo optimizer can use a “summary” for accelerating the execution
     - MPP Integration: move processing to an external MPP system on the fly
     - Caching: persist data beforehand in a relational database
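
The layering on slide 8 (base/raw views, standardized views, business views) can be illustrated with ordinary SQL views. The following is a minimal sketch using Python's built-in sqlite3 module rather than Denodo itself. Every table, view and column name here is hypothetical, and Denodo's own view-definition syntax is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for a physical source table (hypothetical names and data).
CREATE TABLE crm_cust_raw (CUST_ID INTEGER, CUST_NM TEXT, ST_CD TEXT);
INSERT INTO crm_cust_raw VALUES (1, '  alice ', 'ca'), (2, 'Bob', 'NY');

-- Base/raw view: mirrors the source as-is.
CREATE VIEW bv_crm_customer AS SELECT * FROM crm_cust_raw;

-- Standardized view: consistent column names and cleaned values.
CREATE VIEW customer AS
SELECT CUST_ID AS id, TRIM(CUST_NM) AS name, UPPER(ST_CD) AS state
FROM bv_crm_customer;

-- Business view: shaped for one group of consumers.
CREATE VIEW customers_by_state AS
SELECT state, COUNT(*) AS customers FROM customer GROUP BY state;
""")

rows = conn.execute(
    "SELECT state, customers FROM customers_by_state ORDER BY state"
).fetchall()
names = conn.execute("SELECT name FROM customer ORDER BY id").fetchall()
```

Consumers query only the top layer, so a change in the source table is absorbed in the base or standardized view without touching the business view.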

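The three candidate plans on slide 16 can be compared with a toy cost model. This sketch assumes that the slide's 'cost' is simply the sum of the data volumes shown for each plan, which matches the printed totals. It is not a description of how the Denodo optimizer actually estimates cost, and the plan names used as dictionary keys are illustrative.

```python
# Volumes as printed on slide 16, in millions (M).
# Assumption: 'cost' is the sum of the volumes each plan moves.
PLANS = {
    "naive": [300, 2],
    "temporary_data_movement": [2, 50],
    "partial_aggregation_pushdown": [2, 2],
}


def plan_cost(transfers):
    """Return the total volume moved by a plan, in millions."""
    return sum(transfers)


costs = {name: plan_cost(transfers) for name, transfers in PLANS.items()}
best = min(costs, key=costs.get)  # plan with the smallest total volume
```

Under this assumption the partial aggregation pushdown plan moves the least data, in line with the ~302 M, ~52 M and ~4 M figures on the slide.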

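The two-step aggregation described on slide 17 (first by customer ID, then by state) can be checked on a small data set. The sketch below uses Python's built-in sqlite3 module with made-up rows. A single database stands in for the two separate sources, so it only demonstrates that the rewritten query returns the same result, not the network savings. Note that an average cannot be re-averaged safely, so the first step carries a sum and a count per customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE sales (customer_id INTEGER, amount REAL);
-- Illustrative rows only.
INSERT INTO customer VALUES (1, 'CA'), (2, 'CA'), (3, 'NY');
INSERT INTO sales VALUES (1, 10), (1, 20), (2, 60), (3, 5), (3, 15);
""")

# Single-step plan: join every sales row, then aggregate.
direct = conn.execute("""
    SELECT c.state, AVG(s.amount)
    FROM customer c JOIN sales s ON c.id = s.customer_id
    GROUP BY c.state ORDER BY c.state
""").fetchall()

# Two-step plan: step 1 reduces sales to one row per customer
# (this is the part that would be pushed down to the sales source);
# step 2 joins that smaller result with customer and groups by state.
two_step = conn.execute("""
    WITH partial AS (
        SELECT customer_id, SUM(amount) AS total, COUNT(amount) AS n
        FROM sales GROUP BY customer_id
    )
    SELECT c.state, SUM(p.total) / SUM(p.n)
    FROM customer c JOIN partial p ON c.id = p.customer_id
    GROUP BY c.state ORDER BY c.state
""").fetchall()
```

Both queries return the same averages per state, while the second one needs only one row per customer from the sales side instead of one row per sale.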