Myth Busters II: BI Tools and Data Virtualization are Interchangeable

WEBINAR SERIES
BI Tools and Data Virtualization are Interchangeable

Paul Moxon
SVP Data Architectures & Chief Evangelist
Denodo
17th June 2020


Agenda
1. Today’s Myth
2. Origins of the Myth
3. Just the Facts Ma’am
4. The Proof is in the Pudding
5. Conclusions
6. Q&A
7. Next Steps

Myth #2:
BI Tools and Data Virtualization
are Interchangeable

Welcome to my Universe
• BusinessObjects added the Universe as a semantic layer to its BI tool
• Special tools to design business-oriented data objects
• Hide technical nature of physical data storage
• Initially used Data Federator to access multiple data sources
• Multi-source Universe capability subsumed the Data Federator tool
• Made BusinessObjects the leading BI tool vendor
• Increased usability and appeal to ‘citizen analysts’

Follow the Leader
• Other vendors followed this approach
  • MicroStrategy, Cognos, etc.
• New entrants initially focused on visualization and analysis of data
  • Tableau, Qlik, Power BI
• Quickly added ‘data blending’ capabilities
  • Support multiple data source integration
  • With limitations

Data Blending Everywhere
• Most reporting tools now offer capabilities to create reports with data coming from multiple data sources
• Some in real time, with their own federation engines (e.g. Tableau, MicroStrategy, Business Objects)
• Some based on replication in the reporting tool engine (Qlik, SiSense, ThoughtSpot, etc.)
• Some of them also provide data modeling capabilities (Looker, Business Objects, MicroStrategy, Power BI, etc.)
So if I can have multi-source queries and define a logical model in my reporting tool, why would I need Data Virtualization?
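The two blending styles listed above (federation at query time versus replication into the tool's own engine) differ mainly in when the source is read. The following is a minimal Python sketch of that difference, not any vendor's implementation; `orders_source`, `ReplicatedExtract` and `FederatedQuery` are hypothetical names invented for illustration.

```python
import copy

# Hypothetical stand-in for a live operational data source.
orders_source = [{"order_id": 1, "amount": 120.0}]


class ReplicatedExtract:
    """Replication style: rows are copied into the tool when the extract loads."""

    def __init__(self, source):
        self._rows = copy.deepcopy(source)  # snapshot taken once, at load time

    def total(self):
        return sum(row["amount"] for row in self._rows)


class FederatedQuery:
    """Federation style: the source is read each time a question is asked."""

    def __init__(self, source):
        self._source = source  # keeps a reference, not a copy

    def total(self):
        return sum(row["amount"] for row in self._source)


extract = ReplicatedExtract(orders_source)
live = FederatedQuery(orders_source)

# A new order arrives in the source after the extract was loaded.
orders_source.append({"order_id": 2, "amount": 80.0})
```

In this sketch `extract.total()` still reflects the snapshot, while `live.total()` includes the new order. Either style stays inside one tool, which is the point the question above raises.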

“Data virtualization can be used to create virtualized and integrated views of data in-memory rather than executing data movement and physically storing integrated views in a target data structure. It provides a layer of abstraction above the physical implementation of data, to simplify query logic.”
– Gartner Market Guide for Data Virtualization, November 16, 2018
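As a rough illustration of an integrated view that is computed at query time rather than stored, here is a minimal Python sketch using the standard `sqlite3` module. It is an assumption-laden stand-in: the `crm` and `billing` databases, their tables, and the `customer_revenue` view are hypothetical, and two attached SQLite databases only approximate genuinely separate systems.

```python
import sqlite3


def build_virtual_view() -> sqlite3.Connection:
    """Attach two independent in-memory databases and define one view over both."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sources: each ATTACH ':memory:' creates a separate database.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )
    # A TEMP view may reference tables in any attached database. It stores
    # only the query text; no integrated copy of the rows is written.
    conn.execute(
        """
        CREATE TEMP VIEW customer_revenue AS
        SELECT c.name AS customer, SUM(i.amount) AS revenue
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.name
        """
    )
    return conn


def revenue_by_customer(conn: sqlite3.Connection) -> dict:
    """Consumers query the view and never see the two underlying databases."""
    return dict(conn.execute("SELECT customer, revenue FROM customer_revenue"))
```

Because the view holds no data of its own, a row added to `billing.invoices` shows up in the next call to `revenue_by_customer` without any reload step.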

What is Data Virtualization?
1. Connect to disparate data sources
2. Combine related data into views
3. Consume in business applications
[Diagram: a Connect / Combine / Consume layer between data consumers (Enterprise Applications, Reporting, BI, Portals, ESB, Mobile, Web, Users) and disparate data sources (Databases & Warehouses, Cloud/SaaS Applications, Big Data, NoSQL, Web, XML, Excel, PDF, Word...), spanning analytical and operational, more structured and less structured data]
• Connect: multiple protocols and formats; SQL, MDX, Web Services, Big Data APIs, Web Automation and Indexing
• Combine: normalized views of disparate data; discover, transform, prepare, improve quality, integrate
• Consume: query, search, browse; request/reply, event driven; secure delivery; share, deliver, publish, govern, collaborate
“Data virtualization integrates disparate data sources in real time or near-real time to meet demands for analytics and transactional data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015

What is Data Virtualization?
1. Single Access Point to all Data at any location
2. Semantic Layer – Expose Data in Business-Friendly form, adapted to the needs of each consumer
3. Abstract changes in the underlying infrastructure
4. Single entry point to apply security and governance policies
5. Avoid data replication: Up to 80% reduction in integration costs, in terms of resources and technology
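Points 1 to 4 above can be pictured with a toy Python sketch: one access point, business-friendly column names, a source that can be swapped without consumers noticing, and a single place to apply a column-hiding policy. This is only an illustration under stated assumptions; `VirtualLayer`, `legacy_customers`, `cloud_customers` and the role names are hypothetical and do not represent Denodo's API.

```python
from typing import Callable, Dict, Iterable, List, Set

Row = Dict[str, object]


class VirtualLayer:
    """Toy single access point: named views, swappable sources, one policy store."""

    def __init__(self) -> None:
        self._views: Dict[str, Callable[[], Iterable[Row]]] = {}
        self._hidden: Dict[str, Set[str]] = {}

    def bind(self, view: str, fetch: Callable[[], Iterable[Row]]) -> None:
        """Point a view name at whichever source currently serves it."""
        self._views[view] = fetch

    def hide_columns(self, role: str, columns: Set[str]) -> None:
        """Record, in one place, which columns a role may not see."""
        self._hidden[role] = set(columns)

    def query(self, view: str, role: str) -> List[Row]:
        hidden = self._hidden.get(role, set())
        return [
            {key: value for key, value in row.items() if key not in hidden}
            for row in self._views[view]()
        ]


def legacy_customers() -> List[Row]:
    # Hypothetical on-premises source with technical column names.
    raw = [{"CUST_NM": "Acme", "TAX_ID": "X-1"}]
    return [{"customer_name": r["CUST_NM"], "tax_id": r["TAX_ID"]} for r in raw]


def cloud_customers() -> List[Row]:
    # Hypothetical replacement source after a migration.
    raw = [{"name": "Acme", "taxId": "X-1"}]
    return [{"customer_name": r["name"], "tax_id": r["taxId"]} for r in raw]


layer = VirtualLayer()
layer.hide_columns("analyst", {"tax_id"})

layer.bind("customer", legacy_customers)
before = layer.query("customer", role="analyst")

layer.bind("customer", cloud_customers)  # infrastructure change
after = layer.query("customer", role="analyst")
```

In this sketch `before` and `after` are identical, so the consumer's query survives the migration, and the `tax_id` restriction is defined once rather than in every consuming tool.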

(Almost) Any-to-Many Connectivity
Relational Databases
• MS SQL Server (JDBC, ODBC): 2000, 2005, 2008, 2008R2, 2012, 2014, 2016
• Oracle (JDBC): 8i, 9i, 10g, 11g, 12c, 18
• Oracle E-Business Suite (JDBC): 12
• IBM DB2 (JDBC): 8, 9, 10, 11, 12 for LUW; 9,10 for z/OS
• Informix (JDBC): 7, 12
• Sybase Adaptive Server Enterprise (JDBC): 12, 15
• MySQL (JDBC): 4, 5
• PostgreSQL (JDBC): 8, 9
• Denodo Platform (JDBC): 5.5, 6.0, 7.0
- For multi-location architecture deployments
• MS Access (ODBC)
• Apache Derby (JDBC): 10
• Generic (JDBC)
In-Memory Databases
• SAP HANA (JDBC): 1
• Oracle TimesTen (JDBC): 11g
• Oracle 12c In-Memory
Parallel databases and appliances
• GreenPlum (JDBC): 4.2
• HP Vertica (JDBC): 7, 8
• Oracle Exadata (JDBC): X5-2
• ParAccel 8.0.2 (using ParAccel 2.5.0.0 JDBC3g/SSL driver)
• Netezza (JDBC): 4.6, 5.0, 6.0, 7.0
• SybaseIQ (JDBC) 12.x, 15.x
• Teradata (JDBC): 12, 13, 14, 15
Multi-Dimensional Sources
• SAP BW (BAPI/XMLA): 3.x
• SAP BI 7.x (BAPI): 7.x
• Mondrian (XMLA): 3.x
• MS SQL Server Analysis Services 200x
• Essbase (XMLA): 9, 11
Cloud Data Warehouse
• Amazon Redshift (JDBC)
• Amazon Athena (JDBC)
• Amazon Aurora (JDBC)
• Snowflake (JDBC)
• Amazon DynamoDB
• Azure SQL Data Warehouse
• Azure CosmosDB (SQL API and MongoDB API)
Big Data/NoSQL
• Apache Hive (JDBC): 0.12, 1.1.0, 1.1.0 for Cloudera, 1.2.1 for Hortonworks 2.0.0
• MapR-XD, MapR-DB, MapR-ES, Hive, and Drill for MapR 6.1
• Impala (JDBC): 2.3
• Spark SQL (JDBC): 1.5, 1.6
• Google BigQuery (JDBC)
• Presto (JDBC)
Web Automation
• Denodo’s ITPilot automates extraction from web pages
Indexes and unstructured content
• CMS, file systems, PDF, Word, text, email servers, knowledge bases, indexes
• Elasticsearch
Web Services
• SOAP
• REST (XML, RSS, ATOM, JSON)
• OData v2 and v4
Packaged Applications
• SAP ERP/ECC (BAPIs and RFC tables)
• Oracle E-Business Suite 12
• Siebel
• SAS (SAS JDBC Driver): 7 and higher
Semantic Repositories
• Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints
Flat and Binary Files
• CSV, pipe-delimited, Regular expression-parsed
• MS Excel xls 97-2003
• MS Excel xlsx 2007 or later
• MS Access
• XML
• JSON
All files can be locally accessible or in remote filesystems, through FTP/SFTP/FTPS, and in clear, zipped and/or encrypted format.
Active Directory as source or leveraging security
• LDAP v3
• Microsoft Active Directory 2003, 2008
Cloud, SaaS, Web Sources with Simplified OAuth
Security
• Amazon
• Google
• Facebook
• LinkedIn
• MS Azure Data Lake
• MS SharePoint (by using the OData connector)
• MS Dynamics
• ServiceNow
• Marketo
• Salesforce
• Twitter via APIs with simplified OAuth integration (1.0, 1.0a and 2.0)
• Workday
Message Queues as data source and delivery
• MQSeries
• SonicMQ
• ActiveMQ
• Tibco EMS
Denodo SDK for Custom Connectors
• CouchDB
• Lotus Domino
• MongoDB and Mongo Atlas DBaaS
Mainframe
• IMS
• IBM IMS native drivers: 8, 9
• IMS Universal Drivers: 11
Hierarchical databases
• Adabas (SOA Gateway and Denodo’s SOAP connector): 5, 6
Legacy
• Microsoft FoxPro (ODBC)
The following data sources have been successfully tested with Denodo using JDBC and ODBC drivers, WS/SOAP and WS/REST, and DenodoConnect adapters (not an exhaustive list):
• Apache Solr
• Kafka Messages
• SAS Files
• Hadoop HBase
• Hadoop HCatalog
• Hadoop HDFS (Avro, CSV, Parquet)
• Files in Amazon S3 (incl. Parquet files)
• IBM BigInsights
• Pivotal HAWQ

(Almost) Any-to-Many Connectivity
Many Consumers
Protocols and Formats
• SQL Based access via JDBC, ODBC and ADO.NET
• Web Services
• SOAP (XML/JSON)
• REST (JSON/XML)
• OData
• Open API (a.k.a Swagger)
• Web Parts (for SharePoint), Portlets
• Kafka and JMS listeners for message queues
• Denodo Scheduler for batch process and ‘ETL lite’
Security Options
• Authentication using LDAP or Active Directory
• Kerberos for Single Sign-On (SSO)
• OAuth, OAuth 2.0 (JWT)
• SAML
• SSL/TLS
• WS-Security, X.509 certificates
BI/Reporting tools
• MicroStrategy, Cognos, Business Objects, Oracle OBIEE
• Tableau, QlikView, Spotfire, Microsoft Power BI
• Excel
Analytical Tools/Languages
• SAS, Statistica, SPSS, MatLab
• R, Python, Java, Scala, etc.
• Azure ML Studio, Amazon Machine Learning
Portals
• SharePoint, Enterprise portals, Web/mobile apps
Enterprise Service Bus
• Oracle Service Bus, Azure Service Bus, TIBCO Active Matrix Bus
ETL tools
• SAP Data Services, Informatica PowerCenter, IBM DataStage, Talend ETL
API Management tools
• CA (Layer 7), TIBCO Mashery, Apigee

Data Blending – Semantic Silos

Data Blending Silos
Q: Is SAP planning to release SAP Universe connections for Power BI and Tableau?
A: The answer is no. No. There are no plans for this.
– Gregory Botticchio, Director of Product Management, SAP BusinessObjects, Suite 360 webinar for SAP BusinessObjects 4.3 Release Preview
[Chart: Beside SAP BusinessObjects, are you using other analytics solution(s)?]

Data Blending Limitations
[Diagram: Shared Dataset (Import Mode) and Shared Dataset (Direct Mode)]
Direct mode is limited to 1 data source and 1 million rows

“There are two flows; the ad-hoc and the operational…where we are coming from is…I just want to integrate these two sources. It's not formalized, per se, it's not a project. I just want to connect this and this and I want to analyze it. How do we go from data to analysis as quickly as possible? And when you want to formalize it, operationalize it, make it repeatable, then [you use other tools].”
– Francois Ajenstat, Chief Product Officer, Tableau Software

Denodo’s Coronavirus Data Portal
[Diagram components: File; Denodo Express COVID-19 Edition; Data Catalog; Data Portal; access via JDBC, ODBC, API, GraphQL and GeoJSON; three Sandboxes]

Connected Data Sources
Australian Bureau of Statistics Labor Force Survey
ACAPS
Air Quality Open Data Platform
Allen Institute for AI
ArcGIS Hub
Becker Friedman Institute for Research in Economics, University of Chicago
California Health and Human Services (CHHS)
Carnegie Mellon University
Centraal Bureau voor de Statistiek (CBS), Netherlands
COVID19-India (covid19india.org)
Data Science for Social Impact Research Group (DSFSI), University of Pretoria
Dipartimento della Protezione Civile, Italy
Europa Press
European Centre for Disease Prevention and Control (ECDC)
Federal Ministry of Social Affairs, Health, Care and Consumer Protection (BMSGPK), Austria
France GEOJSON
French Government Open Data (data.gouv.fr)
GlobalHealth 50/50
Google - COVID-19 Community Mobility Reports
Hong Kong Department of Health
Humanitarian Data Exchange
Institute for Health Metrics and Evaluation (IHME)
Instituto de Salud Carlos III
International Monetary Fund (IMF)
Istituto Nazionale di Statistica, Italy
Johns Hopkins University (JHU) Center for Systems Science and Engineering (CSSE)
Junta de Castilla y León
Kaiser Family Foundation (KFF)
Ministerio de Sanidad, Spain
Ministry of Health of New Zealand
Ministry of Health, Brazil
Ministry of Health, Consumer Affairs and Social Welfare, Spain
Ministry of Health, Labor and Welfare, Japan
National Institutes of Health (NIH) - National Library of Medicine (NLM)
Netherlands National Institute for Public Health and the Environment (RIVM)
New York City Department of Health and Mental Hygiene (DOHMH)
Office for National Statistics, UK
Organisation for Economic Co-operation and Development (OECD)
Our World in Data
Public Health England
Robert Koch Institute (RKI)
RSS News Feeds
San Francisco Department of Public Health (SFDPH)
Servicio Público de Empleo Estatal (SEPE), Spain
Statista.com
Statistics Austria
Statistics Canada
Statistics Norway
Statistics Sweden
Taiwan Centers for Disease Control
Texas Department of State Health Services
Thailand Department for Disease Control
The COVID Tracking Project
The Economist
The Government of the Hong Kong Special Administrative Region - Census and Statistics Department
The New York Times
The World Bank
United Kingdom Government Open Data (gov.uk)
United Nations Educational, Scientific and Cultural Organization (UNESCO)
United Nations Population Division, Department of Economic and Social Affairs
US Department of Labor
Wharton School of Business, University of Pennsylvania
World Health Organization (WHO)

So, Let’s Have a Look…
https://coronavirusdataportal.com

Comparing Apples to Oranges
• Data Virtualization and ‘Data Blending’ serve two different purposes
• Data Blending is focused on a single vendor’s toolset
  • It makes it easier for ‘citizen analysts’ to use a specific BI Tool
  • It provides a semantic layer for that specific toolset
  • It has limitations on real-time use
• Data Virtualization provides an enterprise-wide data fabric layer
  • Supports many different consuming tools
  • Creates a general purpose semantic layer for all users
  • Can mix data delivery modes without limitations
• Use the right tool for the right task

Myth #2:
BI Tools and Data Virtualization
are Interchangeable.

Thanks!
www.denodo.com info@denodo.com
© Copyright Denodo Technologies. All rights reserved
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
