Exploiting Fast And Operational Data For Competitive
Advantage Using Data Virtualisation
Mike Ferguson
Managing Director
Intelligent Business Strategies
Denodo Datafest,
London, October 2017
Copyright © Intelligent Business Strategies 1992-2017
Topics
▪ Digital transformation
▪ The impact of digital transformation on data
• Operational challenges in a digital world
• Analytical challenges in a digital world
▪ Enabling exploitation of fast data in a digital enterprise using
data virtualisation
What Is Digitalization?
Digitalization is the process of moving to a digital
business by making use of digital technologies to
change ways of operating and to create new insights
that provide new revenue and value-producing
opportunities
What is Digital Transformation?
- A Programme Of Transition To A Digital Enterprise
At the very least this transition includes:
1. Digital transformation of operational systems and processes
2. Digital transformation of analytical systems
3. Rapid closing of the loop between analytics and operations
4. Digitisation of content
New Digital Channels Are Becoming The Focal Point For
Customer Interactions In The Digital Front-Office
▪ Front-Office Operations
• Sales force automation apps
• Customer facing bricks & mortar apps
• Customer service apps
▪ Digital channels are generating big data
• E-commerce application
• M-commerce mobile apps
• Social commerce applications
▪ Goal: improve customer engagement
▪ Customer interaction – the old way: customer → employee → UI → computer system → data
▪ Customer interaction – the new way: customer → app → computer system → data
• Need access to low latency operational data
• Need customer intelligent computer systems
Companies Also Need To Respond More Rapidly In A Market
Where Power Has Firmly Shifted To The Customer
Also customers are much
more informed before they
buy and can churn on a
single click.
This means loyalty is cheap
The “Device Generation”
Prospects and customers are now
interacting with applications and not
people and so there is very little time to
engage them
Challenge – Processes Now Span Cloud & On-Premises Making
Transaction Data Hard To Access To Manage Operations
[Diagram: Order-to-Cash Process. Steps shown: order, credit check, fulfil, ship, invoice, payment, package, schedule. Systems behind the steps: order entry system, credit control system, production planning & scheduling, CAM system, inventory system, distribution system, billing, general ledger, plus a customer app. Data: orders data, customer data, product data.]
▪ Typical operational questions
• What orders changed in the last 10 mins?
• What shipments are impacted by the changes e.g. lack of inventory or shipping capacity?
• Which customers are affected?
▪ Business impact
• Operational reporting is not timely
• Inability to respond quickly to problems
• Problems not seen until long after they happen e.g. incorrect shipments
• Operational oversights cause processing errors & unplanned operational cost
• Inability to see across multiple instances of a system can cause errors & duplication of effort
Challenge - Many Companies Have Organised Business Units,
Processes And Systems Around Products & Services
[Diagram: XYZ Corp. enterprise. Customers/prospects reach product/service lines 1, 2 and 3 through channels/outlets. Each line runs its own order, credit check, fulfil, ship, invoice, payment and package process and holds its own orders: Order (product line 1), Order (product line 2), Order (product line 3).]
New Data - Much Of The New Data Captured Is Fast Data That
Can Provide New Customer and Operational Insights
▪ Machine data
• Clickstream data, e-commerce logs
• IVR logs, App Server logs, DBMS logs
▪ Connected things (Sensor data, IoT)
• Product usage behaviour data, product performance data
• Location, temperature, light, vibration, liquid flow, pressure, RFIDs
▪ Self-service transactions
▪ Semi-structured data e.g., JSON, BSON, XML
▪ Social networks data (often unstructured e.g. Text)
▪ Open government data
Fast
data
Challenges – New Data Is Being Ingested Into Multiple Types Of
Data Store Making It Harder To Access And Analyse
[Diagram: data ingest from sources such as Data.Gov into multiple types of data store: enterprise cloud storage, MDM (prod, cust and asset master data with CRUD access), NoSQL DBMS and DW]
Challenge – Improving Profitability And Agility Is Proving Difficult
When Captured Data Is Becoming More Fractured
▪ Data in different locations
▪ Data in different data storage technologies
▪ Different APIs and query languages needed
to access data
▪ Data in different data structures
▪ Different data definitions for the same data in
different data stores
▪ Some data too big to move
▪ Excessive use of ETL to copy data
• Expensive and not agile
▪ Synchronization nightmare
“Where is all the Customer Data?”
[Diagram: customer data scattered across XML text, digital media, RDBMSs, web content, e-mail, flat files, packaged applications, office documents, legacy applications, DW/BI systems, Big Data applications, cloud based applications and ECMS]
Accessing, governing and managing
data is becoming increasingly complex
as it becomes more distributed
Business Implications Of Product Orientation and Fractured
Customer Data In A World Where Customer Is Now King
▪ Different marketing campaigns from different divisions aimed at the same customer
▪ Different sales teams from different divisions selling to the same customer
▪ Customer service is hard
• e.g. “What is my order status for all products ordered?”
▪ Cost of operating is much higher due to duplicate processes across product lines
▪ Can’t see customer / product ownership
▪ Can’t see customer risk and customer profitability
▪ Hard to access and take advantage of new digital data about customers when it is
captured in yet another data store
▪ Higher chance of poor data quality
▪ Difficult to maintain customer data fractured across multiple applications
Digitalisation - The Requirement Now Is To Capture, Integrate And
Analyse More Data For Deeper Customer Insights AND Do It Quickly
▪ OMNI channel analysis – analyse all customer interactions across all channels
▪ Customer “DNA” combines
• Identity data
• Behavioural data (on-line, location, product usage)
• Social data
• Transactional activity
▪ Needs to be integrated in near real-time to maximise competitive advantage
Enabling Exploitation of Fast Data in a Digital
Enterprise using Data Virtualisation
Data Virtualization Makes It Easy To Access And Report on Data
Across Processes To Manage Business Operations
Data virtualization and Virtual Data Services across the Order-to-Cash Process
Benefits
▪ Simplified access
▪ Access to real-time data across the process
▪ Agile and responsive
▪ Avoid unplanned operational costs
▪ See across multiple instances of apps
▪ See across on-premises & cloud apps
[Diagram: a data virtualization layer spanning the order-to-cash steps (order, credit check, fulfil, ship, invoice, payment, package, schedule) and the customer app, annotated with cost and agility]
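The idea of a virtual view across the process can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular data virtualization product works: the two in-memory SQLite databases stand in for an order entry system and a distribution system, and the table names, columns and the `impacted_shipments` function are all hypothetical. It answers the earlier questions "what orders changed in the last 10 mins?" and "which shipments and customers are affected?" at query time, without copying data into a staging store.

```python
import sqlite3

def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one operational system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn

# Hypothetical order entry system (could be a cloud application).
orders_db = make_source(
    "CREATE TABLE orders (order_id INTEGER, customer TEXT, changed_mins_ago INTEGER)",
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 3), (2, "Bolt", 45), (3, "Cray", 8)],
)

# Hypothetical distribution system (could be on-premises).
shipping_db = make_source(
    "CREATE TABLE shipments (shipment_id TEXT, order_id INTEGER, status TEXT)",
    "INSERT INTO shipments VALUES (?, ?, ?)",
    [("S-10", 1, "scheduled"), ("S-11", 2, "shipped"), ("S-12", 3, "scheduled")],
)

def impacted_shipments(window_mins):
    """Virtual view: join recent order changes to their shipments at query time."""
    # The time filter runs in the order source, so only recent changes are fetched.
    changed = orders_db.execute(
        "SELECT order_id, customer FROM orders "
        "WHERE changed_mins_ago <= ? ORDER BY order_id",
        (window_mins,),
    ).fetchall()
    result = []
    for order_id, customer in changed:
        # Look up matching shipments in the second system; nothing is staged or copied.
        for shipment_id, status in shipping_db.execute(
            "SELECT shipment_id, status FROM shipments WHERE order_id = ?",
            (order_id,),
        ):
            result.append((order_id, customer, shipment_id, status))
    return result

view = impacted_shipments(10)
print(view)
```

A real data virtualization server would expose such a view through SQL or a data service and choose the join strategy itself; the sketch only shows that the consumer asks one question and the sources are queried in place.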
Data Virtualisation - See Views Of Orders, Shipments And
Payments Across All Lines Of Business
[Diagram: XYZ Corp. enterprise. Customers/prospects reach product/service lines 1, 2 and 3 through channels/outlets, each line with its own order, credit check, fulfil, ship, invoice, payment and package process and its own orders. Data virtualization layers sit across the three lines to give combined views.]
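A view across lines of business can be pictured as a union over per-line order stores. The following Python sketch assumes three hypothetical product-line databases with an identical, made-up `orders` table; in practice each line would likely use different schemas that the virtual layer has to map. It answers the customer service question "what is my order status for all products ordered?" with a single call.

```python
import sqlite3

def line_store(rows):
    """In-memory SQLite database standing in for one product line's order system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id TEXT, customer TEXT, status TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn

# Hypothetical order data held separately by each product/service line.
LINES = {
    "line 1": line_store([("A1", "Acme", "shipped"), ("A2", "Bolt", "invoiced")]),
    "line 2": line_store([("B1", "Acme", "credit check")]),
    "line 3": line_store([("C1", "Cray", "packaged"), ("C2", "Acme", "paid")]),
}

def order_status(customer):
    """Virtual union view: one customer's orders across every product line."""
    rows = []
    for line, conn in LINES.items():
        # The customer filter is evaluated inside each line's own system.
        for order_id, status in conn.execute(
            "SELECT order_id, status FROM orders WHERE customer = ? ORDER BY order_id",
            (customer,),
        ):
            rows.append((line, order_id, status))
    return rows

acme = order_status("Acme")
print(acme)
```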
Performance - Need Parallelism In Data Virtualisation to Speed Up
Data Access And Integration Across Hybrid Operational Processes
[Diagram: a BI tool or application sends SQL or REST requests to the data virtualisation server. A DV master with a cost based optimizer works with in-memory DV slaves, and query processing is pushed down in parallel to the systems behind the order-to-cash steps (order, credit check, fulfil, ship, invoice, payment, package, schedule) and the customer app.]
▪ In memory caching PLUS in-memory parallel processing of aggregations
▪ DV needs parallel pushdown and MPP in-memory processing of cached and aggregate data
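Parallel pushdown can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the `SOURCES` dictionary stands in for three hypothetical instances of an order system, `pushdown` imitates an aggregation executed inside one source, and a thread pool plays the role of the parallel workers. None of these names come from a real product.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical instances of an order system, each holding (customer, amount) rows.
SOURCES = {
    "orders_eu": [("Acme", 100), ("Bolt", 40), ("Acme", 60)],
    "orders_us": [("Bolt", 10), ("Cray", 25)],
    "orders_cloud": [("Acme", 5), ("Cray", 75)],
}

def pushdown(source_name):
    """Stand-in for a query pushed to one source: aggregate there, return few rows."""
    partial = Counter()
    for customer, amount in SOURCES[source_name]:
        partial[customer] += amount
    return partial

def federated_totals():
    """Master step: run the pushdowns in parallel, then merge the partial results."""
    totals = Counter()
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        for partial in pool.map(pushdown, SOURCES):
            totals.update(partial)  # Counter.update adds the amounts per customer
    return dict(totals)

totals = federated_totals()
print(totals)
```

The point of the design is that each source returns one row per customer instead of every order row, and the slowest source, not the sum of all sources, bounds the response time.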
Data Virtualisation - Integrated Customer insight
Data Virtualisation Can Integrate Customer Insight AND Make It
Available As Services To Integrate Into All Front Office Channels
▪ Data sources integrated by data virtualisation
• EDW, DW & marts
• NoSQL DB e.g. graph DB
• DW appliance and mart: advanced analytics (structured data)
• Advanced analytics
• Streaming data: RT analytics
• Master data (prod, cust, asset) with CRUD access
▪ Customer insight made available
• Customer sentiment, interactions, online behaviour & new data
• Customer relationships, social network influencers
• Customer real-time location, product usage & on-line behaviour
• Customer master data
• Customer purchase activity & transaction history
• Customer predictive analytical model development
▪ Front-Office Operations consuming the insight
• Sales force automation apps
• Customer facing bricks & mortar apps e.g. in-store apps, in-branch apps
• Customer service apps
▪ Digital channels (generating big data) consuming the insight
• E-commerce application
• M-commerce mobile apps
• Social commerce applications
▪ Goal: improve customer engagement
Performance - Parallel Processing In Data Virtualisation Speeds
Up Integration Of Customer Insights From Analytical Systems
DV = data virtualisation
[Diagram: a BI tool or application sends SQL or REST requests to the data virtualisation server. A DV master with a cost based optimizer works with in-memory DV slaves, and query processing is pushed down to the data sources, with parallel processing in the source. Data sources: EDW, DW & marts; NoSQL DB e.g. graph DB; DW appliance and mart with advanced analytics (structured data); advanced analytics; streaming data with RT analytics; master data (prod, cust, asset) with CRUD access.]
▪ In memory caching PLUS in-memory parallel processing of aggregations
▪ DV needs parallel pushdown and MPP in-memory processing of cached and aggregate data
Product Example – Denodo 7 In-Memory MPP Query Processing
With Query Pushdown Optimisation
Denodo 7: In-memory fabric + Rules engine (aggregation pushdown) + Cost based optimizer
Example: Obtain Total Sales By Customer Country in the Last Two Years
[Diagram: execution tree. A group by customer ID is pushed down to Current Sales (100 million rows) and to Historical Sales (1 billion rows), each returning 2M rows (sales by customer this year, sales by customer previous year). The two results are combined with a union, joined with the cached Customer table (2M rows), then aggregated with a final group by, labelled "Group by year" on the slide.]
▪ Partial aggregation push down
• Already available in Denodo 6
• Maximizes source processing
• Reduces network traffic
▪ On-demand Parquet generation
• Generation of Parquet file in the cluster, in streaming mode
▪ Integration with pre-cached data
• Cached data already stored in the cluster in a Parquet file
▪ Fast parallel execution
• Support for Spark, Presto and Impala
• For fast analytical processing in inexpensive Hadoop-based solutions
▪ Integrated with Cost Based Optimizer
• Based on data volume estimation and the cost of these particular operations, the CBO can decide to move all or part of the execution tree to the MPP
Query Acceleration: In-memory + Rules engine (aggregation pushdown) + Cost based optimizer
• Optimizer can decide to move data on the fly to the fabric during query execution for any part of the execution pipeline
• Uses pushdown to minimize network traffic with in-memory, parallel processing
• Partitioned data caching and MPP of post pushdown query processing operations
• Can combine both and leverage big data technologies (Spark, Presto, Impala, etc.) for high performance access to fast data volumes in Big Data platforms
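The partial aggregation pushdown in this example can be worked through on toy data. The Python sketch below is an illustration only and does not represent Denodo's implementation: the two in-memory SQLite databases stand in for the current and historical sales sources, the `customer` dictionary stands in for the cached customer table, and all names and figures are hypothetical. The final grouping is by country, following the stated query, which is an assumption about the slide's last step.

```python
import sqlite3

def source(rows):
    """In-memory SQLite database standing in for one sales source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer_id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn

# Hypothetical data: 9 raw sales rows spread over two sources.
current = source([(1, 10), (1, 20), (2, 5), (3, 40)])
historical = source([(1, 30), (2, 15), (2, 25), (3, 10), (3, 5)])
customer = {1: "UK", 2: "UK", 3: "FR"}  # small cached dimension: id -> country

# The partial aggregation that is pushed down to each sales source.
PUSHED = "SELECT customer_id, SUM(amount) FROM sales GROUP BY customer_id"

def total_sales_by_country():
    """Union the pushed-down partials, join to customer, then group by country."""
    transferred = 0
    by_country = {}
    for src in (current, historical):  # union of the two partial results
        for customer_id, subtotal in src.execute(PUSHED):
            transferred += 1  # one row per customer leaves the source
            country = customer[customer_id]  # join with the cached dimension
            by_country[country] = by_country.get(country, 0) + subtotal
    return by_country, transferred

result, rows_moved = total_sales_by_country()
print(result, rows_moved)
```

Here 6 pre-aggregated rows cross the network instead of 9 raw rows. On the slide's volumes the same step reduces 1.1 billion sales rows to at most 4 million rows before the join.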
Benefits Of Parallel Processing In The Data Virtualisation Server
▪ Rapid integration of operational data across hybrid processes
▪ Rapid integration of insights across big data, fast data and data warehouse
data stores
▪ Smart customer facing applications able to access in-memory information
services that integrate data in parallel
▪ High performance operational and analytical processing in a modern digital
enterprise through parallel processing of
• In-memory aggregate data retrieved from sources
• Cached data in the data virtualisation server
• Data in some sources after pushdown
Thank You!
www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
@mikeferguson1
(+44)1625 520700
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an
independent analyst and consultant he specializes in business intelligence, analytics, data
management and big data. With over 35 years of IT experience, Mike has consulted for dozens
of companies, spoken at events all over the world and written numerous articles. Formerly he
was a principal and co-founder of Codd and Date Europe Limited – the inventors of the
Relational Model, a Chief Architect at Teradata on the Teradata DBMS and European Managing
Director of DataBase Associates.

Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery
