QDI for Managed Data Lakes
Continuously Updated & Analytics-Ready Data
• Real-Time Data For Faster, Better Insights
- Change Data Capture
- Universal Sources, Targets, & Platforms
- Enterprise-Wide Monitoring & Control
• Automated & Continuous Refinement
- Pipeline Automation & Orchestration
- Resilient & Self-Healing
- Prepare & Provision at Scale
• Trusted, Enterprise-Ready Data
- Smart, Integrated Data Catalog
- Security and Governance
- IT & Business Collaboration
Summary message
The Qlik Data Integration (QDI) Platform for Managed Data Lakes enables enterprises to quickly gain a return on their data lake investments by providing continuously updated, accurate, and trusted data sets for business analytics.
The enterprise-class solution automates the entire data lake pipeline, from real-time data ingestion to refinement, provisioning, and governance, driving agility in data and analytics processes. The Qlik solution continually ingests data from virtually any source (legacy systems, warehouses, enterprise applications, and more) into the data lake of choice; automates schema creation, data refinement, and provisioning while persisting history for data lineage and trust; and comes complete with metadata management and a secure, self-service data catalog, so business users can easily find, understand, and use enterprise data for timely business insights.
• Real-Time Data For Faster, Better Insights
- The solution's industry-leading Change Data Capture capability enables real-time data ingestion, accelerating data movement to your data lakes from virtually any source: databases, data warehouses, and enterprise systems such as SAP and mainframe. A fully automated interface eliminates manual coding and lets you configure, control, and monitor all streaming data pipelines across the enterprise.
• Automated & Continuous Refinement
- The Qlik solution standardizes and combines change streams into a single transformation-ready data store in the lake; automatically merges multi-table and/or multi-sourced data into a flexible format and structure; and creates operational and historical data stores to enable provisioning of enriched data subsets to a target. The solution retains full change history so that you can rewind and identify or remediate bugs when needed, ensuring data quality and trust.
• Trusted, Enterprise-Ready Data
- The Qlik solution persists the entire change history of sources, targets, and replication and transformation processes for end-to-end data lineage. The solution also builds a secure, enterprise-scale catalog of all the data, not just in your lake but across all your sources, giving business users a single, trusted data marketplace where they can find, understand, and use any enterprise data on a self-service basis, without relying on IT support.
Qlik for Managed Data Lakes
Real-Time Data For Faster, Better Insights
Headline Why do people care? What is it? What makes ours better?
Change Data Capture
Why do people care? Data Engineers are tasked with moving and integrating data from a variety of core transactional systems. The traditional approach of overnight batch processing does not meet the business need for more real-time data, and it is not an efficient way to deliver data to the cloud.
What is it? Qlik provides fully automated real-time change data capture (CDC) to identify and move just the changes to data sets and metadata as they occur.
What makes ours better?
• The agentless, log-based approach to CDC ensures minimal impact on the performance of production systems.
• By capturing incremental changes, the solution ensures data and data structure updates are applied at near-zero latency.
• The solution offers the flexibility to automatically apply these changes to any desired delivery target, whether transactional, data lake/warehouse, or stream-optimized.
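To make "move just the changes" concrete, here is a minimal Python sketch of applying a stream of captured changes to a target. It is an illustration only, not Qlik's implementation or API: the `ChangeEvent` record, the `apply_changes` function, and the in-memory dictionary standing in for a delivery target are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record, as a log-based reader might emit it."""
    op: str                               # "insert", "update", or "delete"
    key: Any                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # changed column values, if any


def apply_changes(target: Dict[Any, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> int:
    """Apply events in order to an in-memory target; return the count applied."""
    applied = 0
    for ev in events:
        if ev.op == "insert":
            target[ev.key] = dict(ev.row or {})
        elif ev.op == "update":
            # Only the changed columns travel with the event.
            target.setdefault(ev.key, {}).update(ev.row or {})
        elif ev.op == "delete":
            target.pop(ev.key, None)
        else:
            raise ValueError(f"unknown operation: {ev.op}")
        applied += 1
    return applied


if __name__ == "__main__":
    customers = {1: {"name": "Ada", "city": "London"}}
    changes = [
        ChangeEvent("update", 1, {"city": "Paris"}),
        ChangeEvent("insert", 2, {"name": "Grace", "city": "New York"}),
    ]
    apply_changes(customers, changes)
    print(customers)
```

The point of the sketch is that the target is brought up to date from three small events rather than a full reload of the source table, which is the efficiency argument made against overnight batch processing.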
Universal Sources, Targets, and Platforms
Why do people care? Users want to be able to use data from a wider variety of sources, formats, and locations (on-prem, global, and multi-cloud), while Enterprise Architects are looking for a single data integration infrastructure that not only supports today’s infrastructure but can also easily adapt to new technologies and platforms.
What is it? Qlik Data Integration supports connectivity to over 30 different data sources, 40 different targets, and all of the major technology stacks of the public cloud vendors, enabling real-time data ingestion into all popular data lakes from the widest variety of heterogeneous systems.
What makes ours better?
• We support the largest set of data sources and targets, each of which is optimized for its native APIs and formats.
• Automatic data cleansing, validation, and profiling ensure data quality and document the exact content and structure of each source. Built-in data analysis and conversion simplify onboarding data, including complex, legacy, and dirty data.
• No other integration offering matches the breadth and depth of cloud platform coverage, including object stores, databases, data warehouses, data lakes, and streaming.
Enterprise-wide Monitoring & Control
Why do people care? IT organizations often ingest data from hundreds or even thousands of sources to multiple targets simultaneously. Setting up all these tasks accurately, monitoring and managing them, and then debugging errors, broken links, incomplete tasks, etc. during execution is tedious, resource intensive, and overall an administrative nightmare.
What is it? Qlik Data Integration comes with a fully automated interface to design, execute, and monitor thousands of data replication tasks through a single console.
What makes ours better?
• A fully automated, wizard-driven interface enables data engineers and database analysts to create data endpoints and to design, execute, and monitor thousands of data replication tasks through a single console.
• User-defined alerts and KPIs enable exception monitoring, substantially reducing administrative overhead.
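The idea of exception monitoring with user-defined alerts can be sketched in a few lines of Python. This is a hedged illustration, not the product's console or API: `TaskStatus`, `find_exceptions`, the rule labels, and the 60-second latency threshold are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class TaskStatus:
    """Hypothetical snapshot of one replication task."""
    name: str
    state: str              # e.g. "running", "stopped", "error"
    latency_seconds: float  # lag between source change and target apply


# A user-defined alert is a label plus a predicate over a task snapshot.
AlertRule = Callable[[TaskStatus], bool]


def find_exceptions(tasks: Iterable[TaskStatus],
                    rules: Dict[str, AlertRule]) -> Dict[str, List[str]]:
    """Return only the tasks that trip at least one rule, with the rule labels."""
    alerts: Dict[str, List[str]] = {}
    for task in tasks:
        fired = [label for label, rule in rules.items() if rule(task)]
        if fired:
            alerts[task.name] = fired
    return alerts


if __name__ == "__main__":
    rules = {
        "task failed": lambda t: t.state == "error",
        "latency above 60s": lambda t: t.latency_seconds > 60,
    }
    tasks = [
        TaskStatus("orders_to_lake", "running", 2.5),
        TaskStatus("sap_to_lake", "running", 95.0),
        TaskStatus("crm_to_lake", "error", 0.0),
    ]
    print(find_exceptions(tasks, rules))
```

With thousands of tasks, an operator reviewing only the returned dictionary looks at the handful of tasks that need attention instead of every status row, which is where the reduction in administrative overhead comes from.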
Automated & Continuous Refinement
Headline Why do people care? What is it? What makes ours better?
Pipeline Automation & Orchestration
Why do people care? Enterprises are moving to cloud data lakes because they provide greater agility and elasticity; however, they continue to be challenged to efficiently create analytics-ready data sets from heterogeneous data sources. Such integration can be a manually intensive and complex endeavor that is challenging to assemble and often results in data that is outdated by the time it is ready for business consumption.
What is it? Qlik Data Integration accelerates data pipelines with efficient changed-data transfer at scale and automation of data transformations. It fully automates data pipelines, from the generation of source system data streams right through to the creation of analytics-ready data sets. The solution provides an orchestration layer to abstract all of the underlying complexity and automate the integration processes at each step: landing, staging, refining, and provisioning.
What makes ours better?
• Qlik offers the only solution that combines real-time CDC with complete automation from raw to analytics-ready data lakes.
• A unified orchestration and management console allows the design, execution, and monitoring of integration tasks across large and growing business landscapes, both on premises and multi-cloud.
• Data Engineers can now rapidly add new data sources and create purpose-built data sets to meet evolving business needs without any coding.
This allows for improved agility, productivity, and governance.
Resilient & Self-Healing
Why do people care? System overloads, poorly set-up ingestion tasks, application breakdowns, and other issues can lead to corrupt or inconsistent data, multiple schemas, and other data quality and integrity issues between source and target systems.
What is it? Qlik Data Integration captures changes to data sets and metadata to ensure schema changes are automatically propagated, keeping source and target schemas in sync. The solution retains full change history for trusted data and lineage.
What makes ours better?
• The solution provides resiliency to source schema drift by capturing and transferring changed metadata.
• The solution persists the entire change history of sources, targets, and replication and transformation processes, so that you can rewind and identify or remediate bugs when needed, supporting data quality and end-to-end data lineage.
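The two Resilient & Self-Healing points above (propagating changed metadata and rewinding through retained history) can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not how the product stores history: `HistorizedTable`, its methods, and the notion of a "version" as a count of history entries are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


class HistorizedTable:
    """Hypothetical target table that keeps an append-only change history.

    Both data changes and schema changes are recorded, so any earlier
    state can be rebuilt by replaying a prefix of the history.
    """

    def __init__(self, columns: List[str]) -> None:
        self.initial_columns = list(columns)
        self.history: List[Tuple[str, Any]] = []

    def add_column(self, name: str, default: Any = None) -> None:
        # Schema drift at the source arrives as a metadata change event.
        self.history.append(("add_column", (name, default)))

    def upsert(self, key: Any, values: Dict[str, Any]) -> None:
        self.history.append(("upsert", (key, dict(values))))

    def state_at(self, version: Optional[int] = None):
        """Replay the first `version` history entries (all entries when None)."""
        entries = self.history if version is None else self.history[:version]
        columns = list(self.initial_columns)
        defaults: Dict[str, Any] = {}
        rows: Dict[Any, Dict[str, Any]] = {}
        for kind, payload in entries:
            if kind == "add_column":
                name, default = payload
                columns.append(name)
                defaults[name] = default
                for row in rows.values():  # backfill existing rows
                    row[name] = default
            else:
                key, values = payload
                row = rows.setdefault(
                    key, {c: defaults.get(c) for c in columns})
                row.update(values)
        return columns, rows


if __name__ == "__main__":
    table = HistorizedTable(["id", "name"])
    table.upsert(1, {"id": 1, "name": "Ada"})
    table.add_column("email")
    table.upsert(2, {"id": 2, "name": "Grace", "email": "g@example.com"})
    print(table.state_at())   # current state, including the new column
    print(table.state_at(1))  # rewound to before the schema change
```

Because nothing is overwritten, rewinding is just a replay of fewer entries; that is the property the bullets rely on when they describe identifying or remediating bugs after the fact.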
Prepare & Provision At Scale
Why do people care? Data is stored in multiple heterogeneous systems and formats and is difficult to combine. Ideally, IT will operationalize the core sets of data that the business needs without pushing all of the data preparation tasks to the users.
What is it? Qlik delivers data preparation where it’s needed: both to IT users, who can quickly build scalable data pipelines, and to business users who have an ad hoc requirement for blending non-managed data.
What makes ours better?
• Qlik meets the needs of both IT and business users. IT needs the power to quickly perform complex data federation, modeling, and transformations that can be executed at scale. Business users need a subset of capabilities with a simple yet powerful user interface. Every new data set that is created is also registered in our catalog for immediate use by others.
Trusted, Enterprise-Ready Data
Headline Why do people care? What is it? What makes ours better?
Smart, Integrated Data Catalog
Why do people care? To get a return on their data lake investments, organizations need to ensure that IT is able to manage large data collections simply and efficiently, and that data consumers are able to find, understand, and access all data in their collection.
What is it? An integrated catalog of technical, operational, and business metadata that organizes, documents, and describes all data in the data collection.
What makes ours better?
• The solution automatically profiles data and generates rich metadata for all data, not only in the data lake but across all enterprise sources, so users can leverage all data.
• Metadata and data are directly linked by the catalog (living pipeline/connection) so that metadata is always complete and accurate. This enables the most up-to-date central data marketplace, where users can easily find, understand, access, and use data.
Security & Governance
Why do people care? Corporate security and governance teams require enterprise data management platforms that ensure the security and governance of all data made available to the business.
What is it? The solution provides enterprise-scale data access controls and data obfuscation capabilities to ensure data is protected and secure. It also integrates with other open-source and commercial security and governance products for data protection across platforms.
What makes ours better?
• The solution preserves data through the “raw to ready” preparation process. Lineage tracks the journey of each dataset, allowing users to understand its origin and evolution, and thereby its trustworthiness.
• The platform approach of the solution means all security and governance measures are consistently applied end to end. This removes potential failure points, tightens security, and reduces risk. Data protection features are easy to administer even in large settings with many users, data sources, or complex infrastructure.
IT & Business Collaboration
Why do people care? To gain and share insights quickly and efficiently, data consumers often need to reuse datasets previously created by IT, and at other times to create new data preparation jobs and add their own insights to the data.
What is it? The solution enables IT and business users to collaborate. IT can create datasets and data prep jobs for reuse or repurposing by business users. It also allows business users to share knowledge by adding business names, definitions, or tags to specific data assets.
What makes ours better?
• By being centralized rather than departmental, the Qlik Data Integration platform allows collaboration through reuse of data assets between user groups across the organization.
• The solution promotes collaboration by supporting the needs of multiple personas, with data quality and data profiling capabilities for data engineers, centralized security and governance enforcement for data stewards in IT, and a central data marketplace that enables data consumers to find, understand, access, and self-provision data.

QDI for Managed Data Lakes Messaging.pptx
