In this session you will learn how Qlik's Data Integration platform (formerly Attunity) reduces time to market and time to insights for modern data architectures through real-time automated pipelines for data warehouse and data lake initiatives. Hear how pipeline automation has impacted large financial services organizations' ability to rapidly deliver value, and see how to build an automated near real-time pipeline to efficiently load and transform data into a Snowflake data warehouse on AWS in under 10 minutes.
1. Accelerate and Modernize Your Data Pipelines with Qlik Data Integration
February 26, 2020
Tim Garrod
Analytics Data Architect
2. © 2019 QlikTech International AB. All rights reserved.
[Diagram: sources (DB, MF, EDW, FILES) feeding three targets: IaaS/PaaS microservices, DWaaS, and an IaaS/PaaS/SaaS data lake, for data consumption & analytics]
Trends Driving Integration Modernization & Automation
CLOUD APPLICATION DEVELOPMENT
Legacy application modernization
Faster, easier new application development
Higher scalability/elasticity
Infrastructure and maintenance cost savings
Requires real-time data from on-premise systems
DATA WAREHOUSE MODERNIZATION
Reduce the costs associated with legacy EDWs and provide elasticity
Meet new business requirements
Support more advanced analytics
Data Warehouse Automation replaces traditional ETL with modern self-service capabilities
Requires real-time data from on-premise systems and cloud platforms
NEXT GENERATION ANALYTICS & DATA MONETIZATION
Analyze a broader set of data structures along with structured data
Faster and improved decision making
Leverage AI/ML, IoT and decision automation for a competitive advantage
Requires Managed Data Lake Creation and Big Data processing at scale
Requires real-time data from on-premise systems and cloud platforms
3. Data Architecture Evolution
Legacy Data Warehouse Architecture
• Bulk data movement
• Brittle hand-coded ETL
• Monolithic / appliance-driven architectures (one size fits all)
• Slow time to market / react to change
Modern Analytics Data Management Architecture
• Real-time data movement
• Automated design and code generation
• Use-case driven scalable technologies
• Faster time to market / value
Platform
Development lifecycle?
4. Qlik Data Integration
Automated Data Pipelines
• Data Movement Automation
- Low impact, real-time data replication
- Heterogeneous sources & targets
• Data Lake Automation
- Automated schema evolution
- Analytics ready data provisioning
• Data Warehouse Automation
- Agile end-to-end DW lifecycle
- ETL code generation
• Operational Data Catalog
- Smart data catalog w/ intelligent tagging
- Shop & Publish
Heterogeneous Low Code Platform
[Architecture diagram]
Enterprise Manager: centralized operational management, control & alerts
Enterprise data sources: mainframe, SaaS, files, RDBMS, NoSQL, SAP
Replicate (heterogeneous data replication): real-time / batch change data capture; stream generation; streaming & RDBMS platforms
Compose for Data Warehouses (data warehouse automation): STAGING > EDW > MARTS
Compose for Data Lakes (data lake automation): RAW > ASSEMBLE > PROVISION
Data Catalyst (data intelligence): CATALOG > PROFILE > ENRICH > SHOP > PUBLISH
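The change data capture idea on this slide can be illustrated with a small sketch. It is not Replicate's implementation or API; the `ChangeEvent` type, its field names, and `apply_changes` are hypothetical names chosen for illustration, and the "target" is just an in-memory dictionary keyed by primary key.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change. Field names are hypothetical, for illustration only."""
    op: str                               # "insert", "update", or "delete"
    key: Any                              # primary-key value of the affected row
    row: Optional[Dict[str, Any]] = None  # new column values (None for deletes)


def apply_changes(
    target: Dict[Any, Dict[str, Any]],
    events: Iterable[ChangeEvent],
) -> Dict[Any, Dict[str, Any]]:
    """Apply change events to `target` in the order given and return it."""
    for ev in events:
        if ev.op == "insert":
            target[ev.key] = dict(ev.row or {})
        elif ev.op == "update":
            # Merge changed columns into the existing row (upsert if missing).
            target.setdefault(ev.key, {}).update(ev.row or {})
        elif ev.op == "delete":
            target.pop(ev.key, None)
        else:
            raise ValueError(f"unknown operation: {ev.op!r}")
    return target


if __name__ == "__main__":
    replica = {1: {"name": "Ada", "balance": 100}}
    changes = [
        ChangeEvent("update", 1, {"balance": 120}),
        ChangeEvent("insert", 2, {"name": "Bob", "balance": 5}),
    ]
    print(apply_changes(replica, changes))
```

Only the changed rows travel to the target, which is what keeps the impact on the source system low compared with repeated bulk extracts.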
9. Vanguard
Stream Processing & Data Lake Hydration in AWS
• Largest provider of mutual funds in the world
• Over $5.3 trillion in AUM
Challenges
• Reduce mainframe MIPS
• Modernize data architecture in AWS to
- enable stream processing
- reduce friction for data lake hydration
Solution
• Qlik Data Integration
• Replicate for delivery to stream-based processing in AWS Kinesis
• Fan-out delivery to S3 for lake hydration
**Qlik Replicate, formerly Attunity Replicate
Results
• Fan-out architecture for streaming
• Modernize "data-lake hydration" and ingestion
AWS re:Invent 2019: Data platform engineering: How Vanguard is migrating data to AWS
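The fan-out pattern named on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not Vanguard's or Qlik's implementation: the sinks are plain Python callables standing in for a stream (such as Kinesis) and object storage (such as S3), and `fan_out` is a hypothetical helper name.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Sink = Callable[[Record], None]


def fan_out(records: Iterable[Record], sinks: List[Sink]) -> int:
    """Deliver every record to every sink; return how many records were read."""
    count = 0
    for record in records:
        for sink in sinks:
            # Hand each sink its own copy so one consumer cannot mutate
            # the record seen by another.
            sink(dict(record))
        count += 1
    return count


if __name__ == "__main__":
    stream_buffer: List[Record] = []  # stand-in for a stream consumer
    lake_buffer: List[Record] = []    # stand-in for data lake storage
    changes = [{"id": 1, "op": "insert"}, {"id": 1, "op": "update"}]
    delivered = fan_out(changes, [stream_buffer.append, lake_buffer.append])
    print(delivered, len(stream_buffer), len(lake_buffer))
```

The source is read once and the same change stream feeds both stream processing and lake hydration, which is the point of fanning out rather than extracting twice.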
11. Deliver Analytics Ready Data Sets
Automated Spark data pipelines
Automated Schema Evolution
Analytics-ready data sets
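As a rough illustration of what automated schema evolution involves, the sketch below compares a target schema with an incoming one and derives the additive changes. It assumes schemas are simple name-to-type mappings and handles only new columns; `evolve_schema` and the generated `ALTER TABLE` text are hypothetical and do not represent how Compose for Data Lakes works internally.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, str]  # column name -> type name


def evolve_schema(
    table: str, target: Schema, incoming: Schema
) -> Tuple[Schema, List[str]]:
    """Return the evolved schema and the statements needed to reach it.

    Only additive changes are handled; a type conflict raises ValueError.
    """
    evolved = dict(target)
    statements: List[str] = []
    for column, col_type in incoming.items():
        if column not in evolved:
            evolved[column] = col_type
            statements.append(
                f"ALTER TABLE {table} ADD COLUMN {column} {col_type}"
            )
        elif evolved[column] != col_type:
            raise ValueError(
                f"type conflict on {column}: {evolved[column]} vs {col_type}"
            )
    return evolved, statements


if __name__ == "__main__":
    current = {"id": "INT", "total": "DECIMAL(10,2)"}
    arriving = {"id": "INT", "total": "DECIMAL(10,2)", "coupon": "VARCHAR(20)"}
    schema, ddl = evolve_schema("orders", current, arriving)
    print(ddl)
```

A production tool would also have to deal with dropped columns, renamed columns, and type widening, which this sketch deliberately leaves out.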
12. Large Multinational Bank
Expand data access via Data Lake
• Large multinational bank
• Over 25 million customers and over $975bn in assets
Challenges
• Enable data management and access to more data in the data lake
• Deliver data that is ready for use
• Extract data from heterogeneous data sources to support batch and event-driven workloads
Solution
• QDI Managed Data Lake Solution
• Replicate for mainframe, Oracle and other heterogeneous data sources
• Compose for Data Lakes for data lake management
Results
• Low impact, low latency capture from heterogeneous source systems
• Enable lambda architecture (streaming via Kafka and batch with Hive)
• Transactional data stitching automated by Compose for Data Lakes
• Reduction in time spent building scripts for data lake stitching allowed resources to focus on curating data through the Silver/Gold/Mastered layers
15. Compose for Data Warehouse Automation
Reduce Time to Market with Automation
• End-to-end DW lifecycle management
• Automated E-LT code generation
• Architected for near real-time processing with Replicate ingestion
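To make "automated E-LT code generation" concrete, here is a minimal sketch of the general technique: deriving a SQL statement from mapping metadata instead of hand-coding it. The function name `generate_elt_sql`, the table names, and the mapping format are assumptions for illustration and are not Compose's model or output.

```python
from typing import Dict


def generate_elt_sql(target: str, source: str, mappings: Dict[str, str]) -> str:
    """Build an INSERT ... SELECT statement from a column mapping.

    `mappings` maps each target column to the source SQL expression that
    fills it. Identifiers are used as given; no quoting or validation is
    performed in this sketch.
    """
    if not mappings:
        raise ValueError("at least one column mapping is required")
    columns = ", ".join(mappings)
    expressions = ", ".join(
        f"{expr} AS {col}" for col, expr in mappings.items()
    )
    return (
        f"INSERT INTO {target} ({columns})\n"
        f"SELECT {expressions}\n"
        f"FROM {source};"
    )


if __name__ == "__main__":
    print(
        generate_elt_sql(
            "edw.customer",
            "staging.customer",
            {"customer_id": "id", "full_name": "UPPER(name)"},
        )
    )
```

Because the statement is generated from metadata, a change to the model regenerates the load code, and the transformation runs inside the warehouse after loading (the "E-LT" ordering).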
16. Top Investment Management Platform Provider
Modernize Analytics Pipelines
• Leading provider of a front- and middle-office investment management platform
• The provider's platform is relied on by investment firms, wealth managers, hedge funds and insurers in more than 30 countries to manage more than US$30 trillion in AUM.
Challenges
• Deliver right-time data and analytics for investment firms
• Build differentiated client value via DW & Analytics as a Service
• Complex data processing requirements for Accounts, Securities, Positions and Transactions
• Platform-agnostic requirement as they evaluate cloud data platforms
Solution
• QDI Data Warehouse Automation solution
• Replicate
• Compose for Data Warehouses
Results
• Delivery of first data warehouse solution within weeks instead of months / years
• Agile solution that provides self-service capabilities to their customers with improved data quality
• Ability to migrate to new cloud environments with no recoding effort
17. Data Catalyst: Increasing Data Intelligence
The Smart Data Catalog
Automated, 1 step process
Complete profiling and validation
Rule-Driven Tagging
Semi-structured Data
Data Relationships
Integrated security
[Diagram: Enterprise Data Sources; Compose for Data Warehouses (STAGING > EDW > MARTS); Compose for Data Lakes (RAW > ASSEMBLE > PROVISION); Analytical Database, Analytical Data Warehouse, Analytical Data Lake]
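The "Rule-Driven Tagging" feature listed on this slide can be pictured with a small sketch in which rules are regular expressions matched against column names. The rule set, tag names, and `tag_columns` function are hypothetical; the slide does not describe how Data Catalyst defines or evaluates its rules.

```python
import re
from typing import Dict, List, Set

# Hypothetical rules: tag name -> pattern matched against column names.
RULES: Dict[str, re.Pattern] = {
    "pii": re.compile(r"(ssn|email|phone|birth)", re.IGNORECASE),
    "financial": re.compile(r"(balance|amount|price)", re.IGNORECASE),
}


def tag_columns(
    columns: List[str], rules: Dict[str, re.Pattern] = RULES
) -> Dict[str, Set[str]]:
    """Return, for each column, the set of tags whose rule matches its name."""
    return {
        column: {tag for tag, pattern in rules.items() if pattern.search(column)}
        for column in columns
    }


if __name__ == "__main__":
    print(tag_columns(["customer_email", "account_balance", "id"]))
```

A real catalog would more likely combine name-based rules like these with profiling of the data values themselves, which is outside the scope of this sketch.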
18. Data Catalyst: Increasing Data Intelligence
The Actionable Data Catalog
Find what you're looking for
Take action
Understand the data assets
20. Trusted by over 2500 customers worldwide
And half the Fortune 100
FIN. SERVICES | MANUF. / INDUS. | GOVERNMENT | HEALTH CARE
TECHNOLOGY / TELECOM | OTHER INDUSTRIES | RETAIL