Data in Motion
Building Stream-Based
Architectures with Qlik
Replicate & Kafka
John Neal
Senior Solution Architect
Qlik Partner Engineering
2
Qlik Data
Integration
Quick Overview
3
Data Warehouse Automation
Streaming Data Pipeline Automation
Design, Manage & Monitor
Modernize and Automate Data Integration
CDC Streaming
Azure
SQL DW
Amazon
Redshift
Managed Data Lake Creation
Generate
Change Data
Streams
Deliver
To Clouds,
Lakes…
Refine &
Merge
For Analytics,
AI/ML, Data
Science…
AI/ML
Analytics
Data
Science
Model
Commit
Conform
Consume
Catalog
Shop, Prepare & Provision
Catalog
Shop, Prepare & Provision
RDBMS
Data Warehouse
Files
Mainframe
SAAS
APPS
SAP
Amazon RDS Azure SQL DB
Google Cloud SQL
4
Streaming Data Pipeline Automation
Design, Manage & Monitor
Our Focus for Today: Qlik Replicate & Kafka
Generate
Change Data
Streams
Deliver
To Clouds,
Lakes…
Refine &
Merge
For Analytics,
AI/ML, Data
Science…
RDBMS
Data Warehouse
Files
Mainframe
SAAS
APPS
SAP
5
TARGET SCHEMA
CREATION
SAP
RDBMS
EDW
FILE
MAINFRAME
HETEROGENEOUS
DATA TYPE MAPPING
BATCH TO CDC
TRANSITION
DDL CHANGE
PROPAGATION
FILTERING
TRANSFORMATIONS
RDBMS
EDW
FILES
STREAMING
DATA LAKE
Log Based
CDC
BATCH
IN-MEMORY
Replicate
Qlik Replicate
Automated Real-time Data Delivery
6
Physics 101
As It Applies to Data
7
An object will not change its
motion unless acted on by an
unbalanced force.
• If it is at rest, it will stay at rest
• If it is in motion, it will remain at the
same velocity
Corollary: Objects with greater mass
have more inertia. It therefore takes
more force to change their motion.
Newton’s First Law of
Motion
Inertia
8
Data in motion tends to stay in motion until it
comes to rest on disk.
Similarly, if data is at rest, it will remain at rest
until an external “force” puts it in motion
again.
— John Neal *
* With apologies to Sir Isaac Newton
9
Writing Data to a Database Introduces Friction
Data in Motion
Friction
How do we get the
data moving
again?
STOP
10
Get Landed Data Moving
Overcoming Storage “Friction”
File I/O (reads)
• Parsing challenges
• No deltas
Database Queries
• Not real-time
• Added database load
Database Triggers
• Added database load
• Doesn’t scale
ETL Tools
• Not real-time
• Added database load
• Getting deltas is hard
Qlik Replicate
• Real-time
• Reads the DB logs
• CDC provides delta processing
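To make the "getting deltas is hard" point concrete, here is a minimal sketch of what a query-based approach has to do when no change log is available: take two full snapshots of a table and compare them key by key. This is not Qlik Replicate code. The table contents and the `diff_snapshots` helper are hypothetical, invented purely for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Snapshot = Dict[str, Row]  # primary key -> row


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Tuple[str, str]]:
    """Derive (operation, key) deltas by comparing two full snapshots.

    Every row of both snapshots has to be read, so the cost grows with the
    size of the table rather than with the number of changes.
    """
    deltas: List[Tuple[str, str]] = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            deltas.append(("INSERT", key))
        elif key not in new:
            deltas.append(("DELETE", key))
        elif old[key] != new[key]:
            deltas.append(("UPDATE", key))
    return deltas


# Hypothetical table contents at two polling times.
before = {
    "a1": {"name": "Ann", "city": "Austin"},
    "b2": {"name": "Bob", "city": "Boston"},
}
after = {
    "a1": {"name": "Ann", "city": "Dallas"},   # changed row
    "c3": {"name": "Cy", "city": "Chicago"},   # new row
}

deltas = diff_snapshots(before, after)
print(deltas)
```

A snapshot comparison like this also misses any change that was made and then overwritten between two polls, which is one reason reading the database log is attractive for delta processing.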
11
Getting Data in
Motion Again
With Qlik Replicate & Kafka
12
“Modern” Applications Leverage Microservices
• Components are “decoupled” and have well-defined interfaces
- Changes are easier to make because they are localized and isolated
- Results in increased reliability
- Allows for a faster release schedule supporting agile approaches
- Increases opportunity to innovate
• Microservices can use “purpose built” storage rather than a central
repository
- Teams are free to choose the most appropriate repository for the problem
at hand … a relational database is not always the answer.
• Data flows between components
Microservices
13
Data Catalog
Microservice-Based Applications
A Bucket of Bricks
Data Warehouse
Automation
Media
Data Streaming
(CDC)
Analytics
Security
Kafka
Streaming
Services
Event Processing
RDBMS
Wide-Column
Store
Spark /
ML
Cloud DW
Hadoop
Key-Value
Store
Graph DB
(NoSQL)
File Storage
Document
Store
(NoSQL)
IoT
Qlik
14
Lambda-Style Architectures
Streaming and batch working together
NoSQL
IoT
Mobile
Apps
Web
Legacy
DB/DW
Incoming Data
Streaming (Speed) Layer
Serving Layer
Batch Layer
Stream Processing
(Spark Streaming,
Storm, Flink, …)
Incremental
Views
All Data Pre-Compute
Views
(Spark, M/R, HQL, …)
Real-time Views
Batch Views
Queries /
ML /
Analytics
Ingest & Store Prepare / Curate Publish Consume Data
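The Lambda flow on this slide can be sketched in a few lines: a batch layer that recomputes a view from all data, a speed layer that updates an incremental view per event, and a serving layer that combines the two at query time. This is only an illustrative sketch. The event values and the function names (`batch_view`, `update_realtime_view`, `serve`) are hypothetical, and a simple count stands in for whatever the real views compute.

```python
from collections import Counter
from typing import Iterable


def batch_view(all_events: Iterable[str]) -> Counter:
    """Batch layer: recompute the view from all data seen up to the last run."""
    return Counter(all_events)


def update_realtime_view(view: Counter, event: str) -> None:
    """Speed layer: fold a single new event into the incremental view."""
    view[event] += 1


def serve(batch: Counter, realtime: Counter, key: str) -> int:
    """Serving layer: answer a query by combining batch and real-time views."""
    return batch[key] + realtime[key]


# Hypothetical events, labelled by the kind of source that produced them.
historical = ["web", "mobile", "web", "iot"]   # covered by the last batch run
recent = ["web", "iot"]                        # arrived after the batch run

batch = batch_view(historical)
realtime = Counter()
for event in recent:
    update_realtime_view(realtime, event)

web_total = serve(batch, realtime, "web")
print(web_total)
```

When the next batch run absorbs the recent events, the real-time view can be reset, which is the usual way the two layers are kept from double counting.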
15
Kappa-Style Architectures
Where everything is a stream
Streaming Data
Streaming Layer
Stream Processing
(Spark Streaming,
Storm, Flink, …)
Real-time Results
Serving Layer
Real-time View
Queries /
ML /
Analytics
Mirror events
to long term
storage
Storage Layer
Raw Data History
Re-compute
events from
storage if
needed
Historical View
Ingest & Store Prepare / Curate Publish Consume Data
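The Kappa idea of mirroring events to long-term storage and re-computing from storage if needed can be sketched as a single event-handling function used both for live processing and for replay. This is a minimal sketch under stated assumptions: the `(account, amount)` event shape, the in-memory `storage` list, and the function names are hypothetical stand-ins for a real stream and a real storage layer.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, int]  # (account, amount): hypothetical event shape


def apply_event(view: Dict[str, int], event: Event) -> None:
    """Stream processing step: fold one event into a running total per account."""
    account, amount = event
    view[account] = view.get(account, 0) + amount


def recompute(history: Iterable[Event]) -> Dict[str, int]:
    """Rebuild a view from scratch by replaying the stored raw events."""
    view: Dict[str, int] = {}
    for event in history:
        apply_event(view, event)
    return view


storage: List[Event] = []          # stand-in for the long-term storage layer
realtime_view: Dict[str, int] = {}

for event in [("acct-1", 50), ("acct-2", 20), ("acct-1", -10)]:
    storage.append(event)              # mirror the event to long-term storage
    apply_event(realtime_view, event)  # update the real-time view

historical_view = recompute(storage)
print(historical_view)
```

Because the same `apply_event` logic drives both paths, a replay reproduces the real-time result, and a changed version of the logic can be applied to the full history without a separate batch code base.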
16
Making Rubber
Meet the Road
Innovate by Keeping Data in Motion
17
Source
Legacy
SAP
Kafka
Streaming Data with Qlik Replicate
Ingest & Store Prepare / Curate Publish Consume Data
And then what?
Qlik
Replicate
CDC
18
Source
Data
Kafka
A Real-World Example
Credit Card Authorization
Ingest & Store Prepare / Curate Publish Consume Data
Qlik
Replicate
Spark
HBase
Hive
Machine
Learning
Decision
Service
Engine
Analytics
Application
Spark
Models
Data Lake
CDC
19
Demo
20
Sample Records – Willie Mays
Load / Reload, UPDATE
{"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry": "USA",
"birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "", "deathState": "",
"deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard", "weight": 170,
"height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09", "retroID":
"maysw101", "bbrefID": "mayswi01"}, "beforeData": null, "headers": {"operation": "REFRESH",
"changeSequence": "", "timestamp": "", "streamPosition": "", "transactionId": "", "changeMask": null,
"columnMask": null, "transactionEventCounter": null, "transactionLastEvent": null}}
{"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry":
"NewCountry", "birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "",
"deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard",
"weight": 170, "height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09",
"retroID": "maysw101", "bbrefID": "mayswi01"}, "beforeData": {"playerID": "mayswi01", "birthYear": 1931,
"birthMonth": 5, "birthDay": 6, "birthCountry": "USA", "birthState": "AL", "birthCity": "Westfield",
"deathYear": "", "deathCountry": "", "deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast":
"Mays", "nameGiven": "Willie Howard", "weight": 170, "height": 70, "bats": "R", "throws": "R", "debut":
"1951-05-25", "finalGame": "1973-09-09", "retroID": "maysw101", "bbrefID": "mayswi01"}, "headers":
{"operation": "UPDATE", "changeSequence": "20200713204536000000000000000110813", "timestamp":
"2020-07-13T20:45:36.000", "streamPosition":
"mysql-bin.000004:415943395:20:415951456:17592712139:mysql-bin.000004:412843032",
"transactionId": "000000000000000000000004189B7BCB", "changeMask": "000010",
"columnMask": "3FFFFF", "transactionEventCounter": 10962, "transactionLastEvent": false}}
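In the UPDATE record, only `birthCountry` differs between `data` and `beforeData`, and `changeMask` is `000010`. The sketch below decodes the mask on the assumption that it is a hexadecimal bitmask whose least significant bit corresponds to the first column of `data`. That mapping is inferred from the sample records on this slide (it also fits `columnMask` `3FFFFF`, which has 22 bits set for 22 columns); it is not taken from product documentation and should be verified before being relied on. The `changed_columns` helper is hypothetical.

```python
import json
from typing import List


def changed_columns(record: dict) -> List[str]:
    """Return the names of the columns flagged in headers.changeMask.

    ASSUMPTION (inferred from the sample records, not from product docs):
    the mask is a hexadecimal integer whose least significant bit maps to
    the first column of "data", the next bit to the second column, and so on.
    """
    mask_hex = record["headers"]["changeMask"]
    if not mask_hex:  # REFRESH records carry a null changeMask
        return []
    mask = int(mask_hex, 16)
    columns = list(record["data"])  # json.loads preserves key order
    return [name for i, name in enumerate(columns) if (mask >> i) & 1]


# Abbreviated copy of the UPDATE record (first six columns only).
update_msg = json.loads("""
{"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5,
          "birthDay": 6, "birthCountry": "NewCountry", "birthState": "AL"},
 "headers": {"operation": "UPDATE", "changeMask": "000010"}}
""")

changed = changed_columns(update_msg)
print(changed)
```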
21
Sample Records – Willie Mays
DELETE, INSERT
{"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry":
"NewCountry", "birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "",
"deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard",
"weight": 170, "height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09",
"retroID": "maysw101", "bbrefID": "mayswi01"}, "beforeData": null, "headers": {"operation": "DELETE",
"changeSequence": "20200713204542000000000000000219813", "timestamp": "2020-07-13T20:45:42.000",
"streamPosition": "mysql-bin.000004:419832331:55:419840520:17598121412:mysql-bin.000004:418252305",
"transactionId": "00000000000000000000000418EE05C4", "changeMask": "000001", "columnMask": "3FFFFF",
"transactionEventCounter": 10962, "transactionLastEvent": false}}
{"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry": "USA",
"birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "", "deathState": "",
"deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard", "weight": 170,
"height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09", "retroID":
"maysw101", "bbrefID": "mayswi01"}, "beforeData": null, "headers": {"operation": "INSERT", "changeSequence":
"20200713204606000000000000000297113", "timestamp": "2020-07-13T20:46:06.000", "streamPosition":
"mysql-bin.000004:422559929:1:422565460:17602420793:mysql-bin.000004:422551686", "transactionId":
"000000000000000000000004192FA039", "changeMask": "3FFFFF", "columnMask": "3FFFFF",
"transactionEventCounter": 62, "transactionLastEvent": false}}
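A consumer downstream of Kafka could use the `headers.operation` field of these records to keep a local copy of the table current. The sketch below is a minimal illustration, not the demo's actual consumer: the messages are abbreviated copies of the four sample records, `playerID` is assumed to be the key, and the `apply_change` helper is hypothetical. Reading from a Kafka topic is left out so the example stays self-contained.

```python
import json
from typing import Dict


def apply_change(table: Dict[str, dict], record: dict) -> None:
    """Apply one change record to an in-memory copy of the table.

    ASSUMPTION: "playerID" uniquely identifies a row.
    """
    operation = record["headers"]["operation"]
    row = record["data"]
    key = row["playerID"]
    if operation in ("REFRESH", "INSERT", "UPDATE"):
        table[key] = row          # upsert the latest image of the row
    elif operation == "DELETE":
        table.pop(key, None)      # ignore deletes for rows never seen
    else:
        raise ValueError(f"unexpected operation: {operation}")


# Abbreviated copies of the sample records, in the order shown on the slides.
messages = [
    '{"data": {"playerID": "mayswi01", "birthCountry": "USA"},'
    ' "headers": {"operation": "REFRESH"}}',
    '{"data": {"playerID": "mayswi01", "birthCountry": "NewCountry"},'
    ' "headers": {"operation": "UPDATE"}}',
    '{"data": {"playerID": "mayswi01", "birthCountry": "NewCountry"},'
    ' "headers": {"operation": "DELETE"}}',
    '{"data": {"playerID": "mayswi01", "birthCountry": "USA"},'
    ' "headers": {"operation": "INSERT"}}',
]

table: Dict[str, dict] = {}
history = []
for raw in messages:
    apply_change(table, json.loads(raw))
    history.append(len(table))    # row count after each message

print(table)
```

Treating REFRESH, INSERT, and UPDATE all as upserts is a design choice that keeps the consumer tolerant of a reload arriving for rows it already holds.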
22
Wrapping Up
23
Summarizing Key Points
Physics applies to data
Qlik Replicate delivers
data from databases to
Kafka in real-time.
“Modern” architectures
want data to be in
motion.
Kafka is a key
component.
Feedback loops can be
a useful way to keep
data moving
https://www.qlik.com/products/data-integration-products
john.neal@qlik.com

More Related Content

What's hot

Masterclass Live: Amazon EMR
Masterclass Live: Amazon EMRMasterclass Live: Amazon EMR
Masterclass Live: Amazon EMR
Amazon Web Services
 
아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...
아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...
아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...
Amazon Web Services Korea
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
ScyllaDB
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
HostedbyConfluent
 
Apache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data Framework
Wes McKinney
 
Delta: Building Merge on Read
Delta: Building Merge on ReadDelta: Building Merge on Read
Delta: Building Merge on Read
Databricks
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best Practices
Amazon Web Services
 
NetApp & Storage fundamentals
NetApp & Storage fundamentalsNetApp & Storage fundamentals
NetApp & Storage fundamentals
Shashidhar Basavaraju
 
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
Edureka!
 
Application Archaeology: Accelerating App Modernization at DICK’S Sporting Goods
Application Archaeology: Accelerating App Modernization at DICK’S Sporting GoodsApplication Archaeology: Accelerating App Modernization at DICK’S Sporting Goods
Application Archaeology: Accelerating App Modernization at DICK’S Sporting Goods
VMware Tanzu
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptx
Wasm1953
 
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
Altinity Ltd
 
Open Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache AtlasOpen Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache Atlas
DataWorks Summit
 
Azure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsAzure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data Flows
Thomas Sykes
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
Xiang Fu
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
Julien Le Dem
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
Flink Forward
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scale
Flink Forward
 

What's hot (20)

Masterclass Live: Amazon EMR
Masterclass Live: Amazon EMRMasterclass Live: Amazon EMR
Masterclass Live: Amazon EMR
 
아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...
아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...
아름답고 유연한 데이터 파이프라인 구축을 위한 Amazon Managed Workflow for Apache Airflow - 유다니엘 A...
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
 
Apache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data Framework
 
Delta: Building Merge on Read
Delta: Building Merge on ReadDelta: Building Merge on Read
Delta: Building Merge on Read
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best Practices
 
NetApp & Storage fundamentals
NetApp & Storage fundamentalsNetApp & Storage fundamentals
NetApp & Storage fundamentals
 
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
 
Application Archaeology: Accelerating App Modernization at DICK’S Sporting Goods
Application Archaeology: Accelerating App Modernization at DICK’S Sporting GoodsApplication Archaeology: Accelerating App Modernization at DICK’S Sporting Goods
Application Archaeology: Accelerating App Modernization at DICK’S Sporting Goods
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptx
 
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
 
Open Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache AtlasOpen Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache Atlas
 
Azure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsAzure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data Flows
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scale
 

Similar to Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kafka (John Neal, Qlik) Kafka Summit 2020

Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Facultad de Informática UCM
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
Gary Stafford
 
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
DATAVERSITY
 
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, QlikKeeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik
HostedbyConfluent
 
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
Amazon Web Services Korea
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake House
Data Con LA
 
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
Data Con LA
 
Designing big data analytics solutions on azure
Designing big data analytics solutions on azureDesigning big data analytics solutions on azure
Designing big data analytics solutions on azure
Mohamed Tawfik
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
DataWorks Summit
 
Thu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjayThu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjayAjay Shriwastava
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage
CCG
 
Building Serverless Data Infrastructure in the AWS Cloud
Building Serverless Data Infrastructure in the AWS CloudBuilding Serverless Data Infrastructure in the AWS Cloud
Building Serverless Data Infrastructure in the AWS Cloud
Ryan Plant
 
Couchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedCouchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data Demystified
Omid Vahdaty
 
Couchbase Overview Nov 2013
Couchbase Overview Nov 2013Couchbase Overview Nov 2013
Couchbase Overview Nov 2013Jeff Harris
 
Building Data Lakes with Apache Airflow
Building Data Lakes with Apache AirflowBuilding Data Lakes with Apache Airflow
Building Data Lakes with Apache Airflow
Gary Stafford
 
7 Databases in 70 minutes
7 Databases in 70 minutes7 Databases in 70 minutes
7 Databases in 70 minutes
Karen Lopez
 
Big Data Building Blocks with AWS Cloud
Big Data Building Blocks with AWS CloudBig Data Building Blocks with AWS Cloud
Big Data Building Blocks with AWS Cloud
Blazeclan Technologies Private Limited
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
Amazon Web Services
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Michael Rys
 
Adf walkthrough
Adf walkthroughAdf walkthrough
Adf walkthrough
MSDEVMTL
 

Similar to Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kafka (John Neal, Qlik) Kafka Summit 2020 (20)

Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
 
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
 
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, QlikKeeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik
 
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake House
 
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
 
Designing big data analytics solutions on azure
Designing big data analytics solutions on azureDesigning big data analytics solutions on azure
Designing big data analytics solutions on azure
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
 
Thu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjayThu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjay
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage
 
Building Serverless Data Infrastructure in the AWS Cloud
Building Serverless Data Infrastructure in the AWS CloudBuilding Serverless Data Infrastructure in the AWS Cloud
Building Serverless Data Infrastructure in the AWS Cloud
 
Couchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedCouchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data Demystified
 
Couchbase Overview Nov 2013
Couchbase Overview Nov 2013Couchbase Overview Nov 2013
Couchbase Overview Nov 2013
 
Building Data Lakes with Apache Airflow
Building Data Lakes with Apache AirflowBuilding Data Lakes with Apache Airflow
Building Data Lakes with Apache Airflow
 
7 Databases in 70 minutes
7 Databases in 70 minutes7 Databases in 70 minutes
7 Databases in 70 minutes
 
Big Data Building Blocks with AWS Cloud
Big Data Building Blocks with AWS CloudBig Data Building Blocks with AWS Cloud
Big Data Building Blocks with AWS Cloud
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
 
Adf walkthrough
Adf walkthroughAdf walkthrough
Adf walkthrough
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
HostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
HostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
HostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
HostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
HostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
HostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
HostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
HostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
HostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
HostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
HostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
HostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
HostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
HostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
HostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
HostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
HostedbyConfluent
 

Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kafka (John Neal, Qlik) Kafka Summit 2020

  • 1. Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kafka. John Neal, Senior Solution Architect, Qlik Partner Engineering
  • 3. Modernize and Automate Data Integration. [Diagram labels: Data Warehouse Automation; Streaming Data Pipeline Automation; Design, Manage & Monitor; CDC Streaming; Azure SQL DW; Amazon Redshift; Managed Data Lake Creation; Generate Change Data Streams; Deliver To Clouds, Lakes…; Refine & Merge For Analytics, AI/ML, Data Science…; AI/ML; Analytics; Data Science; Model; Commit; Conform; Consume; Catalog: Shop, Prepare & Provision; RDBMS; Data Warehouse; Files; Mainframe; SAAS APPS; SAP; Amazon RDS; Azure SQL DB; Google Cloud SQL]
  • 4. Our Focus for Today: Qlik Replicate & Kafka. [Diagram labels: Streaming Data Pipeline Automation; Design, Manage & Monitor; Generate Change Data Streams; Deliver To Clouds, Lakes…; Refine & Merge For Analytics, AI/ML, Data Science…; RDBMS; Data Warehouse; Files; Mainframe; SAAS APPS; SAP]
  • 5. Qlik Replicate: Automated Real-time Data Delivery. [Diagram labels: TARGET SCHEMA CREATION; HETEROGENEOUS DATA TYPE MAPPING; BATCH TO CDC TRANSITION; DDL CHANGE PROPAGATION; FILTERING; TRANSFORMATIONS; SAP, RDBMS, EDW, FILE, MAINFRAME; RDBMS, EDW, FILES, STREAMING, DATA LAKE; Log Based CDC; BATCH; IN-MEMORY]
  • 6. Physics 101: As It Applies to Data
  • 7. Newton’s First Law of Motion (Inertia): An object will not change its motion unless acted on by an unbalanced force. If it is at rest, it will stay at rest. If it is in motion, it will remain at the same velocity. Corollary: Objects with greater mass have more inertia. It therefore takes more force to change their motion.
  • 8. Data in motion tends to stay in motion until it comes to rest on disk. Similarly, if data is at rest, it will remain at rest until an external “force” puts it in motion again. (John Neal, with apologies to Sir Isaac Newton)
  • 9. Writing Data to a Database Introduces Friction. [Diagram labels: Data in Motion; Friction; STOP] How do we get the data moving again?
  • 10. Get Landed Data Moving: Overcoming Storage “Friction”. File I/O (reads): parsing challenges; no deltas. Database Queries: not real-time; added database load. Database Triggers: added database load; doesn’t scale. ETL Tools: not real-time; added database load; getting deltas is hard. Qlik Replicate: real-time; reads the DB logs; CDC provides delta processing.
  • 11. Getting Data in Motion Again: With Qlik Replicate & Kafka
  • 12. “Modern” Applications Leverage Microservices. Components are “decoupled” and have well-defined interfaces: changes are easier to make because they are localized and isolated; this results in increased reliability; it allows for a faster release schedule supporting agile approaches; it increases the opportunity to innovate. Microservices can use “purpose built” storage rather than a central repository: teams are free to choose the most appropriate repository for the problem at hand … a relational database is not always the answer. Data flows between components.
  • 13. Microservice-Based Applications: A Bucket of Bricks. [Diagram labels: Data Catalog; Data Warehouse Automation; Media; Data Streaming (CDC); Analytics; Security; Kafka Streaming Services; Event Processing; RDBMS; Wide-Column Store; Spark / ML; Cloud DW; Hadoop; Key-Value Store; Graph DB (NoSQL); File Storage; Document Store (NoSQL); IoT; Qlik]
  • 14. Lambda-Style Architectures: Streaming and batch working together. [Diagram labels: Incoming Data (NoSQL, IoT, Mobile Apps, Web, Legacy DB/DW); Streaming (Speed) Layer; Serving Layer; Batch Layer; Stream Processing (Spark Streaming, Storm, Flink, …); Incremental Views; All Data; Pre-Compute Views (Spark, M/R, HQL, …); Real-time Views; Batch Views; Queries / ML / Analytics; Ingest & Store; Prepare / Curate; Publish; Consume; Data]
  • 15. Kappa-Style Architectures: Where everything is a stream. [Diagram labels: Streaming Data; Streaming Layer; Stream Processing (Spark Streaming, Storm, Flink, …); Real-time Results; Serving Layer; Real-time View; Queries / ML / Analytics; Mirror events to long term storage; Storage Layer; Raw Data History; Re-compute events from storage if needed; Historical View; Ingest & Store; Prepare / Curate; Publish; Consume; Data]
  • 16. Making Rubber Meet the Road: Innovate by Keeping Data in Motion
  • 17. Streaming Data with Qlik Replicate. [Diagram labels: Source; Legacy; SAP; Qlik Replicate; CDC; Kafka; Ingest & Store; Prepare / Curate; Publish; Consume; Data] And then what?
  • 18. A Real-World Example: Credit Card Authorization. [Diagram labels: Source Data; Qlik Replicate; CDC; Kafka; Spark; HBase; Hive; Machine Learning; Decision Service Engine; Analytics Application; Spark Models; Data Lake; Ingest & Store; Prepare / Curate; Publish; Consume; Data]
  • 20. Sample Records – Willie Mays: Load / Reload, UPDATE
    {"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry": "USA", "birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "", "deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard", "weight": 170, "height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09", "retroID": "maysw101", "bbrefID": "mayswi01"}, "beforeData": null, "headers": {"operation": "REFRESH", "changeSequence": "", "timestamp": "", "streamPosition": "", "transactionId": "", "changeMask": null, "columnMask": null, "transactionEventCounter": null, "transactionLastEvent": null}}
    {"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry": "NewCountry", "birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "", "deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard", "weight": 170, "height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09", "retroID": "maysw101", "bbrefID": "mayswi01"}, "beforeData": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry": "USA", "birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "", "deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard", "weight": 170, "height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09", "retroID": "maysw101", "bbrefID": "mayswi01"}, "headers": {"operation": "UPDATE", "changeSequence": "20200713204536000000000000000110813", "timestamp": "2020-07-13T20:45:36.000", "streamPosition": "mysql-bin.000004:415943395:20:415951456:17592712139:mysql-bin.000004:412843032", "transactionId": "000000000000000000000004189B7BCB", "changeMask": "000010", "columnMask": "3FFFFF", "transactionEventCounter": 10962, "transactionLastEvent": false}}
  • 21. Sample Records – Willie Mays: DELETE, INSERT
    {"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry": "NewCountry", "birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "", "deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard", "weight": 170, "height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09", "retroID": "maysw101", "bbrefID": "mayswi01"}, "beforeData": null, "headers": {"operation": "DELETE", "changeSequence": "20200713204542000000000000000219813", "timestamp": "2020-07-13T20:45:42.000", "streamPosition": "mysql-bin.000004:419832331:55:419840520:17598121412:mysql-bin.000004:418252305", "transactionId": "00000000000000000000000418EE05C4", "changeMask": "000001", "columnMask": "3FFFFF", "transactionEventCounter": 10962, "transactionLastEvent": false}}
    {"data": {"playerID": "mayswi01", "birthYear": 1931, "birthMonth": 5, "birthDay": 6, "birthCountry": "USA", "birthState": "AL", "birthCity": "Westfield", "deathYear": "", "deathCountry": "", "deathState": "", "deathCity": "", "nameFirst": "Willie", "nameLast": "Mays", "nameGiven": "Willie Howard", "weight": 170, "height": 70, "bats": "R", "throws": "R", "debut": "1951-05-25", "finalGame": "1973-09-09", "retroID": "maysw101", "bbrefID": "mayswi01"}, "beforeData": null, "headers": {"operation": "INSERT", "changeSequence": "20200713204606000000000000000297113", "timestamp": "2020-07-13T20:46:06.000", "streamPosition": "mysql-bin.000004:422559929:1:422565460:17602420793:mysql-bin.000004:422551686", "transactionId": "000000000000000000000004192FA039", "changeMask": "3FFFFF", "columnMask": "3FFFFF", "transactionEventCounter": 62, "transactionLastEvent": false}}
  • 23. Summarizing Key Points: Physics applies to data. Qlik Replicate delivers data from databases to Kafka in real time. “Modern” architectures want data to be in motion. Kafka is a key component. Feedback loops can be a useful way to keep data moving.
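
The sample records on slides 20 and 21 share one envelope: "data", "beforeData", and "headers" with an "operation" field. As a rough illustration of how a downstream consumer might keep such records moving, here is a minimal Python sketch that replays change records into an in-memory view and decodes the hex "changeMask". The function names, the trimmed two-column records, and the in-memory dict are hypothetical. The bit order of the mask (least significant bit = first column) is an assumption inferred from the sample records, not taken from Qlik documentation. No Kafka client is involved; the messages are plain JSON strings.

```python
import json

# Column order as it appears in the sample records (22 columns).
COLUMNS = [
    "playerID", "birthYear", "birthMonth", "birthDay", "birthCountry",
    "birthState", "birthCity", "deathYear", "deathCountry", "deathState",
    "deathCity", "nameFirst", "nameLast", "nameGiven", "weight", "height",
    "bats", "throws", "debut", "finalGame", "retroID", "bbrefID",
]


def changed_columns(change_mask, columns):
    """Return the column names whose bit is set in a hex change mask.

    Assumption: the least significant bit maps to the first column.
    This matches the samples (UPDATE of birthCountry has mask "000010").
    """
    if not change_mask:
        return []
    bits = int(change_mask, 16)
    return [name for i, name in enumerate(columns) if (bits >> i) & 1]


def apply_change(state, message, key_field="playerID"):
    """Apply one JSON change record to a dict keyed by primary key."""
    record = json.loads(message)
    operation = record["headers"]["operation"]
    row = record["data"]
    before = record.get("beforeData")
    if operation in ("REFRESH", "INSERT", "UPDATE"):
        # If an UPDATE changed the key itself, drop the old entry first.
        if before and before.get(key_field) != row[key_field]:
            state.pop(before[key_field], None)
        state[row[key_field]] = row
    elif operation == "DELETE":
        state.pop(row[key_field], None)
    else:
        raise ValueError(f"unexpected operation: {operation}")
    return operation


def make_message(operation, country, before_country=None, mask=None):
    """Build a trimmed-down record (two columns only) for illustration."""
    before = None
    if before_country is not None:
        before = {"playerID": "mayswi01", "birthCountry": before_country}
    return json.dumps({
        "data": {"playerID": "mayswi01", "birthCountry": country},
        "beforeData": before,
        "headers": {"operation": operation, "changeMask": mask},
    })


# Same sequence of operations as the slides: load, update, delete, insert.
MESSAGES = [
    make_message("REFRESH", "USA"),
    make_message("UPDATE", "NewCountry", before_country="USA", mask="000010"),
    make_message("DELETE", "NewCountry", mask="000001"),
    make_message("INSERT", "USA", mask="3FFFFF"),
]

state = {}
operations = [apply_change(state, message) for message in MESSAGES]
print(operations)
print(state)
print(changed_columns("000010", COLUMNS))  # ['birthCountry']
```

In a real pipeline the messages would come from a Kafka consumer and the view would live in a purpose-built store; the dict here only stands in for that.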