Batch to Stream Data Pipeline
Christina Lin
Redpanda
Developer Advocate
Agenda
• Quick Intro – 10-15 mins
• HANDS-ON – 35-40 mins
• Q & A – 5-10 mins
© 2024 REDPANDA DATA
Christina Lin
Developer Advocate, Redpanda
aka. The Redpanda Lady
SOA
WebSphere
DB2
Sybase
Oracle
MQ
J2EE
EJB
DevOps
Microservice
EIP
K8s
Agile
Integration
Data Mesh
ActiveMQ
Living data stack
Resilience - handle failures and scale gracefully
Elasticity - infrastructure that can scale dynamically
Decentralization - data ownership, empowering individual teams
Performance - low latency and high throughput
Autonomy - self-service, define quality and access
Nimble - efficient data movement
Distributed - distributed data processing for cloud native
Agility - quickly respond to changes in data
An ordinary day for a data engineer
Stateless
Streaming Pipeline
Transform
Format change, masking, filtering, validating
Dispatch, Wiretap
Split, multiple destinations
Control
Reroute
Normalize/Denormalize, Enrich
Multiple ingestion
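A minimal sketch of the stateless transforms listed above (validating, filtering, masking, format change). The field names `sensor_id`, `owner_email` and `temperature`, and the threshold of 20.0, are hypothetical and do not come from the workshop. Each record is handled on its own, with no memory of earlier records, which is what makes the step stateless.

```python
import csv
import io
import json

# Hypothetical field names, chosen for illustration only.
FIELDS = ["sensor_id", "owner_email", "temperature"]


def mask_email(email):
    """Masking: hide most of the local part of an e-mail address."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def transform(line):
    """Turn one CSV line into a JSON string, or None if it is dropped."""
    row = next(csv.reader(io.StringIO(line)), None)
    # Validating: wrong column count or non-numeric reading -> drop.
    if row is None or len(row) != len(FIELDS):
        return None
    record = dict(zip(FIELDS, row))
    try:
        record["temperature"] = float(record["temperature"])
    except ValueError:
        return None
    # Filtering: keep only readings at or above an arbitrary threshold.
    if record["temperature"] < 20.0:
        return None
    record["owner_email"] = mask_email(record["owner_email"])
    # Format change: CSV in, JSON out.
    return json.dumps(record)
```

Because `transform` depends only on its input line, many copies can run in parallel without coordination.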
Stateful
Streaming Pipeline
Complex event processing
Time-window based processing
Enrich
Multiple ingestion
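To illustrate time-window based processing, here is a small sketch of a tumbling-window average in plain Python. The class name, the 10-minute window and the event shape (key, timestamp in seconds, value) are assumptions for illustration. The running sums and counts are the state that a stateless step would not need to keep.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # 10-minute tumbling windows (illustrative choice)


class TumblingAverage:
    """Keeps per-window, per-key running sums: this is the 'state'."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def add(self, key, timestamp, value):
        """Assign the event to its window by event time and update state."""
        window_start = int(timestamp // self.window_seconds) * self.window_seconds
        self.sums[(window_start, key)] += value
        self.counts[(window_start, key)] += 1

    def results(self):
        """Return {(window_start, key): average} for all windows seen."""
        return {k: self.sums[k] / self.counts[k] for k in self.sums}
```

A stream processor such as Flink additionally deals with late events, state cleanup and fault tolerance, none of which this sketch attempts.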
Micro-batch Pipeline
Transform for large output (Dataset)
Partitioning, split workload
Analytics
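The micro-batch idea can be sketched as grouping a continuous input into small fixed-size chunks and spreading keys across partitions. Both helper names below are hypothetical, and the byte-sum hash is a toy stand-in for a real partitioner.

```python
from itertools import islice


def micro_batches(records, batch_size):
    """Yield lists of up to batch_size records from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def partition_of(key, partitions):
    """Pick a partition for a key so the workload can be split.

    Toy hash: sum of the key's UTF-8 bytes, modulo the partition count.
    """
    return sum(key.encode("utf-8")) % partitions
```

For example, `list(micro_batches(range(5), 2))` gives three batches, the last one holding the single leftover record.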
Batch
Pipeline
Analytics for large volumes (legacy)
Transform for large output (Dataset, legacy)
Transport large unstructured data
Better scalability for pipelines
Batch
Every 10 mins
CSV
Right away!
CSV
CSV
Stream
Batch
pipeline
Batch Processing
Batch
pipeline
The Workshop overview
VM
cassandra:5432
https://bit.ly/odsc-redpanda
The Batch
Setup & Load data
postgres:5432
Table
public.bos_air_traffic
jupyterlab:8888
postgresload.ipynb
Setup
cassandraload.ipynb
Table
latest_flight_data
cassandra:5432
The Batch
CSV
CSV
CSV
spark.ipynb
The Batch
CSV
CSV
CSV
Map.ipynb
The Batch
Let’s Stream
redpanda-0:9092
console:8080
kafka-connect:8083
Let’s Stream
Config
Topic:
boston.public.bos_air_traffic
Table
public.bos_air_traffic
Let’s Stream
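The table-to-topic mapping on this slide (public.bos_air_traffic to boston.public.bos_air_traffic) matches the `<prefix>.<schema>.<table>` naming used by Debezium's PostgreSQL source connector. The slides do not name the connector, so Debezium is an assumption here. The connector name, credentials and database name are placeholders, and the function names are hypothetical.

```python
import json


def build_connector_payload(prefix, schema, table):
    """Build a Kafka Connect REST payload for an (assumed) Debezium source."""
    return {
        "name": f"{prefix}-postgres-source",  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "plugin.name": "pgoutput",
            "database.hostname": "postgres",   # host and port as on the slides
            "database.port": "5432",
            "database.user": "CHANGE_ME",      # placeholder
            "database.password": "CHANGE_ME",  # placeholder
            "database.dbname": "CHANGE_ME",    # placeholder
            "topic.prefix": prefix,
            "table.include.list": f"{schema}.{table}",
        },
    }


def expected_topic(prefix, schema, table):
    """Topic name the connector is expected to write change events to."""
    return f"{prefix}.{schema}.{table}"


payload = build_connector_payload("boston", "public", "bos_air_traffic")
print(json.dumps(payload, indent=2))
print(expected_topic("boston", "public", "bos_air_traffic"))
```

A payload of this shape would typically be sent as JSON with `POST /connectors` to the Kafka Connect REST endpoint, shown on the slides as kafka-connect:8083. `topic.prefix` is the Debezium 2.x property name; older releases used `database.server.name` for the same purpose.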
jobmanager:8081
Topic:
boston.public.bos_air_traffic
Flink Data
Stream
Java JAR
Let’s Stream
CSV
CSV
CSV
Topic:
sensor_csv
Topic:
sensor_csv
Let’s Stream
Let’s Stream
CSV
CSV
CSV
Topic:
sensor_csv
rpk
redpanda-0:9644
Topic:
filtered_sensor_csv
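The step from sensor_csv to filtered_sensor_csv can be pictured as a filter that copies only matching records from one topic to another. The sketch below models topics as in-memory lists and makes no claim about how the workshop deploys its filter. The sample rows, column index and threshold are made up for illustration.

```python
def keep(line, column, minimum):
    """Return True if the numeric value in `column` is at least `minimum`."""
    parts = line.strip().split(",")
    try:
        return float(parts[column]) >= minimum
    except (IndexError, ValueError):
        return False  # malformed or header lines are dropped


def run_filter(topics, source, sink, column, minimum):
    """Copy matching records from topics[source] to topics[sink]."""
    matched = [line for line in topics[source] if keep(line, column, minimum)]
    topics.setdefault(sink, []).extend(matched)
    return len(matched)


# Hypothetical sample data standing in for the sensor_csv topic.
topics = {"sensor_csv": ["id,temp", "s1,31.5", "s2,18.0", "s3,oops"]}
run_filter(topics, "sensor_csv", "filtered_sensor_csv", column=1, minimum=30.0)
```

The source topic is left untouched, so other consumers can still read every record.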
CSV
CSV
CSV
Topic:
sensor_csv
SQL
SQL
Client
Let’s Stream
Batch
CSV
Batch
pipeline
Batch
pipeline
Batch Processing
Every 10 mins
CSV
Right away!
CSV
Stream
The Workshop overview
Stateless
Streaming Pipeline
Transform
Format change, masking, filtering, validating
Dispatch, Wiretap
Split, multiple destinations
Control
Reroute
Normalize/Denormalize, Enrich
Multiple ingestion
Stateful
Streaming Pipeline
Complex event processing
Time-window based processing
Enrich
Multiple ingestion
Better scalability for pipelines
Keep Learning
Streaming - Communication
Basics of K8s networking
Connectivity
Performance
Docs
Get a peek under the hood.
https://docs.redpanda.com/
Blogs
Keep up to date with Redpanda.
https://redpanda.com/blog
Slack
Engage with our community.
https://redpanda.com/slack
Code
Check out the source.
https://github.com/redpanda-data
Redpanda University
Free, self-paced online learning
https://university.redpanda.com

Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
