
Bangalore Meetup - Enable realtime machine learning with streaming data

  • 1. Fresh Predictions: Using Real-Time Data for Machine Learning with Redpanda Data Transforms. Christina Lin, The Redpanda Lady.
  • 2. Christina Lin, Developer Advocate, Redpanda (a.k.a. The Redpanda Lady). © 2024 REDPANDA DATA. SOA, WebSphere, DB2, Sybase, Oracle, MQ, J2EE, EJB, DevOps, Microservice, EIP, K8s, Agile Integration, Data Mesh, ActiveMQ. Living data stack: Resilience (handle failures and scale gracefully); Elasticity (infrastructure that can scale dynamically); Decentralization (data ownership, empowering individual teams); Performance (low latency and high throughput); Autonomy (self-service; define quality and access); Nimble (efficient data movement); Distributed (distributed data processing for cloud native); Agility (quickly respond to change in data).
  • 3. Agenda: streamlined data ingestion and transformation; real-time machine learning; demo.
  • 4. LLM, RAG, GenAI, Prompt Engineering, Natural Language Generation, Natural Language Processing, Deep Learning, Vector/Semantic search, Neural Network.
  • 5. How do you build an application with AI? Diagram labels: Application; LLM (x3).
  • 6. How do you build an application with AI? Example prompt: "When is the next eclipse, and where is the best place to see it?" Example response: "April 8, 2024 are in Exmouth, Australia and East Timor".
  • 7. How do you build an application with AI? Diagram labels: Application; LLM (x3). Problems: performance; incorrect, unpredictable results; text-based and hard to customize with a small set of data; $$$$$$$.
  • 8. Streaming Architecture for AI. Diagram labels: Events; Data Layer; Datasets; Reference data; Machine Learning (Model Training, Model Testing, Model Prediction); Model Registry; Inference; Models; APPs.
  • 9. Better AI implementation. Diagram labels: Retrieval Augmented Generation; Customized Domain trained models; Fine-tuned; LLM; Customized Model.
  • 10. RAG & Stream & EDA. Diagram labels: Broker (x2); APP (x3); LLM; Vector DB; Model Service; Model; Aggregate.
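The retrieval step of RAG named on this slide (look up relevant context in a vector store, then pass it to the LLM alongside the question) can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the deck's implementation: the two-dimensional vectors, the document names, and the `retrieve` and `build_prompt` helpers are all hypothetical, and a real system would use an embedding model and an actual vector database.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: List[float], store: Dict[str, List[float]], k: int = 1) -> List[str]:
    """Return the k document names whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda doc: cosine(query_vec, store[doc]), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_docs: List[str]) -> str:
    """Combine retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Hypothetical in-memory "vector DB": document name -> hand-made embedding.
    store = {"eclipse schedule": [1.0, 0.0], "delivery times": [0.0, 1.0]}
    docs = retrieve([0.9, 0.1], store, k=1)
    print(build_prompt("When is the next eclipse?", docs))
```

In an event-driven setup the store would be kept fresh by consuming events from the broker, which is what makes the retrieved context current.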
  • 11. RAG & Stream & EDA. Diagram labels: Broker (x2); NPC1, NPC2, NPC3, each with an LLM; WebSocket; Topic (x3).
  • 12. Streaming Architecture for AI. Diagram labels: Events; Data Layer; Datasets; Reference data; Machine Learning (Model Training, Model Testing, Model Prediction); Model Registry; Inference; Models; APPs.
  • 13. Redpanda in 3 mins. Diagram labels: Broker; Zookeeper/KRaft; JVM; Page Cache (x3); Schema Registry; HTTP Proxy; Client (x2); Connector; Debezium; Disk.
  • 14. Redpanda in 3 mins. Diagram labels: Broker; Zookeeper/KRaft; JVM; Page Cache (x3); Schema Registry; HTTP Proxy; Client (x2); Connector; Debezium; Disk; WASM.
  • 15. Better scalability for pipelines. Stateless streaming pipeline: transform (format change, masking, filtering, validating); dispatch, wiretap; split, multiple destinations; control, reroute; normalize/denormalize; enrich; multiple ingestion. Stateful streaming pipeline: complex event processing; time-window based processing; enrich; multiple ingestion. Micro-batch pipeline: transform for large output (dataset); partitioning; split workload. Analytics batch pipeline: analytics on large volume (legacy); transform large output (dataset, legacy); transport large unstructured data.
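The difference between the stateless and stateful pipelines listed on this slide can be shown with a small sketch. It assumes a hypothetical event shape of `(timestamp_seconds, key)` and runs over an in-memory list rather than a broker: the filter looks at one event at a time, while the time-window count has to carry state across events.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Set, Tuple

Event = Tuple[int, str]  # hypothetical shape: (timestamp in seconds, key)


def stateless_filter(events: Iterable[Event], allowed: Set[str]) -> Iterator[Event]:
    """Stateless: each event is judged on its own; nothing is remembered."""
    for ts, key in events:
        if key in allowed:
            yield ts, key


def tumbling_window_counts(events: Iterable[Event], window_s: int) -> Dict[Tuple[int, str], int]:
    """Stateful: counts per (window start, key) are kept across events."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_s)  # align to the start of the window
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    events = [(0, "a"), (5, "b"), (12, "a"), (19, "a"), (21, "c")]
    print(list(stateless_filter(events, {"a"})))
    print(tumbling_window_counts(events, window_s=10))
```

Because the stateless function needs no memory between records, it is the kind of logic that can run per record inside the broker; the windowed count needs a place to keep its state.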
  • 16. Data Ping-Pong. Diagram labels: Broker; Data Pipeline (x3); Over the Network - Slow.
  • 17. Redpanda Data Transform. Stateless streaming pipeline: transform (format change, masking, filtering, validating); dispatch, wiretap; split, multiple destinations; control, reroute; normalize/denormalize; enrich; multiple ingestion. WASM (WebAssembly): binary instruction format for a stack-based VM. Portable compilation: Go, Rust, JS, Python, Ruby.
  • 18. Redpanda Data Transforms. rpk cloud login. rpk transform init: choose my fav language! Define transformation rules. rpk transform build: builds the WebAssembly module. rpk transform deploy --input-topic=customer --output-topic=customer_masked: deploy transformation to cluster. Diagram labels: customer and customer_masked (x3); Replicate across clusters.
  • 19. Redpanda Data Transform. cloud login. Diagram labels: customer and customer_masked (x3); Replicate across clusters. customer partition 1 → load to cache → transform (Customer age: 34 ↓ Customer age: 3*) → write back to disk with DMA → customer_masked partition 1. Thread per core (quick to process data).
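The masking step on this slide ("Customer age: 34" becoming "Customer age: 3*") amounts to a small per-record function. The sketch below is plain Python showing only that logic; it is not the Redpanda transform SDK, and the JSON record layout, the `name` field, and the function names are assumptions made for illustration.

```python
import json


def mask_age(age: int) -> str:
    """Keep the leading digit and replace every following digit with '*'."""
    digits = str(age)
    return digits[0] + "*" * (len(digits) - 1)


def transform(record_value: bytes) -> bytes:
    """Stateless per-record transform: parse, mask the age, re-serialize."""
    customer = json.loads(record_value)
    if "age" in customer:
        customer["age"] = mask_age(customer["age"])
    return json.dumps(customer).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical record read from the `customer` topic.
    raw = json.dumps({"name": "Alice", "age": 34}).encode("utf-8")
    # The result is what would be written to `customer_masked`.
    print(transform(raw).decode("utf-8"))
```

Note that a single-digit age passes through unchanged under this rule, so a production masking policy would probably need something stricter.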
  • 20. Better AI implementation. Diagram labels: Retrieval Augmented Generation; Customized Domain trained models; Fine-tuned; LLM; Customized Model.
  • 21. Demo - Real-Time Data for Machine Learning. Machine Learning lifecycle: Data ETL; Feature Engineering; Model Training; Deploy/Experiment; Prediction; Monitor; Problem.
  • 22. Demo - Real-Time Data for Machine Learning (bit.ly/redpanda-india). Application: real-time food delivery results (raw data). In-broker processing: process data on the fly in the broker to avoid data ping-pong. Process cleaned features and parameter data set. Continuous real-time data training for ML. Dynamic model updating. Real-time inference. MLOps.
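The step from raw food delivery data to cleaned features could look roughly like the sketch below. The field names (`distance_km`, `prep_minutes`, `delivery_minutes`) are hypothetical, since the deck does not show the demo's schema. The function validates one raw event and keeps only numeric features, returning `None` for records that should be dropped.

```python
from typing import Any, Dict, Optional


def to_features(raw: Dict[str, Any]) -> Optional[Dict[str, float]]:
    """Validate one raw delivery event; return numeric features or None to drop it."""
    try:
        distance_km = float(raw["distance_km"])
        prep_minutes = float(raw["prep_minutes"])
        delivery_minutes = float(raw["delivery_minutes"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or non-numeric field
    if distance_km < 0 or prep_minutes < 0 or delivery_minutes <= 0:
        return None  # physically implausible values
    return {
        "distance_km": distance_km,
        "prep_minutes": prep_minutes,
        "delivery_minutes": delivery_minutes,
    }


if __name__ == "__main__":
    good = {"distance_km": "3.5", "prep_minutes": 10, "delivery_minutes": 25, "customer": "x"}
    bad = {"distance_km": "n/a", "prep_minutes": 10, "delivery_minutes": 25}
    print(to_features(good))
    print(to_features(bad))
```

Keeping this step stateless is what lets it run on each record as it arrives instead of in a separate pipeline hop.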
  • 23. Demo - Real-Time Data for Machine Learning (bit.ly/redpanda-india). Diagram labels: Redpanda Cluster (redpanda-0, redpanda-1, redpanda-2, redpanda-console); Jupyter Notebook; TensorFlow; Simulator (producer.py).
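A simulator in the spirit of the `producer.py` shown on this slide might generate events like the sketch below. The event fields, value ranges, and the `simulate_deliveries` name are assumptions, and the function only yields JSON-encoded bytes; a real producer would hand each payload to a Kafka-compatible client to publish it to the cluster.

```python
import json
import random
from typing import Iterator


def simulate_deliveries(n: int, seed: int = 0) -> Iterator[bytes]:
    """Yield n JSON-encoded, randomly generated delivery events (reproducible per seed)."""
    rng = random.Random(seed)
    for order_id in range(n):
        distance_km = round(rng.uniform(0.5, 10.0), 2)
        prep_minutes = rng.randint(5, 30)
        # Hypothetical ground truth: prep time plus 3 minutes per km, with noise.
        delivery_minutes = round(prep_minutes + 3.0 * distance_km + rng.gauss(0, 2), 1)
        event = {
            "order_id": order_id,
            "distance_km": distance_km,
            "prep_minutes": prep_minutes,
            "delivery_minutes": delivery_minutes,
        }
        yield json.dumps(event).encode("utf-8")


if __name__ == "__main__":
    for payload in simulate_deliveries(3):
        print(payload.decode("utf-8"))
```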
  • 24. Demo - Real-Time Data for Machine Learning (bit.ly/redpanda-india). Diagram labels: Redpanda Cluster (redpanda-0, redpanda-1, redpanda-2, each marked "L L"); redpanda-console; Simulator (producer.py).
  • 25. Demo - Real-Time Data for Machine Learning (bit.ly/redpanda-india). Diagram labels: Redpanda Cluster (redpanda-0, redpanda-1, redpanda-2, redpanda-console); Jupyter Notebook; TensorFlow; Simulator (producer.py); Redpanda Transforms (build, deploy).
  • 26. Demo - Real-Time Data for Machine Learning (bit.ly/redpanda-india). Diagram labels: Redpanda Cluster (redpanda-0, redpanda-1, redpanda-2, each marked "L L"); redpanda-console; Simulator (producer.py).
  • 27. Demo - Real-Time Data for Machine Learning (bit.ly/redpanda-india). Diagram labels: Redpanda Cluster (redpanda-0, redpanda-1, redpanda-2, redpanda-console); Jupyter Notebook; TensorFlow; Simulator (producer.py); ML model training (consumer.py, model); real-time inference (app.py, model).
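The split between training (`consumer.py`) and real-time inference (`app.py`) on this slide can be illustrated with a tiny online learner. This sketch deliberately swaps TensorFlow for a hand-written linear model trained by stochastic gradient descent so that it runs with no dependencies, and it reads from an in-memory list rather than a topic. The class and function names are hypothetical.

```python
from typing import Iterable, List, Tuple


class OnlineLinearModel:
    """Linear regression updated one sample at a time with SGD."""

    def __init__(self, n_features: int, lr: float = 0.01) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x: List[float]) -> float:
        """Inference path: what an app would call for each incoming request."""
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x: List[float], y: float) -> None:
        """Training path: one gradient step on the squared error of one sample."""
        err = self.predict(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def train_on_stream(model: OnlineLinearModel, stream: Iterable[Tuple[List[float], float]]) -> None:
    """Consume (features, label) pairs as they arrive and update the model in place."""
    for x, y in stream:
        model.update(x, y)


if __name__ == "__main__":
    # Hypothetical stream where the label is 2 * x + 1.
    data = [([i / 10], 2 * (i / 10) + 1) for i in range(11)]
    model = OnlineLinearModel(n_features=1, lr=0.1)
    for _ in range(500):
        train_on_stream(model, data)
    print(model.predict([0.5]))
```

Because `update` and `predict` share the same object, the model served for inference picks up each new training step immediately, which is the idea behind dynamic model updating.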
  • 28. Redpanda University: free, self-paced online learning at https://university.redpanda.com. Learn the fundamentals of data streaming and Redpanda; install Redpanda and use the rpk CLI to configure it; create producers and consumers in Java, Python and NodeJS. Sign up today for free!