Apache Kafka and Machine Learning – Kai Waehner
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka, and KSQL
Kai Waehner
Technology Evangelist
contact@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
3 Talks at Oracle Code One 2019 in San Francisco
Key Takeaways
• The Apache Kafka ecosystem helps to do data engineering and production deployment at scale
• Jupyter allows debugging, prototyping and scalable, reliable data processing by combining tool sets
• Kafka and TensorFlow I/O enable streaming model training without extra data store
Agenda
• Challenges of Machine Learning
• Apache Kafka Ecosystem for Machine Learning
• TensorFlow Ecosystem
• Prototyping and Data Engineering with Python, Jupyter and KSQL
• Streaming Model Training with Kafka and TensorFlow I/O
• Production Deployment Alternatives and Trade-Offs
• Real World Example - Supply Chain Optimization
Agenda
• Challenges of Machine Learning
Analyze and act on critical business moments
Windows of Opportunity (timeline: Seconds, Minutes, Hours)
• Real Time Tracking
• Predictive Maintenance
• Fraud Detection
• Cross Selling
• Transportation Rerouting
• Customer Service
• Inventory Management
Machine Learning (ML)
...allows computers to find hidden insights without being explicitly
programmed where to look.
Machine Learning
• Decision Trees
• Naïve Bayes
• Clustering
• Neural Networks
• Etc.
Deep Learning
• CNN
• RNN
• Autoencoder
• Etc.
Python == De Facto Standard for Machine Learning
The First Analytic Models
How to deploy the models
in production?
…real-time processing?
…at scale?
…24/7 zero downtime?
Hidden Technical Debt in Machine Learning Systems
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
Impedance mismatch between model development and model deployment
https://www.slideshare.net/NickPentreath/productionizing-spark-ml-pipelines-with-the-portable-format-for-analytics-100788521
Scalable, Technology-Agnostic Machine Learning Infrastructures
https://www.infoq.com/presentations/netflix-ml-meson
https://eng.uber.com/michelangelo
https://www.infoq.com/presentations/paypal-data-service-fraud
What is this thing used everywhere?
Agenda
• Apache Kafka Ecosystem for Machine Learning
Apache Kafka: The Rise of an Event Streaming Platform
• The Log
• Connectors
• Producer / Consumer
• Streaming Engine
Apache Kafka at Scale at Tech Giants
> 4.5 trillion messages / day
> 6 Petabytes / day
“You name it”
* Kafka is not just used by tech giants
** Kafka is not just used for big data
Confluent - Business Value per Use Case
Key Drivers (Business Value)
• Increase Revenue (make money)
• Decrease Costs (save money)
• Mitigate Risk (protect money)
Strategic Objectives (sample)
• Improve Customer Experience (CX)
• Core Business Platform
• Increase Operational Efficiency
• Migrate to Cloud
Example Use Cases
• Fraud Detection
• IoT sensor ingestion
• Digital replatforming / Mainframe Offload
• Customer 360
• Faster transactional processing / analysis incl. Machine Learning / AI
• Microservices Architecture
• Online Fraud Detection
• Online Security (syslog, log aggregation, Splunk replacement)
• Middleware replacement
• Regulatory
• Digital Transformation
• Website / Core Operations (Central Nervous System)
• Real-time app updates
Example Case Studies (of many)
• Connected Car: Navigation & improved in-car experience: Audi
• Simplifying Omni-channel Retail at Scale: Target
• Mainframe Offload: RBC
• Application Modernization: Multiple Examples
• The [Silicon Valley] Digital Natives: LinkedIn, Netflix, Uber, Yelp...
• Predictive Maintenance: Audi
• Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix
• Real Time Streaming Platform for Communications and Beyond: Capital One
• Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle
• Detect Fraud & Prevent Fraud in Real Time: PayPal
• Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple
Apache Kafka’s Open Source Ecosystem as Infrastructure for ML
• Kafka Streams
• Kafka Connect
• REST Proxy
• Schema Registry
• Go / .NET / Python Kafka Producer
• KSQL
Want to learn more about Apache Kafka + Machine Learning?
Overview → www.kai-waehner.de
• Blog Post: How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka
https://www.confluent.io/blog/build-deploy-scalable-machine-learning-production-apache-kafka/
• Slide Deck: Apache Kafka + Machine Learning => Intelligent Real Time Applications
https://www.slideshare.net/KaiWaehner/apache-kafka-streams-machine-learning-deep-learning
• Slide Deck: Deep Learning at Extreme Scale (in the Cloud) with the Apache Kafka Open Source Ecosystem
https://www.slideshare.net/KaiWaehner/deep-learning-at-extreme-scale-in-the-cloud-with-the-apache-kafka-open-source-ecosystem
• Video Recording: Deep Learning in Mission Critical and Scalable Real Time Applications with Open Source Frameworks
https://vimeo.com/jaxtv/review/256406763/7fbf4213be
• Blog Post: Using Apache Kafka to Drive Cutting-Edge Machine Learning - Hybrid ML Architectures, AutoML, and more...
https://www.confluent.io/blog/using-apache-kafka-drive-cutting-edge-machine-learning
• Blog Post: Machine Learning with Python, Jupyter, KSQL and TensorFlow
https://www.confluent.io/blog/machine-learning-with-python-jupyter-ksql-tensorflow
Agenda
• TensorFlow Ecosystem
TensorFlow
TensorFlow is an open source software library for high
performance numerical computation. Its flexible architecture
allows easy deployment of computation across a variety of
platforms (CPUs, GPUs, TPUs), and from desktops to clusters of
servers to mobile and edge devices. Originally developed by
researchers and engineers from the Google Brain team within
Google’s AI organization, it comes with strong support for
machine learning and deep learning and the flexible
numerical computation core is used across many other scientific
domains.
https://www.tensorflow.org/
TensorFlow Ecosystem
+ large community
+ integration with most 3rd party ML tools
+ support by all major cloud providers
TensorFlow Model
• Serialization: Protocol Buffers (protobufs)
• Generated classes in C, Python, Java, etc. that can load, save, and access the data
• File Format: Human readable TextFormat (.pbtxt) vs. compressed Binary (.pb)
• Graph object: Foundation of computation in TensorFlow
• Weights: Held in separate checkpoint files
• Standards: Support for ONNX, PMML
Autoencoder for Anomaly Detection
Agenda
• Prototyping and Data Engineering with Python, Jupyter and KSQL
Jupyter
https://jupyter.org/
Prototyping with TensorFlow in a Jupyter Notebook
Reliable Data Preprocessing at Scale
Preprocessing: filter, transform, anonymize, extract features
Data needs to be preprocessed at scale and in a reusable way!
Streams
• Use KSQL to preprocess data at scale without coding
• Use SQL statements for interactive analysis + deployment to production at scale
• Use e.g. Python with the KSQL REST interface
Data Ready
for
Model Training
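The Python-plus-REST option above could look roughly like the following sketch. It only builds the HTTP request for the KSQL server's /query endpoint and does not send it. The server address (localhost:8088) and the LIMIT clause are assumptions for illustration; the deck's own notebook uses the ksql-python library rather than raw HTTP.

```python
import json
import urllib.request

# Assumption: a KSQL server is reachable here; 8088 is its default port.
KSQL_URL = "http://localhost:8088"


def build_query_request(ksql_statement, base_url=KSQL_URL):
    """Build (but do not send) an HTTP POST for the KSQL /query endpoint."""
    payload = {"ksql": ksql_statement, "streamsProperties": {}}
    return urllib.request.Request(
        url=base_url + "/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


# Stream and column names are taken from the preprocessing slide.
request = build_query_request(
    "SELECT car_id, sensor_input FROM car_sensor LIMIT 5;"
)

# Sending the request needs a running KSQL server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     for line in response:
#         print(line.decode("utf-8"))
```

Keeping request construction separate from sending makes the statement easy to inspect in a notebook cell before it touches the cluster.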
KSQL – A Streaming SQL Engine for Apache Kafka
Preprocessing with KSQL
SELECT c.car_id, c.event_id, c.car_model_id, c.sensor_input
FROM car_sensor c
LEFT JOIN car_models m ON c.car_model_id = m.car_model_id
WHERE m.car_model_type = 'Audi_A8';
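To make the semantics of this query concrete, here is a plain-Python sketch of the same enrich-and-filter step: each sensor event is joined to its car model and only one model type is kept. The sample records are hypothetical, and the sketch processes an in-memory list, whereas KSQL runs the query continuously over a Kafka topic.

```python
def enrich_and_filter(sensor_events, car_models, wanted_type):
    """Left-join each sensor event to its car model, then keep one model type."""
    for event in sensor_events:
        # A left join keeps events without a matching model (model is None);
        # the WHERE clause on the model type then drops them again.
        model = car_models.get(event["car_model_id"])
        if model is not None and model["car_model_type"] == wanted_type:
            yield {
                "car_id": event["car_id"],
                "event_id": event["event_id"],
                "car_model_id": event["car_model_id"],
                "sensor_input": event["sensor_input"],
            }


# Hypothetical sample data standing in for the car_models table and car_sensor stream.
car_models = {
    1: {"car_model_type": "Audi_A8"},
    2: {"car_model_type": "Audi_A4"},
}
sensor_events = [
    {"car_id": "c1", "event_id": 100, "car_model_id": 1, "sensor_input": 0.42},
    {"car_id": "c2", "event_id": 101, "car_model_id": 2, "sensor_input": 0.17},
    {"car_id": "c3", "event_id": 102, "car_model_id": 9, "sensor_input": 0.99},
]

filtered = list(enrich_and_filter(sensor_events, car_models, "Audi_A8"))
```

Only the event from car c1 survives: c2 has a different model type and c3 has no matching model.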
Excursus: KSQL compared to Kafka Streams
https://www.slideshare.net/KaiWaehner/kafka-streams-vs-ksql-for-stream-processing-on-top-of-apache-kafka-142127337
Data Engineering with Python, KSQL, TensorFlow and Keras
https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras
https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras/blob/master/python-jupyter-apache-kafka-ksql-tensorflow-keras.ipynb
Pick and combine the tools
you need and want to use!
Some libraries used in this example:
• Numpy
• Pandas
• TensorFlow
• Keras
• KSQL
• ksql-python
• sklearn
• matplotlib
Live Demo
Rapid Prototyping and Data Preprocessing
at Scale with Python, Jupyter and KSQL
Data Engineering and Interactive Queries with Jupyter, Python and KSQL
https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
https://github.com/takluyver/bash_kernel
You can also use just the
bash kernel and KSQL CLI:
ksql> SELECT customer_id, location_id FROM orders WHERE customer_id = 32235;
+-------------+-------------+
| customer_id | location_id |
+-------------+-------------+
| 32235       | 90          |
+-------------+-------------+
1 row in 0.003s
ksql> SELECT count FROM orders WHERE customer_id = 1980;
+-----------+
| count     |
+-----------+
| 12        |
+-----------+
1 row in 0.002s
ksql>
CREATE TABLE orders AS
SELECT customer_id, location_id, count(*)
FROM orders_stream
GROUP BY customer_id, location_id;
Query runs until completion and returns
the final result as quickly as possible
KSQL: Interactive Queries (aka Point-in-Time Queries)
https://github.com/confluentinc/ksql/blob/master/design-proposals/klip-8-interactive-queries.md
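The idea behind interactive queries can be sketched in plain Python: a table is kept up to date as order events arrive, and a point-in-time lookup reads the current state and returns immediately instead of running forever. The class and sample events are hypothetical stand-ins, not KSQL internals.

```python
from collections import Counter


class OrdersTable:
    """Continuously updated count per (customer_id, location_id)."""

    def __init__(self):
        self._counts = Counter()

    def apply(self, order_event):
        """Update the aggregate for one incoming order event."""
        key = (order_event["customer_id"], order_event["location_id"])
        self._counts[key] += 1

    def lookup(self, customer_id):
        """Point-in-time query: return the current rows for one customer."""
        return [
            {"customer_id": cust, "location_id": loc, "count": n}
            for (cust, loc), n in self._counts.items()
            if cust == customer_id
        ]


# Hypothetical events standing in for the orders_stream topic.
orders = OrdersTable()
for event in [
    {"customer_id": 32235, "location_id": 90},
    {"customer_id": 1980, "location_id": 7},
    {"customer_id": 32235, "location_id": 90},
]:
    orders.apply(event)

rows = orders.lookup(32235)
```

The lookup finishes as soon as the current state is read, which is the difference from a continuous query that keeps emitting results as new events arrive.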
Agenda
• Streaming Model Training with Kafka and TensorFlow I/O
Data Ingestion into a Data Store
Connect
Preprocessed Data
There isn’t just one ML solution. We need to be flexible!
Kafka Connect
• “Kafka Benefits Under the Hood”
• Out-of-the-box connectivity
• Data format conversion
• Single message transformation (including error-handling)
KafkaConnect
KafkaConnect
Data Source Data Sink
REST API
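As a rough illustration of the REST API mentioned here, the sketch below builds a request that would register a JDBC source connector with a Kafka Connect worker. The worker address (localhost:8083), the database URL, and the connector name are assumptions for illustration, and the JDBC connector plugin would need to be installed on the worker. The request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: a Kafka Connect worker is reachable here; 8083 is its default REST port.
CONNECT_URL = "http://localhost:8083"


def build_create_connector_request(name, config, base_url=CONNECT_URL):
    """Build (but do not send) a POST to /connectors that creates a connector."""
    body = {"name": name, "config": config}
    return urllib.request.Request(
        url=base_url + "/connectors",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical configuration: read the 'customers' table from a Postgres database.
jdbc_source_config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://localhost:5432/shop",  # hypothetical database
    "table.whitelist": "customers",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "postgres_",
}

connector_request = build_create_connector_request("reader", jdbc_source_config)

# Sending the request needs a running Connect worker, so it is left commented out:
# with urllib.request.urlopen(connector_request) as response:
#     print(json.load(response))
```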
CREATE SOURCE CONNECTOR reader
WITH (source = 'confluent.jdbc.postgres', table = 'customers', …);
CREATE SINK CONNECTOR writer
WITH (sink = 'confluent.s3', bucket = 'vip_customers', …);
CREATE STREAM postgres_customers (id integer, purchases integer)
WITH (source = 'reader', ...);
CREATE STREAM vip_customers WITH (sink = 'writer', ...) AS
SELECT * FROM postgres_customers WHERE purchases > 10;
KSQL: Embedded Kafka Connect
https://github.com/confluentinc/ksql/blob/master/design-proposals/klip-7-connect-integration.md
Continuous, reliable streaming integration and preprocessing at scale – just with SQL commands!
Model Training using a Data Store
Let’s build some models at extreme scale using TensorFlow and TPUs!
Analytic Model
Streaming Model Training without additional Data Store
https://github.com/tensorflow/io/tree/master/tensorflow_io/kafka
TensorFlow I/O Kafka Plugin
• Native integration between Kafka and TensorFlow
• KafkaDataSet and KafkaOutputSequence for TensorFlow
• Written in C++ (linked with librdkafka)
• Part of the graph in TensorFlow
• Direct training and inference from streaming data
• No data storage like S3 or HDFS needed
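The pattern behind these bullets, training directly from the stream without landing the data in S3 or HDFS first, can be sketched without any Kafka or TensorFlow dependency. The code below is not the TensorFlow I/O API: the message format (comma-separated features followed by a label) and the in-memory list standing in for a Kafka consumer are assumptions for illustration.

```python
def parse_record(raw_value):
    """Decode one message value of the form b'feature1,feature2,...,label'."""
    *features, label = (float(x) for x in raw_value.decode("utf-8").split(","))
    return features, label


def mini_batches(message_values, batch_size):
    """Group an unbounded iterable of raw messages into training batches."""
    batch = []
    for raw_value in message_values:
        batch.append(parse_record(raw_value))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


# Hypothetical messages; in practice these would come from a Kafka consumer.
stream = [b"0.1,0.2,0", b"0.3,0.4,1", b"0.5,0.6,0"]
batches = list(mini_batches(stream, batch_size=2))
```

Each yielded batch could be handed straight to a model's incremental training step, so the only copy of the data is the one already held in the Kafka log.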
Streaming Model Training with Kafka and TensorFlow I/O
https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
Python Kafka Producer
Python Kafka Consumer
+ Streaming Ingestion
+ Model Training
Streaming Model Training with Kafka and TensorFlow I/O
[Diagram: Producer → Distributed Commit Log (time axis) → Model A, Model B, another real-time consumer, another batch consumer]
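Why several models and consumers can share one log is easier to see in a toy model of a commit log, where every consumer tracks its own read position. The class below is a hypothetical in-memory sketch, not Kafka's implementation: it ignores partitions, retention, and persistence.

```python
class CommitLog:
    """Append-only log where each consumer keeps an independent offset."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, record):
        """Producer side: add a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Consumer side: read from this consumer's offset and advance it."""
        start = self._offsets.get(consumer, 0)
        records = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(records)
        return records


log = CommitLog()
for i in range(5):
    log.append({"sensor": i})

first_for_a = log.poll("model_a")   # Model A trains on the first five events
log.append({"sensor": 5})
log.append({"sensor": 6})
all_for_b = log.poll("model_b")     # Model B starts later, from the beginning
new_for_a = log.poll("model_a")     # Model A only sees what is new to it
```

Because reading does not remove records, a model added later can still replay the full history while earlier consumers continue from where they stopped.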
Model Example: Autoencoder for Anomaly Detection
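The scoring side of this approach can be sketched as follows: an autoencoder trained on normal data reconstructs normal inputs well, so a large reconstruction error marks an anomaly. The `fake_reconstruct` function is a hypothetical stand-in for a trained model's prediction call, and the threshold value is an assumption that would normally be tuned on validation data.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction."""
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(squared)


def detect_anomalies(samples, reconstruct, threshold):
    """Flag every sample whose reconstruction error exceeds the threshold."""
    return [
        reconstruction_error(sample, reconstruct(sample)) > threshold
        for sample in samples
    ]


# Hypothetical stand-in for a trained autoencoder: it always reproduces
# the "normal" profile it was trained on.
NORMAL_PROFILE = [1.0, 1.0, 1.0]


def fake_reconstruct(sample):
    return NORMAL_PROFILE


samples = [
    [1.0, 1.1, 0.9],  # close to normal, small error
    [1.0, 5.0, 1.0],  # one value far off, large error
]
flags = detect_anomalies(samples, fake_reconstruct, threshold=0.5)
```

With a real model, `fake_reconstruct` would be replaced by the autoencoder's forward pass; the thresholding logic stays the same.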
Agenda
• Production Deployment Alternatives and Trade-Offs
RPC communication to do model inference
Streams
Input Event
Prediction
Request
Response
Model Serving
TensorFlow Serving
gRPC / HTTP
Application
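For the HTTP variant of this request/response pattern, a client call to TensorFlow Serving's REST predict endpoint could be prepared as in the sketch below. The host, the port (8501 is the usual REST port), and the model name `anomaly_detector` are assumptions for illustration. The request is only built, not sent.

```python
import json
import urllib.request


def build_predict_request(instances, model_name, host="localhost", port=8501):
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = {"instances": instances}
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One input event with three hypothetical sensor values.
predict_request = build_predict_request(
    instances=[[1.0, 1.1, 0.9]],
    model_name="anomaly_detector",  # hypothetical model name
)

# Sending the request needs a running model server, so it is left commented out:
# with urllib.request.urlopen(predict_request) as response:
#     predictions = json.load(response)["predictions"]
```

Every input event costs one network round trip in this design, which is the latency trade-off the deck weighs against embedding the model in the application.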
Model inference natively embedded into the App
Application
Input Event
Prediction
Model inference in a Stream Processing App
Streams
Input Event
Prediction
Stream Processing
Model
doPrediction()
return value
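The embedded alternative can be sketched as a processing loop that calls the model in-process for every event, with no network hop between the stream processor and the model. The `threshold_model` function is a hypothetical stand-in for a loaded model, and the list of events stands in for a Kafka topic.

```python
def score_stream(input_events, model):
    """Apply a locally loaded model to each event and emit a prediction."""
    for event in input_events:
        prediction = model(event["sensor_values"])  # local call, no RPC
        yield {"sensor_id": event["sensor_id"], "prediction": prediction}


def threshold_model(sensor_values):
    """Hypothetical stand-in for a loaded model: flag very high readings."""
    return max(sensor_values) > 100.0


# Hypothetical input events standing in for records read from a topic.
events = [
    {"sensor_id": "s1", "sensor_values": [20.0, 35.5]},
    {"sensor_id": "s2", "sensor_values": [20.0, 180.0]},
]
predictions = list(score_stream(events, threshold_model))
```

Because the model lives inside the application, inference keeps working without a model server being available and scales together with the stream processing instances.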
Model inference in any Kafka Client App
Input Event
Prediction
Kafka Client
REST
Client
Model
doPrediction()
return value
RPC vs. Stream Processing for Model Serving
Why a Model Server and RPC
• Simple integration with existing technologies and organizational processes
• Easier to understand if you come from the non-streaming world
• Later migration to real streaming is also possible
• Model management built-in for different models, versioning and A/B testing
• Monitoring built-in
Why embedded into Streaming App
• Better latency: local inference instead of a remote call
• Offline inference (devices, edge processing, etc.)
• No coupling of the availability, scalability, and latency/throughput of your Kafka Streams application with the SLAs of the RPC interface
• No side-effects (e.g., in case of failure), all covered by Kafka processing (e.g., exactly once)
Model Deployment with Apache Kafka, KSQL and TensorFlow
CREATE STREAM AnomalyDetection AS
SELECT sensor_id, detectAnomaly(sensor_values)
FROM machine_engine;
User Defined Function (UDF)
Live Demo
Real Time Model Scoring with KSQL and TensorFlow
Agenda
• Real World Example - Supply Chain Optimization
Supply chain optimization flow (reference use case implemented with our partner Expero)
• Planners forecast long term schedule
• Production begins
• IoT data from production: inventories, manufacturing machines, yield metrics
• Production forecast
• Forecasted production - plan diffs
• Re-optimize plan based on actuals
• Change orders to supply chain: inventory, manufacturing schedules
• Change operational characteristics: plant 223 needs new Al extruder
• Customer delivery SLAs: actuals vs. plan
Legend: streaming analytics using Confluent; batch analytics using other frameworks; physical operations; UI
Components: Machine Sensors; PLC4X Connector; Kafka Connect (MQTT, File, HTTP); Kafka Cluster; KSQL; TensorFlow; TensorFlow Serving; Kafka Connect; Notebooks (Jupyter); Spark; Real Time Kafka App
Code and Demos for Kafka and Machine Learning
https://github.com/kaiwaehner
Key Takeaways
• The Apache Kafka ecosystem helps to do data engineering and production deployment at scale
• Jupyter allows debugging, prototyping and scalable, reliable data processing by combining tool sets
• Kafka and TensorFlow I/O enable streaming model training without extra data store
Kai Waehner
Technology Evangelist
contact@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
www.confluent.io
LinkedIn
Questions? Feedback?
Let’s connect!

More Related Content

What's hot

Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningKai Wähner
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
 
the-coming-perfect-storm-john-paul-jackson
the-coming-perfect-storm-john-paul-jacksonthe-coming-perfect-storm-john-paul-jackson
the-coming-perfect-storm-john-paul-jacksonKaturi Susmitha
 
Delta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDelta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDatabricks
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Databricks
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsTimothy Spann
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...HostedbyConfluent
 
The evolution of Apache Calcite and its Community
The evolution of Apache Calcite and its CommunityThe evolution of Apache Calcite and its Community
The evolution of Apache Calcite and its CommunityJulian Hyde
 
Streaming all over the world Real life use cases with Kafka Streams
Streaming all over the world  Real life use cases with Kafka StreamsStreaming all over the world  Real life use cases with Kafka Streams
Streaming all over the world Real life use cases with Kafka Streamsconfluent
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used forAljoscha Krettek
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperRahul Jain
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Timothy Spann
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 

What's hot (20)

Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep Learning
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
the-coming-perfect-storm-john-paul-jackson
the-coming-perfect-storm-john-paul-jacksonthe-coming-perfect-storm-john-paul-jackson
the-coming-perfect-storm-john-paul-jackson
 
Delta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDelta from a Data Engineer's Perspective
Delta from a Data Engineer's Perspective
 
Hadoop at Ebay
Hadoop at EbayHadoop at Ebay
Hadoop at Ebay
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 
The evolution of Apache Calcite and its Community
The evolution of Apache Calcite and its CommunityThe evolution of Apache Calcite and its Community
The evolution of Apache Calcite and its Community
 
Streaming all over the world Real life use cases with Kafka Streams
Streaming all over the world  Real life use cases with Kafka StreamsStreaming all over the world  Real life use cases with Kafka Streams
Streaming all over the world Real life use cases with Kafka Streams
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and Zookeeper
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Data Engineering Basics
Data Engineering BasicsData Engineering Basics
Data Engineering Basics
 

Similar to Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and KSQL

Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Kai Wähner
 
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...Codemotion
 
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...confluent
 
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Kai Wähner
 
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Kai Wähner
 
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesUnleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesKai Wähner
 
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Kai Wähner
 
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...Codemotion
 
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehnerNitin Kumar
 
Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka EcosystemMachine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka EcosystemKai Wähner
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Kai Wähner
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and KSQL

  • 1. Apache Kafka and Machine Learning – Kai Waehner. Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka, and KSQL. Kai Waehner, Technology Evangelist. contact@kai-waehner.de, LinkedIn, @KaiWaehner, www.confluent.io, www.kai-waehner.de
  • 2. 3 Talks at Oracle Code One 2019 in San Francisco
  • 3. Key Takeaways
      • The Apache Kafka ecosystem supports data engineering and production deployment at scale
      • Jupyter allows debugging, prototyping and scalable, reliable data processing by combining tool sets
      • Kafka and TensorFlow I/O enable streaming model training without an extra data store
  • 4. Agenda
      • Challenges of Machine Learning
      • Apache Kafka Ecosystem for Machine Learning
      • TensorFlow Ecosystem
      • Prototyping and Data Engineering with Python, Jupyter and KSQL
      • Streaming Model Training with Kafka and TensorFlow I/O
      • Production Deployment Alternatives and Trade-Offs
      • Real World Example - Supply Chain Optimization
  • 6. Analyze and act on critical business moments. Windows of opportunity (seconds, minutes, hours): Real Time Tracking, Predictive Maintenance, Fraud Detection, Cross Selling, Transportation Rerouting, Customer Service, Inventory Management
  • 7. Machine Learning (ML) ...allows computers to find hidden insights without being explicitly programmed where to look.
      • Machine Learning: Decision Trees, Naïve Bayes, Clustering, Neural Networks, etc.
      • Deep Learning: CNN, RNN, Autoencoder, etc.
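
Clustering, one of the techniques listed on slide 7, gives a small illustration of finding structure without being told where to look. The following is a minimal sketch of one-dimensional k-means in plain Python. The function name `kmeans_1d`, the sample values, and the deterministic initialisation are assumptions made for illustration and do not come from the talk.

```python
from statistics import mean
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int = 2, iterations: int = 10) -> Tuple[List[float], List[List[float]]]:
    """Group numbers into k clusters by repeatedly assigning each value
    to its nearest center and moving each center to the mean of its cluster.
    Assumes k >= 2 and at least k values."""
    ordered = sorted(values)
    # Deterministic start (an illustrative choice): spread centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters: List[List[float]] = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ended up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical measurements with two obvious groups; no labels are provided.
measurements = [1.0, 1.2, 0.9, 1.1, 9.8, 10.1, 10.3]
centers, clusters = kmeans_1d(measurements, k=2)
print(centers)
print(clusters)
```

The algorithm is never told which values belong together; the grouping emerges from the data alone. A production system would more likely use an established library than a hand-written loop.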
  • 8. Python == De Facto Standard for Machine Learning
  • 9. The First Analytic Models. How to deploy the models in production? …real-time processing? …at scale? …24/7 zero downtime?
  • 10. Hidden Technical Debt in Machine Learning Systems: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
  • 11. Impedance mismatch between model development and model deployment: https://www.slideshare.net/NickPentreath/productionizing-spark-ml-pipelines-with-the-portable-format-for-analytics-100788521
  • 12. Scalable, Technology-Agnostic Machine Learning Infrastructures. What is this thing used everywhere?
      • https://www.infoq.com/presentations/netflix-ml-meson
      • https://eng.uber.com/michelangelo
      • https://www.infoq.com/presentations/paypal-data-service-fraud
  • 14. Apache Kafka: The Rise of an Event Streaming Platform. Diagram labels: The Log, Connectors, Producer, Consumer, Streaming Engine
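
The log, producer, and consumer named on slide 14 can be illustrated with a toy model. The sketch below is an in-memory stand-in written in plain Python; it is not the Kafka client API, and the class names `Log` and `Consumer` as well as the sample sensor records are hypothetical. It only shows the core idea that producers append to a log while each consumer keeps its own read position.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Log:
    """Hypothetical in-memory stand-in for a single append-only log."""
    records: List[Any] = field(default_factory=list)

    def append(self, record: Any) -> int:
        """Producer side: append a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, offset: int, max_records: int = 10) -> List[Any]:
        """Consumer side: read from an offset; reading removes nothing."""
        return self.records[offset:offset + max_records]


@dataclass
class Consumer:
    """Each consumer tracks its own offset, so readers do not affect each other."""
    log: Log
    offset: int = 0

    def poll(self, max_records: int = 10) -> List[Any]:
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


log = Log()
for reading in (
    {"sensor": "a", "temp": 21.5},
    {"sensor": "b", "temp": 19.0},
    {"sensor": "a", "temp": 22.1},
):
    log.append(reading)

training = Consumer(log)  # e.g. a model-training job reading at its own pace
scoring = Consumer(log)   # e.g. a scoring job reading the same data independently

first_batch = training.poll(max_records=2)
all_records = scoring.poll()
rest = training.poll()
```

Because records stay in the log after being read, two independent jobs can consume the same data at different speeds, which is the property the rest of the talk relies on when it combines data engineering and model training on one platform.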
  • 15. Apache Kafka at Scale at Tech Giants: > 4.5 trillion messages / day, > 6 petabytes / day, “You name it”
      • * Kafka is not just used by tech giants
      • ** Kafka is not just used for big data
  • 16. Confluent - Business Value per Use Case
      • Column headings: Business Value; Key Drivers; Strategic Objectives (sample); Example Use Cases; Example Case Studies (of many)
      • Business value: Increase Revenue (make money); Decrease Costs (save money); Mitigate Risk (protect money)
      • Improve Customer Experience (CX); Core Business Platform; Increase Operational Efficiency; Migrate to Cloud
      • Fraud Detection; IoT sensor ingestion; Digital replatforming / Mainframe Offload; Customer 360; Faster transactional processing / analysis incl. Machine Learning / AI; Microservices Architecture; Online Fraud Detection; Online Security (syslog, log aggregation, Splunk replacement); Middleware replacement; Regulatory; Digital Transformation; Website / Core Operations (Central Nervous System); Real-time app updates
      • Connected Car: Navigation & improved in-car experience: Audi
      • Simplifying Omni-channel Retail at Scale: Target
      • Mainframe Offload: RBC
      • Application Modernization: Multiple Examples
      • The [Silicon Valley] Digital Natives; LinkedIn, Netflix, Uber, Yelp...
      • Predictive Maintenance: Audi
      • Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix
      • Real Time Streaming Platform for Communications and Beyond: Capital One
      • Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle
      • Detect Fraud & Prevent Fraud in Real Time: PayPal
      • Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple
  • 17. 17Apache Kafka and Machine Learning – Kai Waehner Apache Kafka’s Open Source Ecosystem as Infrastructure for ML
  • 18. 18Apache Kafka and Machine Learning – Kai Waehner Apache Kafka’s Open Ecosystem as Infrastructure for ML (diagram labels: Kafka Connect, REST Proxy, Schema Registry, Go/.NET/Python Kafka Producer, KSQL, Kafka Streams)
  • 19. 19Apache Kafka and Machine Learning – Kai Waehner Want to learn more about Apache Kafka + Machine Learning? Overview → www.kai-waehner.de • Blog Post: How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka https://www.confluent.io/blog/build-deploy-scalable-machine-learning-production-apache-kafka/ • Slide Deck: Apache Kafka + Machine Learning => Intelligent Real Time Applications https://www.slideshare.net/KaiWaehner/apache-kafka-streams-machine-learning-deep-learning • Slide Deck: Deep Learning at Extreme Scale (in the Cloud) with the Apache Kafka Open Source Ecosystem https://www.slideshare.net/KaiWaehner/deep-learning-at-extreme-scale-in-the-cloud-with-the-apache-kafka-open-source-ecosystem • Video Recording: Deep Learning in Mission Critical and Scalable Real Time Applications with Open Source Frameworks https://vimeo.com/jaxtv/review/256406763/7fbf4213be • Blog Post: Using Apache Kafka to Drive Cutting-Edge Machine Learning - Hybrid ML Architectures, AutoML, and more... https://www.confluent.io/blog/using-apache-kafka-drive-cutting-edge-machine-learning • Blog Post: Machine Learning with Python, Jupyter, KSQL and TensorFlow https://www.confluent.io/blog/machine-learning-with-python-jupyter-ksql-tensorflow
  • 21. 21Apache Kafka and Machine Learning – Kai Waehner TensorFlow TensorFlow is an open source software library for high performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. Originally developed by researchers and engineers from the Google Brain team within Google’s AI organization, it comes with strong support for machine learning and deep learning and the flexible numerical computation core is used across many other scientific domains. https://www.tensorflow.org/
  • 22. 22Apache Kafka and Machine Learning – Kai Waehner TensorFlow Ecosystem + large community + integration with most 3rd party ML tools + support by all major cloud providers
  • 23. 23Apache Kafka and Machine Learning – Kai Waehner TensorFlow Model • Serialization: Protocol Buffers (protobufs) • Generated classes in C, Python, Java, etc. that can load, save, and access the data • File Format: Human readable TextFormat (.pbtxt) vs. compressed Binary (.pb) • Graph object: Foundation of computation in TensorFlow • Weights: Held in separate checkpoint files • Standards: Support for ONNX, PMML Autoencoder for Anomaly Detection
  • 25. 25Apache Kafka and Machine Learning – Kai Waehner Jupyter https://jupyter.org/
  • 26. 26Apache Kafka and Machine Learning – Kai Waehner Prototyping with TensorFlow in a Jupyter Notebook
  • 27. 27Apache Kafka and Machine Learning – Kai Waehner Reliable Data Preprocessing at Scale. Preprocessing: filter, transform, anonymize, extract features. Data needs to be preprocessed at scale and in a reusable way! • Use KSQL to preprocess data at scale without coding • Use SQL statements for interactive analysis + deployment to production at scale • Leverage e.g. Python with the KSQL REST interface. Result: data ready for model training
  • 28. 28Apache Kafka and Machine Learning – Kai Waehner KSQL – A Streaming SQL Engine for Apache Kafka
  • 29. 29Apache Kafka and Machine Learning – Kai Waehner Preprocessing with KSQL SELECT car_id, event_id, c.car_model_id, sensor_input FROM car_sensor c LEFT JOIN car_models m ON c.car_model_id = m.car_model_id WHERE m.car_model_type = 'Audi_A8';
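The preprocessing statement above can also be submitted from Python through the KSQL REST interface. The following is a minimal sketch using only the standard library, not the code from the talk. The server address `http://localhost:8088` and the helper name `build_query_request` are assumptions for illustration; the `/query` endpoint and the `application/vnd.ksql.v1+json` content type follow the KSQL REST API as commonly documented, so check them against your server version.

```python
import json
import urllib.request

# Assumption: a KSQL server listening on its usual local address.
KSQL_URL = "http://localhost:8088"

# The preprocessing statement from the slide (car_model_id qualified with its alias).
STATEMENT = (
    "SELECT car_id, event_id, c.car_model_id, sensor_input "
    "FROM car_sensor c LEFT JOIN car_models m "
    "ON c.car_model_id = m.car_model_id "
    "WHERE m.car_model_type = 'Audi_A8';"
)


def build_query_request(statement, server=KSQL_URL):
    """Build (but do not send) a POST request for the KSQL /query endpoint."""
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=server + "/query",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


request = build_query_request(STATEMENT)
# To run it against a live server: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch runnable without a server; the ksql-python library listed later in the deck wraps the same REST interface.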
  • 30. 30Apache Kafka and Machine Learning – Kai Waehner Excursus: KSQL compared to Kafka Streams https://www.slideshare.net/KaiWaehner/kafka-streams-vs-ksql-for-stream-processing-on-top-of-apache-kafka-142127337
  • 31. 31Apache Kafka and Machine Learning – Kai Waehner Data Engineering with Python, KSQL, TensorFlow and Keras https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras/blob/master/python- jupyter-apache-kafka-ksql-tensorflow-keras.ipynb Pick and combine the tools you need and want to use! Some libraries used in this example: • Numpy • Pandas • TensorFlow • Keras • KSQL • ksql-python • sklearn • matplotlib
  • 32. 32Apache Kafka and Machine Learning – Kai Waehner Live Demo Rapid Prototyping and Data Preprocessing at Scale with Python, Jupyter and KSQL
  • 33. 33Apache Kafka and Machine Learning – Kai Waehner Data Engineering and Interactive Queries with Jupyter, Python and KSQL https://github.com/jupyter/jupyter/wiki/Jupyter-kernels https://github.com/takluyver/bash_kernel You can also use just the bash kernel and KSQL CLI:
  • 34. 34Apache Kafka and Machine Learning – Kai Waehner KSQL: Interactive Queries (aka Point-in-Time Queries) https://github.com/confluentinc/ksql/blob/master/design-proposals/klip-8-interactive-queries.md The query runs until completion and returns the final result as quickly as possible.
      ksql> CREATE TABLE orders AS SELECT customer_id, location_id, count(*) FROM orders_stream GROUP BY customer_id, location_id;
      ksql> SELECT customer_id, location_id FROM orders WHERE customer_id = 32235;
      +-------------+-------------+
      | customer_id | location_id |
      +-------------+-------------+
      | 32235       | 90          |
      +-------------+-------------+
      1 row in 0.003s
      ksql> SELECT count FROM orders WHERE customer_id = 1980;
      +-----------+
      | count     |
      +-----------+
      | 12        |
      +-----------+
      1 row in 0.002s
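The idea behind these point-in-time queries can be illustrated in plain Python: a stream is continuously aggregated into a table, and a lookup returns the table's current state and then finishes. This is a toy sketch only; the sample events and the helper names `materialize` and `lookup` are hypothetical and say nothing about how KSQL implements the feature.

```python
from collections import Counter

# Hypothetical sample events standing in for the orders_stream topic.
orders_stream = [
    {"customer_id": 32235, "location_id": 90},
    {"customer_id": 1980, "location_id": 12},
    {"customer_id": 1980, "location_id": 12},
]


def materialize(events):
    """Aggregate events like GROUP BY customer_id, location_id with count(*)."""
    table = Counter()
    for event in events:
        table[(event["customer_id"], event["location_id"])] += 1
    return table


def lookup(table, customer_id):
    """Point-in-time query: return the current rows for one customer, then finish."""
    return [
        {"customer_id": cid, "location_id": lid, "count": n}
        for (cid, lid), n in table.items()
        if cid == customer_id
    ]


orders = materialize(orders_stream)
rows = lookup(orders, 1980)
```

In contrast to a continuous streaming query, `lookup` terminates as soon as it has read the current state, which is the behavior the slide describes.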
  • 36. 36Apache Kafka and Machine Learning – Kai Waehner Data Ingestion into a Data Store Connect Preprocessed Data There isn’t just one ML solution. We need to be flexible!
  • 37. 37Apache Kafka and Machine Learning – Kai Waehner Kafka Connect • “Kafka Benefits Under the Hood” • Out-of-the-box connectivity • Data format conversion • Single message transformation (including error-handling) KafkaConnect KafkaConnect Data Source Data Sink REST API
  • 38. 38Apache Kafka and Machine Learning – Kai Waehner KSQL: Embedded Kafka Connect https://github.com/confluentinc/ksql/blob/master/design-proposals/klip-7-connect-integration.md CREATE SOURCE CONNECTOR reader WITH (source = 'confluent.jdbc.postgres', table = 'customers', …); CREATE SINK CONNECTOR writer WITH (sink = 'confluent.s3', bucket = 'vip_customers', …); CREATE STREAM postgres_customers (id integer, purchases integer) WITH (source = 'reader', ...); CREATE STREAM vip_customers WITH (sink = 'writer', ...) AS SELECT * FROM postgres_customers WHERE purchases > 10; Continuous streaming integration and preprocessing, reliably and at scale, just with SQL commands!
  • 39. 39Apache Kafka and Machine Learning – Kai Waehner Model Training using a Data Store Let’s build some models at extreme scale using TensorFlow and TPUs! Analytic Model
  • 40. 40Apache Kafka and Machine Learning – Kai Waehner Streaming Model Training without additional Data Store https://github.com/tensorflow/io/tree/master/tensorflow_io/kafka TensorFlow I/O Kafka Plugin • Native integration between Kafka and TensorFlow • KafkaDataSet and KafkaOutputSequence for TensorFlow • Written in C++ (linked with librdkafka) • Part of the graph in TensorFlow • Direct training and inference from streaming data • No data storage like S3 or HDFS needed
  • 41. 41Apache Kafka and Machine Learning – Kai Waehner Streaming Model Training with Kafka and TensorFlow I/O https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference Python Kafka Producer Python Kafka Consumer + Streaming Ingestion + Model Training
  • 42. 42Apache Kafka and Machine Learning – Kai Waehner Streaming Model Training with Kafka and TensorFlow I/O (diagram labels: Time, Producer, Distributed Commit Log, Model A, Model B, Another Real Time Consumer, Another Batch Consumer)
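The commit-log picture can be sketched in a few lines of Python: the log keeps the records, and every consumer tracks its own offset, so two models can train from the same data at different times without a separate data store. This is an in-memory toy under stated assumptions, not Kafka or TensorFlow I/O code; `CommitLog`, `Consumer`, and `train_running_mean` are hypothetical names, and the running mean merely stands in for a real incremental training step.

```python
class CommitLog:
    """Append-only log; a toy, in-memory stand-in for a single Kafka partition."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the appended record

    def read(self, offset, max_records):
        # Reading never removes records, so other consumers can read them later.
        return self._records[offset:offset + max_records]


class Consumer:
    """Tracks its own read position, independent of all other consumers."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self, max_records=2):
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


def train_running_mean(consumer):
    """Toy 'training': consume all available batches and keep a running mean."""
    count, mean = 0, 0.0
    while True:
        batch = consumer.poll()
        if not batch:
            return mean
        for value in batch:
            count += 1
            mean += (value - mean) / count


log = CommitLog()
for value in [1.0, 2.0, 3.0, 4.0]:
    log.append(value)

model_a = train_running_mean(Consumer(log))            # reads from the beginning
model_b = train_running_mean(Consumer(log, offset=2))  # starts later in the log
```

Because the log retains its records after they are read, Model B (or any other real-time or batch consumer) can start later or replay from an earlier offset without affecting Model A.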
  • 43. 43Apache Kafka and Machine Learning – Kai Waehner Model Example: Autoencoder for Anomaly Detection
  • 45. 45Apache Kafka and Machine Learning – Kai Waehner RPC communication to do model inference Streams Input Event Prediction Request Response Model Serving TensorFlow Serving gRPC / HTTP Application
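The request/response pattern with a model server can be sketched as follows. TensorFlow Serving exposes a REST endpoint of the form `/v1/models/<name>:predict` that accepts a JSON body with an `instances` list; the host, port 8501, the model name `anomaly_detection`, and the helper `build_predict_request` are assumptions for illustration, not values from the talk.

```python
import json
import urllib.request


def build_predict_request(instances, model_name, host="localhost", port=8501):
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One input event with three hypothetical sensor values.
request = build_predict_request([[0.1, 0.2, 0.3]], model_name="anomaly_detection")
# Against a live model server, urllib.request.urlopen(request) would return
# a JSON document with a "predictions" field.
```

Every input event costs one network round trip in this design, which is the latency and availability coupling discussed in the RPC vs. stream processing comparison.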
  • 46. 47Apache Kafka and Machine Learning – Kai Waehner Model inference natively embedded into the App Application Input Event Prediction
  • 47. 48Apache Kafka and Machine Learning – Kai Waehner Model inference in a Stream Processing App Streams Input Event Prediction Stream Processing Model doPrediction() return value
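Embedding the model in the stream processing application means the prediction is a local function call per event. The sketch below illustrates that shape in plain Python; `do_prediction` mirrors the `doPrediction()` label on the slide, but the weighted sum is only a toy stand-in for a loaded TensorFlow model, and the event fields are hypothetical.

```python
def do_prediction(event, weights, bias=0.0):
    """Toy stand-in for a loaded model: a weighted sum of the sensor values."""
    return sum(w * x for w, x in zip(weights, event["sensor_values"])) + bias


def process(events, weights):
    """Apply the model to each input event and yield one prediction event."""
    for event in events:
        yield {
            "sensor_id": event["sensor_id"],
            "prediction": do_prediction(event, weights),
        }


# Hypothetical input events, as they might arrive from an input topic.
input_events = [
    {"sensor_id": "a", "sensor_values": [1.0, 2.0]},
    {"sensor_id": "b", "sensor_values": [2.0, 4.0]},
]
predictions = list(process(input_events, weights=[0.5, 0.25]))
```

No network hop is involved: the model lives in the same process as the stream processor, so inference scales and fails together with the application.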
  • 48. 49Apache Kafka and Machine Learning – Kai Waehner Model inference in any Kafka Client App Input Event Prediction Kafka Client REST Client Model doPrediction() return value
  • 49. 52Apache Kafka and Machine Learning – Kai Waehner RPC vs. Stream Processing for Model Serving Why a Model Server and RPC • Simple integration with existing technologies and organizational processes • Easier to understand if you come from the non-streaming world • Later migration to real streaming is also possible • Model management built-in for different models, versioning and A/B testing • Monitoring built-in Why embedded into Streaming App • Better latency, as inference is local instead of a remote call • Offline inference (devices, edge processing, etc.) • No coupling of the availability, scalability, and latency/throughput of your Kafka Streams application with the SLAs of the RPC interface • No side-effects (e.g., in case of failure), all covered by Kafka processing (e.g., exactly once) Application Input Event Prediction
  • 50. 54Apache Kafka and Machine Learning – Kai Waehner Model Deployment with Apache Kafka, KSQL and TensorFlow “CREATE STREAM AnomalyDetection AS SELECT sensor_id, detectAnomaly(sensor_values) FROM machine_engine;” User Defined Function (UDF)
  • 51. 55Apache Kafka and Machine Learning – Kai Waehner Live Demo Real Time Model Scoring with KSQL and TensorFlow
  • 53. Supply chain optimization flow (reference use case implemented with our partner Expero): Planners forecast long-term schedule; Production begins; IoT data from production: inventories, manufacturing machines, yield metrics; Production forecast; Forecasted production - plan diffs; Re-optimize plan based on actuals; Change orders to supply chain: inventory, manufacturing schedules; Change operational characteristics: plant 223 needs new Al extruder; Customer delivery SLAs: actuals vs. plan. Legend: Streaming analytics using Confluent; Batch analytics using other frameworks; Physical operations; UIs
  • 54. Supply chain optimization architecture (reference use case implemented with our partner Expero): Planners forecast long-term schedule; Production begins; IoT data from production: inventories, manufacturing machines, yield metrics; Production forecast; Forecasted production - plan diffs; Re-optimize plan based on actuals; Change orders to supply chain: inventory, manufacturing schedules; Change operational characteristics: plant 223 needs new Al extruder; Customer delivery SLAs: actuals vs. plan. Components: Machine Sensors, PLC4X Connector, Kafka Connect, MQTT, File, HTTP, Kafka Cluster, KSQL, TensorFlow, Notebooks (Jupyter), Spark, Real Time Kafka App, TensorFlow Serving. Legend: Streaming analytics using Confluent; Batch analytics using other frameworks; Physical operations; UIs
  • 55. 62Apache Kafka and Machine Learning – Kai Waehner Code and Demos for Kafka and Machine Learning https://github.com/kaiwaehner
  • 56. 63Apache Kafka and Machine Learning – Kai Waehner Key Takeaways • The Apache Kafka ecosystem helps to do data engineering and production deployment at scale • Jupyter allows debugging, prototyping and scalable, reliable data processing by combining tool sets • Kafka and TensorFlow I/O enable streaming model training without extra data store
  • 57. 64Apache Kafka and Machine Learning – Kai Waehner Kai Waehner Technology Evangelist contact@kai-waehner.de @KaiWaehner www.kai-waehner.de www.confluent.io LinkedIn Questions? Feedback? Let’s connect!