Kai Wähner, Technology Evangelist at Confluent: "Development of Scalable Machine Learning Microservices with Apache Kafka Streams and H2O.ai"
Kai Wähner works as Technology Evangelist at Confluent. Kai’s main areas of expertise are Big Data Analytics, Machine Learning, Integration, Microservices, Internet of Things, Stream Processing and Blockchain. He is a regular speaker at international conferences such as JavaOne, O’Reilly Software Architecture and ApacheCon, writes articles for professional journals, and shares his experiences with new technologies on his blog (www.kai-waehner.de/blog). Contact and references: kontakt@kai-waehner.de / @KaiWaehner / www.kai-waehner.de
Abstract:
Big Data and Machine Learning are key for innovation in many industries today. The first part of this session explains how to build analytic models with R, Python or Scala using open source machine learning / deep learning frameworks like Apache Spark, TensorFlow or H2O.ai. The second part discusses how to deploy these analytic models to your own applications or microservices using an Apache Kafka cluster and Kafka Streams. The session focuses on live demos and shares lessons learned for executing analytic models in a highly scalable and performant way.
1. Apache Kafka + H2O.ai
Machine Learning Applied to Real Time Stream Processing
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.kai-waehner.de
2. Apache Kafka and Machine Learning
Agenda
1) Machine Learning and Real Time Applications
2) Building an Analytic Model with H2O.ai
3) Applying an Analytic Model with Apache Kafka
3. Agenda
1) Machine Learning and Real Time Applications
2) Building an Analytic Model with H2O.ai
3) Applying an Analytic Model with Apache Kafka
4. Machine Learning
... allows computers to find hidden insights without being explicitly programmed where to look.
5. Real World Examples of Machine Learning
Spam Detection
Search Results + Product Recommendation
Picture Detection (Friends, Locations, Products)
Your Company
The Next Disruption:
Google Beats Go Champion
6. Leverage Machine Learning to Analyze and Act on Critical Business Moments
Windows of Opportunity (time scale: Seconds, Minutes, Hours)
Price Optimization
Predictive Maintenance
Fraud Detection
Cross Selling
Transportation Rerouting
Customer Service
Inventory Management
7. Big Data Analytics for Actionable Insights
From Insight to Action
(continuous loop)
8. [Architecture diagram: Streaming Platform and Big Data Analytics]
1) Data Producer
2) Analytics Platform
3) Streaming Platform
4) Data Consumer
Components shown: Database, IoT Device, Streaming Producer, DWH, Data Integration, Data Lake, Model Building, Batch, Real Time, Connect, Stream Processing, Model, REST Interface, Messaging, Schema Registry / Governance, Mobile App, Streaming Consumer, BI Tool, Web Application.
9. Agenda
1) Machine Learning and Real Time Applications
2) Building an Analytic Model with H2O.ai
3) Applying an Analytic Model with Apache Kafka
10. [Architecture diagram: Streaming Platform and Big Data Analytics, with concrete technologies]
1) Data Producer
2) Analytics Platform
3) Streaming Platform
4) Data Consumer
Technologies shown: Oracle DB, CoAP IoT, Kafka Java Client, HP Vertica, Data Integration, Flume, Hadoop, Hive, H2O.ai, TensorFlow, Batch, Real Time, Kafka, Kafka Connect, Confluent REST Proxy, Confluent Schema Registry, Avro, Kafka Streams + H2O.ai + Mesos, Kafka Streams + TensorFlow + Kubernetes, MQTT IoT, iPhone App, Kafka Go Client, Grafana, Java EE Web App.
11. Languages, Frameworks and Tools
Many more …
Portable Format for Analytics (PFA)
12. Live Demo
Use Case: Airline Flight Delay Prediction
Machine Learning Algorithm: Gradient Boosted Machines (GBM) using Decision Trees
Technology: H2O.ai
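To make the algorithm named on this slide more concrete, here is a minimal sketch of gradient boosting with one-split decision trees (stumps) in plain Python. It is not the H2O.ai implementation and does not use the demo's airline dataset: the toy rows, the feature choice (departure hour, distance) and all function names are assumptions made for illustration, and the squared-error loss is a simplification of what a production GBM would use for a delayed / not delayed label.

```python
# Minimal gradient boosting with decision stumps (squared-error loss).
# Toy data and all names are hypothetical; this only illustrates the idea
# that each new tree is fitted to the residuals of the current ensemble.

def fit_stump(xs, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(xs[0])):
        for threshold in sorted({row[j] for row in xs}):
            left = [r for row, r in zip(xs, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(xs, residuals) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    if best is None:  # no valid split exists: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_gbm(xs, ys, n_trees=20, learning_rate=0.5):
    """Start from the mean label, then add stumps fitted to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, row)
                       for p, row in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict_gbm(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Hypothetical flights: (departure_hour, distance_km); label 1 = delayed.
flights = [(6, 300), (7, 1200), (9, 500), (13, 800),
           (17, 400), (18, 1500), (20, 700), (22, 900)]
delayed = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_gbm(flights, delayed)
evening_score = predict_gbm(model, (19, 600))
morning_score = predict_gbm(model, (8, 600))
```

A score above 0.5 can be read as "delayed" in this toy setup. The point of the sketch is the loop in `fit_gbm`: every additional tree corrects what the previous trees still get wrong.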
13. Agenda
1) Machine Learning and Real Time Applications
2) Building an Analytic Model with H2O.ai
3) Applying an Analytic Model with Apache Kafka
14. [Architecture diagram: Streaming Platform and Big Data Analytics, with concrete technologies; same diagram as slide 10]
15. When to use Kafka Streams for Stream Processing?
• No need for a Big Data cluster
• Deploy in your existing infrastructure
• Kafka manages scalability / fail-over
• Focus on development of business logic in your department
16. Live Demo with Open Source Technologies
Use Case: Airline Flight Delay Prediction
Machine Learning Algorithm: Any! (in our example, H2O.ai GBM)
Streaming Platform: Apache Kafka Core, Kafka’s Streams API
17. H2O.ai Model + Kafka Streams
Filter
Map
1) Create H2O ML model
2) Configure Kafka Streams Application
3) Apply H2O ML model to Streaming Data
4) Start Kafka Streams App
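The filter and map steps from this slide can be sketched without a running broker. The following plain-Python stand-in is not the demo code: the record layout (`carrier,departure_hour,distance_km`), the `score_flight` rule and every function name are assumptions, and `score_flight` is only a placeholder for the predict call of a model built in step 1. In a Kafka Streams application, the same two steps would typically be expressed with the `filter` and `mapValues` operations on a `KStream`; steps 2 and 4 (configuring and starting the application) are left out here.

```python
# Stand-in for a stream-processing topology: filter records, then map each
# valid record to a prediction. All names and the record layout are hypothetical.

def score_flight(flight):
    """Placeholder for a trained model's predict call (made-up rule)."""
    return "YES" if flight["departure_hour"] >= 17 else "NO"


def parse(line):
    """Parse 'carrier,departure_hour,distance_km'; return None if malformed."""
    parts = line.split(",")
    if len(parts) != 3:
        return None
    carrier, hour, distance = parts
    try:
        return {"carrier": carrier,
                "departure_hour": int(hour),
                "distance_km": int(distance)}
    except ValueError:
        return None


def apply_model(lines):
    """Lazily process an incoming sequence of raw records."""
    parsed = (parse(line) for line in lines)
    # Filter: drop records that could not be parsed.
    valid = (flight for flight in parsed if flight is not None)
    # Map: replace each flight with the model's prediction for it.
    return ({"carrier": f["carrier"], "delayed": score_flight(f)} for f in valid)


incoming = ["AA,18,1200", "garbage", "DL,7,400"]
results = list(apply_model(incoming))
```

Because the model is applied inside the map step, it runs in the same process as the stream processor, with no remote call per record.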
18. Online Model Training
How to improve models?
1. Manual Update
2. Automated Batch
3. Real Time
→ Apache Kafka for Messaging and Real Time Apps
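As an illustration of the third option (real time), the sketch below updates a model once per incoming record instead of retraining on a batch. The technique, online logistic regression with one stochastic-gradient step per event, is chosen here for illustration only and is not taken from the talk; the class name, the single feature and the synthetic event stream are all assumptions.

```python
import math

# Online (per-record) training: the model is updated as each labeled event
# arrives. Hypothetical example; not the method shown in the session.

class OnlineLogisticRegression:
    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        """One stochastic-gradient step on the log-loss for a single record."""
        error = self.predict_proba(features) - label
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error


# Synthetic event stream: feature is 1.0 for an evening departure, else 0.0;
# label 1 means the flight was delayed.
events = [([1.0], 1), ([0.0], 0)] * 200

online_model = OnlineLogisticRegression(n_features=1)
for features, label in events:
    online_model.update(features, label)

evening_probability = online_model.predict_proba([1.0])
morning_probability = online_model.predict_proba([0.0])
```

Every call to `update` changes the model that serves the next prediction, which is why a setup like this needs validation before it is used in production.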
19. Caveats for Online Model Training
• Processes and infrastructure not ready
• Validation needed before production
• Slows down the system
• Only a few ML implementations supported
• Many use cases do not need it
20. Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
LinkedIn
Questions? Feedback?
Please contact me!