Kai Waehner [Confluent] | Real-Time Streaming Analytics with 100,000 Cars Using MQTT, Kafka and InfluxDB 2.0 on Kubernetes | InfluxDays Virtual Experience London 2020

IoT Architectures for a Digital Twin
with Apache Kafka and InfluxDB
A Digital Replica of Things - Open, Scalable and Reliable
Kai Waehner
Technology Evangelist
contact@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de

IoT and Digital Twin with Apache Kafka and InfluxDB – @KaiWaehner - www.kai-waehner.de
Agenda
• Digital Twin - Merging the Physical and the Digital World
• Real World Challenges
• Apache Kafka as Event Streaming Solution for IoT
• IoT Platforms
• Spoilt for Choice for a Digital Twin
• IoT Architectures with Kafka and InfluxDB
• A Digital Twin for 100000 Connected Cars

Software and Digital Services become the Key Differentiator
https://www.mckinsey.com/industries/advanced-electronics/our-insights/iiot-platforms-the-technology-stack-as-value-driver-in-industrial-equipment-and-machinery

Digital Twin – Merging the Physical and the Digital World
• Downtime reduction
• Inventory management
• Fleet management
• What-if simulations
• Operational planning
• Servitization
• Product development
• Healthcare
• Customer experience
“Virtual representation of something else (Physical thing, process, service)”
“A living model that drives a business outcome”
https://www.youtube.com/watch?v=Ri0TD7kYsIQ

Smart Infrastructure:
Digital Solutions for Entire Building Lifecycle
https://new.siemens.com/global/en/products/buildings/digitalization/digital-building-lifecycle.html

Connected Car Infrastructure
https://www.youtube.com/watch?v=yGLKi3TMJv8

Twinning the Human Body to Enhance Medical Care
https://www.challenge.org/insights/digital-twin-in-healthcare/
https://youtu.be/H6JzPCbyVSM

Digital Twin and Artificial Intelligence (AI) / Machine Learning
• Complementary Concepts
• Continuous Learning, Monitoring and Acting
• (Good) Data is key for success
https://towardsdatascience.com/understanding-feature-engineering-part-1-continuous-numeric-data-da4e47099a7b

History of Automation Industry vs. Big Data and Cloud
https://foss-backstage.de/sites/foss-backstage.de/files/2018-07/Revolutionizing%20Industrial%20IoT%20with%20Apache%20PLC4X.pdf

Trends: Evolution of Convergence between IT and Industrial Automation
https://iot-analytics.com/5-industrial-connectivity-trends-driving-the-it-ot-convergence

Complexity, Cost and Scalability are Main Blockers

Huge demand to build an open, flexible, scalable platform
• Real time
• Scalability
• High availability
• Decoupling
• Cost reduction
• Flexibility
• Standards-based
• Extendibility
• Security
• Infrastructure-independent
• Multi-region / global

Apache Kafka - The Rise of an Event Streaming Platform
Building blocks: The Log, Producer, Consumer, Connectors, Streaming Engine
Event Streaming Platform = Messaging + Storage + Integration + Processing

Decoupling of Producers and Consumers
[Figure: producer P writing to the log along a time axis; consumers C1, C2, C3 reading from it]
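The decoupling on this slide can be illustrated with a minimal in-memory sketch. It assumes a single append-only log and consumers that each track their own read position; the `Log` and `Consumer` classes and the event fields are hypothetical and are not the Kafka client API.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    """Append-only list of events, standing in for one topic partition."""
    events: list = field(default_factory=list)

    def append(self, event):
        """Producer side: add an event and return its offset."""
        self.events.append(event)
        return len(self.events) - 1


@dataclass
class Consumer:
    """Reads from a log at its own pace by tracking a private offset."""
    name: str
    offset: int = 0

    def poll(self, log, max_events):
        """Return up to max_events unread events and advance the offset."""
        batch = log.events[self.offset:self.offset + max_events]
        self.offset += len(batch)
        return batch


# The producer writes without knowing who reads, or when.
log = Log()
for speed in (50, 62, 71, 80):
    log.append({"car": "car-1", "speed": speed})

c1, c2, c3 = Consumer("C1"), Consumer("C2"), Consumer("C3")
fast = c1.poll(log, max_events=10)  # a real-time consumer reads everything
slow = c2.poll(log, max_events=1)   # a slow consumer reads one event for now
late = c3.poll(log, max_events=10)  # a late consumer still starts at offset 0
```

Because reading does not remove events from the log, a slow or late consumer never blocks the producer or the other consumers.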

Apache Kafka at Scale at Tech Giants
> 7 trillion messages / day > 6 Petabytes / day
“You name it”
* Kafka is not just used by tech giants
** Kafka is not just used for big data

Confluent - Business Value per Use Case
Business Value:
• Improve Customer Experience (CX)
• Increase Revenue (make money) $↑
• Decrease Costs (save money) $↓
• Mitigate Risk (protect money) $↔
Key Drivers / Strategic Objectives (sample):
• Core Business Platform
• Increase Operational Efficiency
• Migrate to Cloud
Example Use Cases:
• Fraud Detection
• IoT sensor ingestion
• Digital replatforming / Mainframe Offload
• Customer 360
• Faster transactional processing / analysis incl. Machine Learning / AI
• Microservices Architecture
• Online Fraud Detection
• Online Security (syslog, log aggregation, Splunk replacement)
• Middleware replacement
• Regulatory
• Digital Transformation
• Website / Core Operations (Central Nervous System)
• Real-time app updates
Example Case Studies (of many):
• Connected Car: Navigation & improved in-car experience: Audi
• Simplifying Omni-channel Retail at Scale: Target
• Mainframe Offload: RBC
• Application Modernization: Multiple Examples
• The [Silicon Valley] Digital Natives; LinkedIn, Netflix, Uber, Yelp...
• Predictive Maintenance: Audi
• Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix
• Real Time Streaming Platform for Communications and Beyond: Capital One
• Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle
• Detect Fraud & Prevent Fraud in Real Time: PayPal
• Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple

10 Reasons for Event Streaming with Apache Kafka
Real Time
Scalable
Cost Reduction
24/7 – Zero downtime, zero data loss
Decoupling – Storage, Domain-driven Design
Data (re-)processing and stateful client applications
Integration – Connectivity to IoT, legacy, big data, everything
Hybrid Architecture – On Premises, multi cloud, edge computing
Fully managed cloud
No vendor lock-in

Digital Twin and AI / Machine Learning (with Kafka)
• Complementary Concepts
• Continuous Learning, Monitoring and Acting → Real time, scalable
• (Good) data is key for success → Integration, data processing

Hold on…
Kafka is NOT
an IoT Platform!

Device management
Unreliable networks
Connectivity beyond standards
Edge hardware
…

600+ IoT Platforms
https://iot-analytics.com/iot-platform-companies-landscape-2020/

Proprietary IoT Platforms

IoT Offerings from Cloud Providers

Standards-based / Open Source IoT Platforms

Characteristics of Digital Twin Technology
• Connectivity
    • Physical assets, enterprise software, customers
    • Bidirectional communication
• Homogenization
    • Decoupling and standardization
    • Virtualization of information
    • Shared with multiple agents
    • Lower cost
• Reprogrammable and smart
    • Adjust and improve characteristics
• Digital traces
    • Diagnose problems
• Modularity
    • Tweak modules of models and machines

Scenario 1: Digital Twin Monolith
Machine protocols: Siemens S7, Modbus, Allen Bradley, Beckhoff ADS
Components: IoT Platform, Digital Twin, Device Mgt., Analytics
Characteristics: Connectivity, Homogenization, Reprogrammable and smart, Digital traces, Modularity

Scenario 2: Digital Twin as External Database
Components: IoT Platform, Device Mgt., Analytics, Digital Twin, InfluxDB
Characteristics: Connectivity, Homogenization, Digital traces, Modularity

Scenario 3: Kafka as Backbone for the Digital Twin and the Rest of the Enterprise
Components: IoT Platform, Apache Kafka, Kafka Connect, Digital Twin, InfluxDB, Real Time App, Batch App, Request Response App
Characteristics: Connectivity, Homogenization, Digital traces, Modularity

Scenario 4: Kafka as IoT Platform
Components: Apache Kafka (Storage, Processing), Kafka Connect, Digital Twin, InfluxDB, Real Time App, Batch App, Request Response App
Characteristics: Connectivity, Homogenization, Digital traces, Modularity

Building a Digital Twin with Kafka and InfluxDB
Apache Kafka
• Integration
• Decoupling and Backpressure
• Data Processing
• Ingest into InfluxDB
• Consume from InfluxDB
• Consumption by other Applications
InfluxDB
• Storage
• Batch and Real Time Analytics
• Dashboards
⇒ Open
⇒ Scalable
⇒ Mission-critical
[Diagram components: Sensors, Applications, Databases, Message Queues, Stream Processing (Kafka Streams / ksqlDB), Batch Analytics, Data Lake]
Edge Digital Twin
Single Broker
(or Cluster)
Digital Twin
Self-managed or
certified OEM Hardware
Kafka
Cluster
in DC /
Cloud
Replicator

Centralized Digital Twin
Single
Kafka Broker
(or Cluster)
Self-managed or
Single
Kafka Broker
(or Cluster)
Self-managed or

Global Digital Twin Architecture
Multiple Clusters and Aggregation
Factories → Analytics Cluster
Multi-Region Cluster
High Availability (Disaster Recovery)
Global Data Streaming
Outsourced
Development

A Digital Twin with Kafka, TensorFlow and InfluxDB
[Architecture diagram]
Legend: Kafka Ecosystem, TensorFlow, InfluxDB, Other Components
Ingestion: Car Sensors, MQTT Proxy, Kafka Cluster, Kafka Connect
Stream processing: KSQL, Kafka Streams (Java)
Machine learning: TensorFlow, Python (Train Analytic Model, Deploy Analytic Model)
Storage and analytics: InfluxDB Storage, InfluxDB Dashboards + Analytics
Consumers: Mobile App, BI Tool
Data flow labels: Ingest Data, Consume Data, Preprocess Data, All Data, Critical Data, Potential Detect
Characteristics: Connectivity, Homogenization, Digital traces, Modularity
https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
Architecture for 100000 Connected Cars
Kafka + KSQL + MQTT + TensorFlow + Kubernetes
https://www.kai-waehner.de/blog/2019/11/08/live-demo-iot-100-000-connected-cars-kubernetes-kafka-mqtt-tensorflow/

Kafka Connect Connector for InfluxDB
https://www.confluent.io/hub/confluentinc/kafka-connect-influxdb

Key Takeaways
• A Digital Twin merges the physical and the digital world
• Apache Kafka + InfluxDB enable an open, scalable and reliable infrastructure for a Digital Twin
• Event Streaming complements IoT platforms and other backend applications / databases.

Kai Waehner
Technology Evangelist
contact@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
www.confluent.io
LinkedIn
Questions? Feedback?
Let’s connect!
