Our talk will explore the transformative impact of integrating Confluent, HiveMQ, and SparkPlug in Industry 4.0, emphasizing the creation of a Unified Namespace.
Beyond the Unified Namespace, the webinar delves into stream governance and scaling, both crucial for managing complex data flows and building robust, scalable IIoT platforms.
You will learn how to ensure data accuracy and reliability, expand your data processing capabilities, and optimize your data management processes.
Don't miss out on this opportunity to learn from industry experts and take your business to the next level.
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and SparkPlug
3. Copyright 2021, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.
IoT meets Confluent meets Data Platform
[Diagram: edge protocols and bridges into Confluent — MQTT broker, OPC UA, gRPC proxy]
6. Goal
Partner Tech Talks are webinars where subject matter experts from a partner talk about a specific use case or project. The goal of Tech Talks is to provide best practices and application insights, along with inspiration, and to help you stay up to date about innovations in the Confluent ecosystem.
9. Business and technical challenges

Business challenges:
− Limited scalability: messaging brokers require manual horizontal scaling via VMs.
− Downtime and support tickets, with operational disruption.
− Data schema incompatibility breaking existing functionality or causing data loss.

Technical challenges:
− Unprecedented data volume: collecting data from 100,000s of IoT devices can be challenging to store and process.
− Delayed time to insight: high latency in batch data processing hinders your organization's ability to react by hours or more.
− Data variety: data from text, audio, and video is challenging to analyze.
− Data quality issues: noisy and incomplete data in inconsistent formats affects the accuracy of your analysis.
10. Why Confluent
− Stream data everywhere, from IoT devices via MQTT, on premises and in every major public cloud.
− Connect IoT sensors, objects, devices, and other systems with pre-built, fully managed connectors to build streaming data pipelines.
− Process data streams in flight to create live, refined, ready-to-use IoT data products.
− Govern data to ensure quality, security, and compliance while enabling teams to discover and leverage existing data products.

Business impact:
− Create new revenue streams for your business (e.g., route optimization modules for your customers to save fuel costs and optimize driver hours).
− Unlock real-time analytics for new use cases such as predictive maintenance.
− Improve your platform reliability and stability with Confluent's 99.99% uptime SLA.
− Seamlessly scale from 0.5 MBps to 50 MBps in a matter of minutes.

INDUSTRY: ALL
11. MQTT: the natural candidate
➢ MQTT is lightweight and designed for edge-device connectivity
○ Poor connectivity / high-latency networks
➢ MQTT can handle many thousands of connections with filtered distribution of data to consumers (esp. devices)
➢ Many enterprise and open-source MQTT broker implementations
○ Mosquitto, RabbitMQ, HiveMQ, VerneMQ
➢ MQTT is becoming a de facto standard in the (I)IoT space
○ Both edge & cloud
➢ Many client libraries
○ C, C++, Java, C#, Python, JavaScript, WebSockets, Arduino …
12. … But MQTT has caveats and is not enough
MQTT is designed for safe message delivery, not for stream processing. Once a message is delivered, it is not retained. If processing crashes after delivery, messages are lost and cannot be re-processed, corrupting business outputs.
Real-time processing of your manufacturing data requires stream-processing infrastructure: Apache Kafka.
Recommended read: https://www.umh.app/post/tools-techniques-for-scalable-data-processing-in-industrial-iot
[Diagram: IoT data ingestion at the edge (MQTT broker, IoT gateway, other OT protocols, custom integrations) feeding IoT data aggregation & processing — edge integration on one side, data ingestion & processing on the other]
13. Connect - Broker to Kafka
Complements Kafka's concurrent connection limits
A dedicated broker handles the large number of simultaneous connections that Kafka is not designed for. All of these brokers provide Kafka or Confluent connectivity and can be connected seamlessly.

Broker selection according to connection needs
Brokers can be selected according to the number of connected devices, connection type, traffic volume, and client requirements.
14. MQTT Confluent Connectors
[Diagram: a Kafka Connect cluster between the MQTT broker and the Kafka cluster — the MQTT source connector subscribes to and consumes from the broker and publishes to Kafka; the MQTT sink connector subscribes to and consumes from Kafka and publishes to the broker]
➢ Relies on a 3rd-party MQTT broker
○ An example of integration with HiveMQ: https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference/tree/master/infrastructure/terraform-gcp
➢ Handles both communication paths
➢ Available on confluent.io/hub
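As a sketch, the source half of this pairing is configured with a handful of properties. The dict below mirrors the documented Confluent MQTT Source connector options, expressed as the JSON body you would POST to the Connect REST API; the broker URI, topic filter, and Kafka topic name are placeholder assumptions.

```python
# Hypothetical Kafka Connect MQTT Source configuration. Broker URI,
# MQTT topic filter, and target Kafka topic are illustrative placeholders.
import json

mqtt_source = {
    "name": "mqtt-source-plant1",
    "config": {
        "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
        "tasks.max": "1",
        "mqtt.server.uri": "tcp://hivemq.example.com:1883",
        "mqtt.topics": "plant1/#",        # MQTT topic filter to subscribe to
        "kafka.topic": "iot.plant1.raw",  # Kafka topic the records land in
        "confluent.topic.bootstrap.servers": "kafka:9092",
    },
}

print(json.dumps(mqtt_source, indent=2))
```

The sink direction is configured symmetrically with the sink connector class, consuming from a Kafka topic and publishing to the broker.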
15. MQTT Proxy
[Diagram: devices and gateways connect to the MQTT Proxy, which forwards to the Kafka broker]
➢ MQTT Proxy exposes the MQTT protocol and translates it into the Kafka protocol
➢ Removes the need to deploy and manage 3rd-party MQTT brokers
➢ Different solutions
○ Confluent MQTT Proxy (only handles device-to-Kafka data flows)
○ Technology partners
17. About me
ALEXANDER KEIDEL
Head of Product & Alliance
Working in the fields of Business Intelligence/Big Data and IoT/Smart City.
Main focus: design and implementation of heterogeneous software architectures with open-source components such as Kafka, Kong, Pentaho, and ThingsBoard.
18. Content
− Intro
− The problem the Unified Namespace solves
− Implementation with HiveMQ
− UNS: Confluent + HiveMQ, better together
− Recap
20. Focus on OT
− Currently the most challenging area (tons of data silos)
− Protocols built for OT
− Legacy hardware and limitations
− Average depreciation period for a machine is around 15 years
− Dominated by OPC-UA
21. The Shopfloor in a Nutshell
− A machine (e.g., a robot) is interconnected via a real-time protocol to a PLC that runs the production routines
− Data is captured by an MES and/or SCADA for ops control
[Diagram: machine — OT protocol (e.g., Profinet) — PLC — SCADA / MES]
22. Low-grain data is key to AI
Automation pyramid, with typical data granularity per level:
− Level 4 - ERP (days)
− Level 3 - Ops Control (MES) (hours)
− Level 2 - Process Control (SCADA) (min)
− Level 1 - Control (PLC) (sec/ms)
− Level 0 - Field Level (sensors, actuators) (µs/ms)
23. OPC-UA and the problem it solves
• Unified access for different machine vendors and types (otherwise, the MES has to implement each vendor-specific protocol)
• Addresses/variables of the PLC program are displayed in a hierarchical namespace
[Diagram: Siemens, Beckhoff, and Maxwell machines expose their OT protocols to an OPC-UA server, which the MES reaches through an OPC-UA client]
24. Problems of OPC-UA (my top 3)
1. Complexity
OPC UA offers a wide range of features including complex data models, security mechanisms, and interoperability capabilities. While these features are beneficial, they can also make the implementation and configuration of OPC UA more complex compared to simpler protocols.
2. Scalability
Although OPC UA is designed to be scalable, managing a large number of OPC UA servers and clients in a vast IIoT network can become challenging. The protocol's sophisticated features might require more resources, making it harder to scale up efficiently in large deployments without significant resource allocation and careful planning.
3. Security concerns
While OPC UA includes robust security features, the configuration and maintenance of these security measures can be complex. Incorrectly configured security settings might leave systems vulnerable. Additionally, as with any networked technology, the larger attack surface of IIoT systems might expose OPC UA implementations to cyber threats if not properly secured.
25. MQTT in a Nutshell
− MQTT's first version was developed in 1999 for monitoring oil pipelines, initially designed for limited bandwidth over radio and satellite networks
− Implements the publish/subscribe pattern
− Variable payload
− TCP-based (MQTT-SN for UDP)
− TLS supported
− Scalable to millions of connections
27. Vanilla MQTT – Problems (Top 5)
1. MQTT allows any payload (no enforced structure).
2. MQTT lets the publisher set the QoS (the Quality of Service level should be decided by the subscriber).
3. MQTT lets the publisher set retained messages (which creates load on the broker).
4. MQTT lets the publisher register LWT (Last Will and Testament) messages (which pertains to business logic).
5. MQTT makes no suggestions regarding namespaces (governance) (e.g., /plant/sensorA, /dev/SensorB…SensorC/temp/C).
28. Sparkplug B
• 2015 Introduction of Sparkplug: Cirrus Link Solutions introduces Sparkplug, a specification designed to enhance MQTT with a standard data format.
• 2017 Eclipse Tahu project: Sparkplug B is contributed to the Eclipse Foundation under the Tahu project to promote community-driven development and standardization.
• 2017 Adoption and iteration: adoption by industries starts to take place, with iterations made to the Sparkplug B specification based on real-world use cases and feedback.
• 2019–2020: Sparkplug 2.2 is released; as Industry 4.0 gains momentum, Sparkplug B sees increased adoption as a key enabler for interoperability in IoT platforms.
• 2022–ongoing: Sparkplug 3.0.0 is released, containing various improvements and superseding 2.2.
29. Sparkplug B
1. Uniform data structure: Sparkplug B defines a uniform data structure that ensures data is transmitted in a standardized way in an MQTT-based network.
2. Uniform namespaces: Sparkplug B defines a standardized method for managing IoT endpoints, including the transmission of device metadata and status information.
3. Extensibility: Sparkplug B is an open framework that is extendable to meet the requirements of different applications and industries. It allows developers to tailor it to their specific needs without changing the underlying functionality.
4. Specifications regarding QoS, LWT, and retained flags for all message types.
30. Sparkplug Basic Message Types
1. "Birth" - announces the creation or presence of a device or namespace at the broker.
2. "Data" - contains the payload data for a specific data element within a namespace.
3. "CMD" - transmits commands from the broker to a device.
4. "Death" - announces the disappearance or decommissioning of a device or namespace from the network.
5. "State" - contains the current state or data of a device or namespace.
Sparkplug distinguishes between N (edge of network) and D (device) messages, e.g., NData / DData.
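These message types ride on a fixed topic layout, `spBv1.0/<group_id>/<message_type>/<edge_node_id>[/<device_id>]`, per the Sparkplug specification. A minimal sketch of building such topics; the group, node, and device names are made-up examples:

```python
# Sketch: build Sparkplug B topic strings in the spBv1.0 namespace.
NODE_TYPES = {"NBIRTH", "NDEATH", "NDATA", "NCMD"}      # edge-of-network
DEVICE_TYPES = {"DBIRTH", "DDEATH", "DDATA", "DCMD"}    # device-level

def sparkplug_topic(group_id, message_type, edge_node_id, device_id=None):
    if message_type in DEVICE_TYPES:
        if device_id is None:
            raise ValueError(f"{message_type} requires a device_id")
        return f"spBv1.0/{group_id}/{message_type}/{edge_node_id}/{device_id}"
    if message_type in NODE_TYPES:
        return f"spBv1.0/{group_id}/{message_type}/{edge_node_id}"
    raise ValueError(f"unknown message type: {message_type}")

print(sparkplug_topic("plant1", "NBIRTH", "line3-gateway"))
# spBv1.0/plant1/NBIRTH/line3-gateway
print(sparkplug_topic("plant1", "DDATA", "line3-gateway", "robot7"))
# spBv1.0/plant1/DDATA/line3-gateway/robot7
```

Note that the STATE message uses its own topic form in the spec and is deliberately left out of this sketch.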
31. The Idea of the Unified Namespace in a Nutshell
1. Let's use MQTT for interconnection across the whole IIoT space
2. Let's think of all data sources as devices/sensors
3. Let's use Sparkplug
4. Let's use ISA-95 for our namespace hierarchy as a start
5. The Unified Namespace should be the single source of truth for IIoT
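Point 4 can be sketched as a topic-path builder: ISA-95 levels joined into an MQTT topic. The chosen levels (enterprise/site/area/line/cell) and the example values are a common UNS convention, not a normative standard.

```python
# Sketch: build an ISA-95-style Unified Namespace topic path.
# The hierarchy enterprise/site/area/line/cell/tag is an illustrative
# convention for UNS designs, not mandated by any specification.
def uns_topic(enterprise, site, area, line, cell, tag):
    levels = [enterprise, site, area, line, cell, tag]
    for level in levels:
        # '/' would add a hierarchy level; '+' and '#' are MQTT wildcards.
        if any(ch in level for ch in "/+#"):
            raise ValueError(f"invalid character in level: {level!r}")
    return "/".join(levels)

print(uns_topic("acme", "fulda", "assembly", "line3", "cell7", "temperature"))
# acme/fulda/assembly/line3/cell7/temperature
```

Consumers can then subscribe at any level of the hierarchy with MQTT wildcards, e.g. `acme/fulda/assembly/#` for a whole area.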
36. HiveMQ Solves Reliability
− Zero message loss: persistent messaging and replication to disk, true Quality of Service (QoS)!
− Reliable communication: connection and cluster overload protection, automatic throttling, queueing, retained messages…
− No single point of failure: masterless cluster architecture.
− Zero-downtime upgrades: broker spawning with nodes seamlessly upgraded.
37. HiveMQ Solves Scalability
● Proven scalability – benchmarked to 200M active clients with 1.8B messages/hour
● Elastic scaling – masterless load balancing, automatic data balancing, smart message distribution across cluster nodes
● Linear scalability – scale from 2 to 100+ nodes, both vertically and horizontally, with a consistent-hashing algorithm
38. HiveMQ Enables Edge
− Edge deployment: addresses organizations' connectivity challenges
− Enables the Unified Namespace: eliminates data silos by enabling the UNS
− API-based operability: enables data sharing with the enterprise
− Machine protocols supported: OPC UA, Modbus, MQTT-SN, …
− Addresses escalating deployment costs: open-source technologies
39. HiveMQ Improves Data Quality
− Data policies: define a set of rules and guidelines that enforce how data and messages are handled.
− Data schemas: create the blueprint for how data is formatted. JSON and Protobuf are currently supported.
− Control Center: a simple GUI to manage schemas, data, and behavior policies. The dashboard provides an overview of quality metrics, making it easy to locate bad actors and bad data sources.
− Data validation and transformation: define and enforce data standards across deployments.
− Policy actions: describe what should happen to messages/data that fail validation. Messages can be rerouted, forwarded, or simply logged and ignored.
40. HiveMQ Solves Interoperability
− Runs anywhere: cloud, on-premises, public and private cloud
− Connect from everything: client support for Java, C, C++, C#, Python, …
− Enterprise security: OAuth 2.0, LDAP, RBAC, …
− Robust streaming support: Kafka, Amazon Kinesis, Google Pub/Sub, …
− Database analytics support: Postgres, Snowflake, Databricks, MongoDB, InfluxDB, …
− Build your own! Java SDK
41. Converge data in your central IT
● Create a federation of multiple clusters and bidirectionally exchange IoT data between geographically distributed areas (on-prem and cloud)
● Allow low-latency communication between devices in the local network
● The local broker serves as a buffer in case of connection loss to the data center
42. HiveMQ + Kafka
HiveMQ and Kafka are better together. Kafka is designed for fault-tolerant, high throughput data
pipelines, and HiveMQ is designed for reliable, scalable real-time communication with constrained
IoT devices. They can work together to enable end-to-end data streaming and real-time data
processing scenarios in IoT deployments.
43. Why HiveMQ and Confluent for the UNS
MQTT
− Optimized for monitoring devices & sensors
− Deep topic structure, millions of topics
− Millions of connections
− Data collection, feedback channel, M2M
Kafka
− Optimized for providing and distributing data across the company
− Flat topic structure (scales over partitions)
− High throughput (e.g., analytics, big data…)
44. Why Confluent and HiveMQ for the UNS
− The UNS, being MQTT-based, does not contain any history of the data and only represents a snapshot of the current state, or relies on a Historian that is not designed for that
▪ Solution: shadow the MQTT broker with Confluent to preserve the history of the data
− MQTT was not built for training AI or running analytics (small-file problem, e.g., getting 1M sensor points for one device type)
▪ Solution: fan in thousands of MQTT topics, e.g., based on device type, into larger Kafka topics
− MQTT is not fit for complex or high-throughput stream-processing tasks
▪ Solution: fan data into Kafka and process it with Flink/Kafka Streams
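The fan-in idea in the second solution can be sketched as a pure routing function: many per-device MQTT topics collapse into one Kafka topic per device type, with the device id as the record key so per-device ordering is preserved within a partition. The `<plant>/<device_type>/<device_id>/<tag>` layout is an assumption for illustration.

```python
# Sketch: fan per-device MQTT topics into per-device-type Kafka topics.
# Assumed MQTT layout: <plant>/<device_type>/<device_id>/<tag>
def route_to_kafka(mqtt_topic):
    """Return a (kafka_topic, record_key) pair for an incoming MQTT topic."""
    plant, device_type, device_id, tag = mqtt_topic.split("/")
    # One Kafka topic per device type; keying by device id keeps all
    # readings of a device in one partition, in order.
    return f"iot.{plant}.{device_type}", device_id

print(route_to_kafka("plant1/robot/robot7/temperature"))
# ('iot.plant1.robot', 'robot7')
```

A mapping like this can live in a connector's transform chain or in a small stream-processing job; the point is that millions of fine-grained MQTT topics become a handful of high-throughput Kafka topics.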
45. For Confluent partners: which Confluent features support the Unified Namespace?
− Schema Registry
▪ Sparkplug messages are Protobuf; putting the schema in the Schema Registry allows for easy structured streaming
− On Cloud: Advanced Stream Governance
▪ Data contracts and business tags help attach business context to the data for data scientists
− Kafka Streams
▪ Microservices for data transformation, even for small-volume topics
− Cluster Linking
▪ Hub-and-spoke architectures with local clusters and cloud clusters
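As a sketch of the first point: a Sparkplug Protobuf payload schema can be registered through the Schema Registry REST API. The registry URL, topic name, and .proto source below are assumptions; the subject-naming helper follows the default TopicNameStrategy.

```python
# Sketch: register a Protobuf schema with Confluent Schema Registry via
# its REST API (stdlib only). URL and topic names are placeholders.
import json
import urllib.request

def subject_for(topic):
    # Default TopicNameStrategy: value schemas register under "<topic>-value".
    return f"{topic}-value"

def register_protobuf_schema(registry_url, topic, proto_source):
    body = json.dumps({"schema": proto_source, "schemaType": "PROTOBUF"}).encode()
    req = urllib.request.Request(
        f"{registry_url}/subjects/{subject_for(topic)}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # the registry replies with the assigned schema id

print(subject_for("iot.sparkplug.ddata"))
# iot.sparkplug.ddata-value
```

Once the schema is registered, Protobuf-aware consumers and stream processors can deserialize Sparkplug payloads into structured records instead of opaque bytes.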
47. Recap & Take-Aways
− The Unified Namespace is a promising concept for IIoT that allows harmonized device interconnections
− Sparkplug B is beneficial vs. OPC-UA when thinking about cloud and field-sensor integration
− Using MQTT with Confluent is beneficial as it adds
▪ History
▪ Schematization
▪ Governance
▪ Stream processing
49. Thank you for your attention

it-novum GmbH Deutschland
Headquarters: Edelzeller Straße 44, 36043 Fulda
Branch: Kaiserswerther Str. 229, 40474 Düsseldorf

it-novum Schweiz GmbH
Seestrasse 97, 8800 Thalwil, Schweiz

it-novum branch Austria
Ausstellungsstraße 50 / Zugang C, 1020 Wien

Alexander Keidel
Head of Product & Alliance
T +49 661 103-392
E alexander.keidel@it-novum.com
data.it-novum.com

Thank you!