BEST PRACTICES FOR STREAMING
IOT DATA TO APACHE KAFKA®
Dominik Obermaier | CTO @ HiveMQ
[Chart axis: number of users in millions]
PEOPLE ON THE INTERNET
Source IHS © 2016 IHS
DEVICES ON THE INTERNET
Key Industry Trend:
IoT & Connectivity
Introduction
• HiveMQ CTO
• Strong background in distributed
and large scale systems
architecture
• OASIS MQTT TC Member
• Author of “The Technical
Foundations of IoT”
• Conference Speaker
• Program committee member for
German and international IoT
conferences
Dominik
Obermaier
@dobermai
IOT CHALLENGES
➤ Unreliable communication channels (e.g.
mobile)
➤ Constrained Devices
➤ Low Bandwidth and High Latency
environments
➤ Bi-directional communication required
➤ Security
➤ Instantaneous data exchange
Millions of Devices
Meanwhile…
KAFKA
FOR IOT?
KAFKA STRENGTHS
➤ Optimized to stream data between
systems and applications in a scalable
manner
➤ Scale-out with multiple topics and
partitions and multiple nodes
➤ Perfect for system communication inside
trusted network and limited producers
and consumers
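The scale-out point above can be illustrated with a minimal sketch of key-based partitioning: records with the same key always land on the same partition, so adding partitions spreads load while keeping per-key order. The CRC32 hash and the function names are assumptions made for illustration; Kafka's own default partitioner uses a different hash function.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index.

    CRC32 is a stand-in hash (assumption), not the hash Kafka itself uses.
    """
    return zlib.crc32(key) % num_partitions


def spread(keys, num_partitions):
    """Group string keys by the partition they are assigned to."""
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[choose_partition(key.encode("utf-8"), num_partitions)].append(key)
    return partitions
```

Calling `spread([f"device-{i}" for i in range(10)], 3)` shows how a handful of device identifiers would be distributed over three partitions under this stand-in hash.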
KAFKA ALONE IS NOT OPTIMAL
➤  For IoT use cases where devices are
connected to the data center or cloud
over the public Internet as first point of
contact
➤ If you attempt to stream data from
thousands or even millions of devices
using Kafka over the Internet
KAFKA CHALLENGES
➤ Kafka Clients need to address
Kafka brokers directly, which is
not possible with load
balancers
IOT REALITY
➤ Clients are connected over the
Internet
➤ Load Balancers are used as first
line of defense
➤ IP addresses of infrastructure
(e.g. Kafka nodes) not exposed
to the public Internet
➤ Load Balancers effectively act
as proxy
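A rough simulation of the mismatch described on this slide: a Kafka-style client sends each request to the broker that leads the target partition, while a generic load balancer picks a backend without knowing about partitions. The broker names, the leader table, and the round-robin policy are hypothetical and chosen only to make the effect visible.

```python
import itertools

# Hypothetical cluster layout: each partition has exactly one leader broker.
PARTITION_LEADERS = {0: "broker-1", 1: "broker-2", 2: "broker-3"}


def direct_route(partition: int) -> str:
    """A Kafka-style client looks up the leader and connects to it directly."""
    return PARTITION_LEADERS[partition]


def make_round_robin_balancer(brokers):
    """A generic balancer that ignores the partition and rotates over backends."""
    cycle = itertools.cycle(brokers)
    return lambda partition: next(cycle)


def misrouted(route, partitions):
    """Return the partitions whose request did not reach the partition leader."""
    return [p for p in partitions if route(p) != PARTITION_LEADERS[p]]
```

With direct routing nothing is misrouted; behind the round-robin balancer, requests reach the right broker only by coincidence.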
KAFKA CHALLENGES
➤ Kafka is hard to scale to
multiple hundreds of
thousands or millions of topics
IOT REALITY
➤ IoT devices typically are
segmented to use individual
topics
➤ Individual topics very often
contain data like unique device
identifier
➤ Multiple millions of topics can
be used in a single IoT scenario
➤ Ideal for security as it’s possible
to restrict devices to only
produce and consume for
specific topics
➤ Topics are usually dynamic
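The security point above, restricting each device to its own topics, could look roughly like the following check. The topic layout `devices/<client_id>/...` and the function name `may_access` are assumptions for illustration, not a convention taken from the slides.

```python
def may_access(client_id: str, topic: str) -> bool:
    """Allow a device to publish or subscribe only below its own identifier.

    Assumes the hypothetical topic layout 'devices/<client_id>/<anything>'.
    """
    levels = topic.split("/")
    return len(levels) >= 3 and levels[0] == "devices" and levels[1] == client_id
```

Because the device identifier is part of the topic, one such rule covers millions of dynamically created topics without listing them individually.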
KAFKA CHALLENGES
➤ Kafka clients are reasonably
complex by design (e.g. they use
multiple TCP connections)
➤ Libraries optimized for
throughput
➤ APIs for Kafka libraries are
simple to use but the behavior
sometimes isn’t configurable
easily (e.g. async send()
method can block)
IOT REALITY
➤ IoT devices are typically very
constrained (computing power
and memory)
➤ Device programmers need very
simple APIs AND full flexibility
when it comes to library behavior
➤ Single IoT devices typically don’t
require a lot of throughput
➤ Important to limit and
understand the number of TCP
connections, especially over the
Internet. Very often only one TCP
connection to the backend
desired
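The remark that an asynchronous send() can block is easier to see with a toy model. In the Kafka Java producer, send() may wait for metadata or for free buffer space, bounded by the `max.block.ms` setting; the class below is not a Kafka client, just a hypothetical bounded buffer that shows the same effect.

```python
import queue


class BoundedSender:
    """Toy producer with a bounded in-memory buffer (not a real Kafka client)."""

    def __init__(self, capacity: int):
        self._buffer = queue.Queue(maxsize=capacity)

    def send(self, record: bytes, max_block_seconds: float) -> bool:
        """Queue a record, waiting up to max_block_seconds if the buffer is full.

        Returns False when the record could not be queued in time.
        """
        try:
            self._buffer.put(record, timeout=max_block_seconds)
            return True
        except queue.Full:
            return False

    def drain_one(self) -> bytes:
        """Simulate the background I/O thread taking one record off the buffer."""
        return self._buffer.get_nowait()
```

On a constrained device with a slow uplink the buffer fills quickly, so the caller has to decide explicitly how long it is willing to wait.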
KAFKA CHALLENGES
➤ No on/off notification
mechanism
➤ No keep-alive mechanism for
individual TCP connections of
producers
➤ Kafka Protocol for producers
rather heavyweight over the
Internet (lots of
communication)
IOT REALITY
➤ Features like on/off
notifications are often required
➤ Unreliable networks require
lightweight keep-alive
mechanisms for producers and
consumers (half-open
connections)
➤ Device communication over
the Internet requires minimal
communication overhead
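A lightweight keep-alive check for half-open connections can be sketched as follows. The 1.5x grace factor mirrors the rule in the MQTT specification, where the server drops a client after one and a half keep-alive intervals of silence; the class and method names are hypothetical.

```python
class KeepAliveMonitor:
    """Track when each client was last heard from and report silent ones."""

    def __init__(self, keep_alive_seconds: float, grace_factor: float = 1.5):
        # A client is considered gone after keep_alive * grace_factor of silence.
        self.deadline = keep_alive_seconds * grace_factor
        self.last_seen = {}

    def packet_received(self, client_id: str, now: float) -> None:
        """Record that any packet (data or ping) arrived from the client."""
        self.last_seen[client_id] = now

    def expired(self, now: float):
        """Return the clients that have been silent for longer than the deadline."""
        return sorted(c for c, t in self.last_seen.items() if now - t > self.deadline)
```

Timestamps are passed in explicitly so the logic can be exercised without real sockets or clocks.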
WHAT IS MQTT?
➤ Most popular IoT Messaging Protocol
➤ Minimal Overhead
➤ Publish / Subscribe
➤ Easy
➤ Binary
➤ Data Agnostic
➤ Designed for reliable communication
over unreliable channels
➤ ISO Standard
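To make "minimal overhead" and "binary" concrete, here is a small sketch that encodes an MQTT 3.1.1 PUBLISH packet with QoS 0. The function names are my own; the byte layout (one type/flags byte, a variable-length "remaining length", a length-prefixed topic, then the raw payload) follows the MQTT 3.1.1 specification. MQTT 5 adds a properties field that is not modelled here.

```python
def encode_remaining_length(length: int) -> bytes:
    """MQTT variable-length integer: 7 bits per byte, high bit means 'more follows'."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("length out of range for MQTT")
    out = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80  # continuation bit
        out.append(digit)
        if length == 0:
            return bytes(out)


def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build a PUBLISH packet with QoS 0, no retain flag, no duplicate flag."""
    topic_bytes = topic.encode("utf-8")
    # Topic name is prefixed with its length as a 2-byte big-endian integer.
    body = len(topic_bytes).to_bytes(2, "big") + topic_bytes + payload
    return bytes([0x30]) + encode_remaining_length(len(body)) + body
```

For a short topic and payload the protocol adds four bytes on top of the topic name and the data itself.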
USE CASES
➤ Push Communication
➤ Unreliable communication
channels (e.g. mobile)
➤ Constrained Devices
➤ Low Bandwidth and High Latency
environments
➤ Communication from backend to
IoT device
➤ Lightweight backend communication
Source: https://iot.eclipse.org/resources/iot-developer-survey/iot-developer-survey-2019.pdf; 1717 participants
PUBLISH / SUBSCRIBE
Scalable communication paradigm for the IoT
PUBLISH / SUBSCRIBE EXPLAINED
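The publish/subscribe idea can be sketched with a tiny in-memory broker: publishers and subscribers only know topic names, never each other. This is an illustrative toy with exact-match topics and hypothetical names, not how HiveMQ or any real MQTT broker is implemented.

```python
from collections import defaultdict


class TinyBroker:
    """Minimal in-memory publish/subscribe hub with exact-match topics."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callback that receives (topic, payload) for this topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Deliver the payload to every subscriber; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)
```

Adding another consumer is a single `subscribe` call and requires no change on the publishing side, which is what makes the paradigm scale with the number of devices.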
HiveMQ
MQTT broker built for enterprise applications
Extensive plugin system
Scales to > 10 million concurrent connections
OSS Community Edition available
Built for High Availability and used by 150+
of the largest IoT deployments in the world
Confidential and Proprietary. Copyright © by dc-square GmbH. All Rights Reserved.
Enterprise MQTT
Devices HiveMQ Enterprise
unreliable
network
Protocol
Integration
Enterprise Systems
• Kafka
• OAuth Server
• …
Kubernetes, Docker, OpenShift
Public or private cloud (AWS, MS Azure…) or on-premise
Backend
HiveMQ MQTT Client
Java based MQTT library
Developed by HiveMQ and BMW Car-IT
Built for devices and backends
Open Source (Apache 2)
Extremely fast and low overhead
➤ https://www.hivemq.com/benchmark-10-million/
HOW TO
USE KAFKA
FOR IOT?
KAFKA CONNECT
https://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latency-data-pipelines/
CON
➤ Doesn’t scale well, as Kafka
Connect acts as an MQTT client.
➤ MQTT clients are not designed
for extremely large amounts of
MQTT messages
➤ Centralizes business and
message transformation logic
PRO
➤ Ideal if you don’t control MQTT
Broker or use third party MQTT
broker
MQTT PROXY
CON
➤ Does not implement the full
MQTT ISO Standard
➤ Does not support features
unique to MQTT which were
designed for IoT use cases
(LWT, retained messages, …)
➤ Client must ensure that no
MQTT features besides plain
publishing are used
➤ Tightly coupled with Kafka,
so it does not allow the
Pub/Sub features of MQTT
PRO
➤ Does not require an MQTT
broker
➤ Usually stateless, which makes
scaling easier
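For readers unfamiliar with the two MQTT features named above, this toy broker sketches what they do: a retained message is stored per topic and handed to later subscribers, and a Last Will and Testament (LWT) is published by the broker when a client disappears without a clean disconnect. The class, its method names, and the choice to publish the will as retained are assumptions for illustration.

```python
class FeatureBroker:
    """Toy broker modelling retained messages and last-will publishing."""

    def __init__(self):
        self.retained = {}   # topic -> last retained payload
        self.wills = {}      # client_id -> (topic, payload)
        self.delivered = []  # log of every (topic, payload) published

    def connect(self, client_id, will=None):
        """Register a client, optionally with a (topic, payload) last will."""
        if will is not None:
            self.wills[client_id] = will

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload
        self.delivered.append((topic, payload))

    def subscribe(self, topic):
        """A new subscriber immediately gets the retained payload, if any."""
        return self.retained.get(topic)

    def disconnect(self, client_id, graceful):
        """Publish the will only when the client vanished ungracefully."""
        will = self.wills.pop(client_id, None)
        if will is not None and not graceful:
            self.publish(*will, retain=True)  # retained will: an assumption here
```

Combining both features yields the on/off notification pattern: the device publishes a retained "online" status and registers "offline" as its will.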
MQTT CUSTOM BRIDGE
CON
➤ Extremely hard to avoid
message loss
➤ Developers must take care of
fault tolerance themselves
➤ Another system which adds
complexity and adds no
value
PRO
➤ Application used for
transposing MQTT to Kafka
and vice versa built by
developers themselves, so no
external component
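The message-loss concern for a custom bridge comes down to ordering: the MQTT message should be acknowledged only after the Kafka write is confirmed. A minimal sketch of that rule follows; `write_to_kafka` and `acknowledge_mqtt` are hypothetical callables standing in for real client calls, and a production bridge would also need backoff, persistence, and duplicate handling.

```python
def forward_with_retry(message, write_to_kafka, acknowledge_mqtt, max_attempts=3):
    """Forward one MQTT message and acknowledge it only after a successful write.

    Returns the attempt number that succeeded, or None if every attempt failed.
    Both callables are hypothetical stand-ins for real client libraries.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            write_to_kafka(message)
        except ConnectionError:
            continue  # transient failure: try again
        acknowledge_mqtt(message)
        return attempt
    # Not acknowledged, so an MQTT broker using QoS 1 or 2 can redeliver it.
    return None
```

This gives at-least-once behaviour: if the bridge crashes between the write and the acknowledgement, the message is delivered again and may appear twice in Kafka.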
MQTT BROKER EXTENSION
CON
➤ Extension only available for
HiveMQ
PRO
➤ Broker uses native Kafka
protocol
➤ Full MQTT Features can be used
➤ Bi-directional producing and
consumption possible
➤ High Scalability and resilience.
➤ Extreme throughput. Can write
hundreds of thousands of MQTT
messages per second to Kafka
➤ Ideal for aggregating MQTT
topics to Kafka Firehose Topics
➤ Can write to multiple Kafka
Deployments
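Aggregating many device-specific MQTT topics into a few Kafka "firehose" topics could be sketched like this. The wildcard rules ('+' for one level, '#' for all remaining levels) follow MQTT topic filter semantics; the mapping table, the Kafka topic names, and the choice of the MQTT topic as record key are assumptions and do not describe the configuration format of the HiveMQ extension.

```python
def matches(topic_filter: str, topic: str) -> bool:
    """MQTT-style matching: '+' matches one level, '#' matches all remaining levels."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical mapping table: MQTT topic filter -> Kafka topic.
MAPPINGS = [
    ("devices/+/telemetry", "iot.telemetry"),
    ("devices/+/status", "iot.status"),
]


def to_kafka_record(mqtt_topic: str, payload: bytes):
    """Return (kafka_topic, key, value), or None when no mapping applies.

    The MQTT topic is used as the record key so the device identity survives.
    """
    for topic_filter, kafka_topic in MAPPINGS:
        if matches(topic_filter, mqtt_topic):
            return kafka_topic, mqtt_topic.encode("utf-8"), payload
    return None
```

In this sketch millions of per-device MQTT topics collapse into two Kafka topics, which sidesteps the difficulty of scaling Kafka to very large topic counts.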
+ = ❤
Dominik
Obermaier
dominik.obermaier@dc-square.de
@dobermai
Get in touch
HiveMQ -
Enterprise MQTT Broker
www.hivemq.com
@hivemq
