Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

IoT Integration with MQTT and Apache Kafka

5,392 views

Published on

Processing Internet of Things (IoT) Data from End to End with MQTT and Apache Kafka.

Live demos for my two projects on Github from Kafka Summit in San Francisco 2018 using Kafka Connect respectively Confluent MQTT Proxy:
Apache Kafka + Kafka Connect + MQTT Connector + Sensor Data: https://github.com/kaiwaehner/kafka-connect-iot-mqtt-connector-example

Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data: https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot

Live Demo: https://www.youtube.com/watch?v=L38-6ilGeKE

This session discusses end-to-end use cases such as connected cars, smart home or healthcare sensors, where you integrate Internet of Things (IoT) devices with enterprise IT using open source technologies and standards. MQTT is a lightweight messaging protocol for IoT. However, MQTT is not built for high scalability, longer storage or easy integration to legacy systems. Apache Kafka is a highly scalable distributed streaming platform, which ingests, stores, processes and forwards high volumes of data from thousands of IoT devices. Abstract from my corresponding session at Kafka Summit: This session discusses the Apache Kafka open source ecosystem as a streaming platform to process IoT data. See a live demo of how MQTT brokers like Mosquitto or RabbitMQ integrate with Kafka, and how you can even integrate MQTT clients to Kafka without MQTT Broker. Learn how to analyze the IoT data either natively on Kafka with Kafka Streams/KSQL or on an external big data cluster like Spark, Flink or Elasticsearch leveraging Kafka Connect.

Published in: Technology

IoT Integration with MQTT and Apache Kafka

  1. 1. 1Confidential Processing IoT Data with MQTT and Apache Kafka Kai Waehner Technology Evangelist kontakt@kai-waehner.de LinkedIn @KaiWaehner www.confluent.io www.kai-waehner.de Kafka-Native End-to-End IoT Data Integration and Processing
  2. 2. 3 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) Kafka-Native End-to-End IoT Integration Architecture(s) 5) IoT Data Processing
  3. 3. 4 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) Kafka-Native End-to-End IoT Integration Architecture(s) 5) IoT Data Processing
  4. 4. 6 Connected Intelligence (Cars, Machines, Robots, …)
  5. 5. 7 Smart Cities
  6. 6. 8 Smart Retail and Customer 360
  7. 7. 9 Intelligent Applications (Early Part Scrapping, Predictive Maintenance, …)
  8. 8. 10 ? Architecture (High Level) Kafka BrokerKafka BrokerStreaming Platform Connect w/ MQTT connector IoT Gateway DevicesDevicesDevicesDevice Device Tracking (Real Time) Predictive Maintenance (Near Real Time) Log Analytics (Batch) Edge Data Center / Cloud How to integrate?
  9. 9. 11 Poll Which IoT scenarios do you see in your company? 1) IoT ingestion into analytics cluster 2) Bi-directional communication to control IoT devices (e.g. connected cars, fleet management, logistics) 3) Real time stream processing using machine learning (e.g. predictive maintenance, early part scrapping) 4) No IoT scenarios today; maybe in the future
  10. 10. 13 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) Kafka-Native End-to-End IoT Integration Architecture(s) 5) IoT Data Processing
  11. 11. 14 MQTT - Publish / subscribe messaging protocol • Built on top of TCP/IP for constrained devices and unreliable networks • Many (open source) broker implementations • Many client libraries • IoT-specific features for bad network / connectivity • Widely used (mostly IoT, but also web and mobile apps via MQTT over WebSockets)
  12. 12. 15 MQTT Server 1 Processor 1 Processor 2 ... MQTT Architecture (no scale) topic: [deviceid]/car
  13. 13. 16 MQTT Server Coordinator MQTT Server 1 MQTT Server 2 MQTT Server 3 MQTT Server 4 topic: [deviceid]/car ... MQTT Architecture (clustering depends on broker implementation) Processor 1 Processor 2 Processor 3 Processor 4
  14. 14. 17 MQTT Architecture (clustering depends on broker implementation) Load Balancer MQTT Server 1 MQTT Server 2 MQTT Server 3 MQTT Server 4 topic: [deviceid]/car ... Processor 1 Processor 2 Processor 3 Processor 4
  15. 15. 18 MQTT Trade-Offs Pros • Lightweight • Simple API • Built for poor connectivity / high latency scenario • Many client connections (tens of thousands per MQTT server) Cons • Queuing, not stream processing • Can’t handle usage surges (no buffering) • No high scalability • Very asynchronous processing (often offline for long time) • No good integration to rest of the enterprise • No reprocessing of events
  16. 16. 19 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) Kafka-Native End-to-End IoT Integration Architecture(s) 5) IoT Data Processing
  17. 17. 20 Apache Kafka – The Rise of a Streaming Platform The Log ConnectorsConnectors Producer Consumer Streaming Engine
  18. 18. 24 Apache Kafka == Distributed Commit Log with Replication
  19. 19. 26 Kafka Trade-Offs (from IoT perspective) Pros • Stream processing, not just queuing • High throughput • Large scale • High availability • Long term storage and buffering • Reprocessing of events • Good integration to rest of the enterprise Cons • Not built for tens of thousands connections • Requires stable network and good infrastructure • No IoT-specific features like keep alive, last will or testament
  20. 20. 27 (De facto) Standards for Processing IoT Data A Match Made In Heaven + =
  21. 21. 28 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) Kafka-Native End-to-End IoT Integration Architecture(s) 5) IoT Data Processing
  22. 22. 29 ? Architecture (High Level) Kafka BrokerKafka BrokerStreaming Platform Connect w/ MQTT connector GatewayDevicesDevicesDevicesDevice Device Tracking (Real Time) Predictive Maintenance (Near Real Time) Log Analytics (Batch) Edge Data Center / Cloud How to integrate?
  23. 23. 30 MQTT Server Coordinator MQTT Server 1 MQTT Server 2 MQTT Server 3 MQTT Server 4 topic: [deviceid]/car Kafka Integration Sensor Data Stream processing Kafka Cluster End-to-End Integration from MQTT to Apache Kafka
  24. 24. 31 Design Questions for End-to-End Integration • How much throughput? • Ingest-only vs. processing of data? • Analytical vs. operational deployments? • Device publish only vs. device pub/sub? • Pull vs. Push? • Low-level client vs. integration framework vs. proxy? • Integration patterns needed? (transform, route, …)? • IoT-specific features required (last will, testament, …)?
  25. 25. 32 Kafka-Native Integration Options between MQTT and Apache Kafka Kafka Connect MQTT Proxy REST Proxy
  26. 26. 33 Kafka-Native Integration Options between MQTT and Apache Kafka Kafka Connect MQTT Proxy REST Proxy
  27. 27. 34 MQTT Source and Sink Connectors for Kafka Connect https://www.confluent.io/hub/ https://www.confluent.io/connector/kafka-connect-mqtt/
  28. 28. 35 ? Integration with Kafka Connect (Source and Sink) Kafka BrokerKafka BrokerKafka Broker MQTT Broker Connect w/ MQTT connector Connect w/ MQTT connectorMQTT DevicesDevicesDevicesDevice Kafka Consumer MQTT Broker Persistent + offers MQTT-specific features Consumes push data from IoT devices Kafka Connect Kafka Consumer + Kafka Producer under the hood Pull-based (at own pace, without overwhelming the source or getting overwhelmed by the source) Out-of-the-box scalability and integration features (like connectors, converters, SMTs)
  29. 29. 38 Kafka Connect components
  30. 30. 39 Connector for MQTT Source + Single Message Transformation (SMT) curl -s -X POST -H 'Content-Type: application/json' http://localhost:8083/connectors -d '{ "name" : "mqtt-source", "config" : { "connector.class" : "io.confluent.connect.mqtt.MqttSourceConnector", "tasks.max" : "1", "mqtt.server.uri" : "tcp://127.0.0.1:1883", "mqtt.topics" : "temperature", "kafka.topics" : "mqtt.", "transforms":"filter", "transforms.filter.type":"com.github.kaiwaehner.kafka.connect.smt.StringFilter", "transforms.filter.topic.format":"fraud" } }'
  31. 31. 40 Kafka Connect - Converters MQTT Broker S3 Object Storage MQTT Source Connector AvroConverter AvroConverter S3 Sink Connector JSON Message Connect data API format byte[] (Avro) byte[] (Avro) Connect data API format AmazonS3 Object
  32. 32. 41 Live Demo … MQTT Integration with Kafka Connect… Connect
  33. 33. 42 Kafka Connect + MQTT Connector https://github.com/kaiwaehner/kafka-connect-iot-mqtt-connector-example
  34. 34. 43 Kafka-Native Integration Options between MQTT and Apache Kafka Kafka Connect MQTT Proxy REST Proxy
  35. 35. 44 MQTT Proxy Kafka BrokerKafka BrokerKafka Broker MQTT ProxyMQTT DevicesDevicesDevicesDevices Kafka Consumer MQTT Proxy MQTT is push-based Horizontally scalable Consumes push data from IoT devices and forwards it to Kafka Broker at low-latency Kafka Producer under the hood No MQTT Broker needed Kafka Broker Source of truth Responsible for persistence, high availability, reliability
  36. 36. 45 Details of Confluent’s MQTT Proxy Implementation General and modular framework • Based on Netty to not re-invent the wheel (network layer handling, thread pools) • Scalable with standard load balancer • Internally uses Kafka Connect formats (allows re-using transformation and other Connect- constructs à Coming soon) Three pipeline stages • Network (Netty) • Protocol (like MQTT with QoS 0,1,2 today, later others, maybe e.g. WebSockets) • Stream (Kafka clients: Today Producers, later also consumers) Missing parts in first release • Only MQTT Publish; MQTT Subscribe coming soon • MQTT-specific features like last will or testament
  37. 37. 46 Kafka-Native Integration Options between MQTT and Apache Kafka Kafka Connect MQTT Proxy REST Proxy
  38. 38. 47 Confluent REST Proxy REST Proxy Non-Java Applications Native Kafka Java Applications Schema Registry REST / HTTP(S) TCP The „simple alternative“ for IoT • Simple and understood • HTTP(S) Proxy à Push-based • Security ”easier” • Scalable with standard load balancer (still synchronous HTTP) • Not for very high throughput • Implement Connect features in your client app
  39. 39. 48 Confluent REST Proxy - Produce and Consume Messages
  40. 40. 49 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) Kafka-Native End-to-End IoT Integration Architecture(s) 5) IoT Data Processing
  41. 41. 5050 Processing Options for MQTT Data with Apache Kafka Streams Kafka native vs. additional big data cluster and technology (or others, you name it …)
  42. 42. 5353 Example: Anomaly Detection System to Predict Defects in Car Engine MQTT Proxy Elastic search Grafana Kafka Cluster Kafka Connect KSQL Car Sensors Kafka Ecosystem Other Components Real Time Emergency System All Data PotentialDefect Apply Analytic Model Filter Anomalies On premise DC: Kubernetes + Confluent OperatorAt the edge
  43. 43. 5454 KSQL and Deep Learning (Auto Encoder) for Anomaly Detection “CREATE STREAM AnomalyDetection AS SELECT sensor_id, detectAnomaly(sensor_values) FROM car_engine;“ User Defined Function (UDF)
  44. 44. 55 Live Demo … Processing IoT data with MQTT Proxy and KSQL …
  45. 45. 56 Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot
  46. 46. 57 Poll What is the best choice for your IoT integration between MQTT and Kafka? 1. Kafka Connect 2. MQTT Proxy 3. REST Proxy 4. Custom Kafka Client (Java Client, Nifi, StreamSets, non-MQTT technology, …)
  47. 47. 58 Kai Waehner Technology Evangelist kontakt@kai-waehner.de @KaiWaehner www.kai-waehner.de www.confluent.io LinkedIn Questions? Feedback? Please contact me!

×