IoT Data Streaming
รัฐศิลป์ รานอกภานุวัชร์, D.ENG
Who am I?
 Lecturer, undergraduate Computer Engineering program, Dhurakij Pundit University
 Lecturer, master's program in Big Data Engineering, Dhurakij Pundit University
 Adjunct lecturer for the course Data Streaming and Real Time Analytics, National Institute of Development Administration (NIDA)
 Amazon cloud instructor at the 9expert training institute
 Consultant to private companies on Big Data and Blockchain
 Research: Blockchain, IoT, and Big Data
Outline
• Internet of Things (IoT)
• IoT Data Streaming
• Collect Data
• MQTT
• Kafka
• Stream processing platforms
• Flink
• Storm
• Spark
• Use-Case Examples
Internet of Things (IoT)
Credit: https://orzota.com/industrial-iot/
[Figure: Things (generate data stream), Software and platform (data stream processing), Visualization]
Sensors & Actuators
IoT data characteristics
 Large-scale streaming data
 Heterogeneity
 Time and space correlation
 High-noise data
IoT applications must support
 High-speed data streams
 Real-time or near-real-time actions
 Sometimes, joining the stream
○ with static data
○ with historical data
Reference: M. Chen, S. Mao, Y. Zhang, and V. C. Leung, Big data: related technologies, challenges and future prospects. Springer, 2014
What is Data Streaming?
Ref: https://www.cisco.com/c/dam/en/us/products/collateral/analytics-automation-software/data-virtualization/r20-consultancy-combining-datastreaming-wp.pdf
 In data streaming, data is transmitted continuously from one system (the producer) to another (the consumer), which reacts to the incoming data as it arrives, with minimal delay.
Distributed Streaming
 Streaming:
 Computations on never ending “streams” of data records (“events”)
 Distributed:
 Computation spread across many machines
Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
Stateless streaming
 Every incoming record is independent of other records.
 Records have no relation to one another, so each can be processed and persisted independently.
 E.g., map, filter, join with static data
Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
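The stateless operations named on this slide (map, filter, join with static data) can be sketched in plain Python. This is only an illustration, not the API of any streaming framework: the record fields, the 90 °F threshold, and the DEVICE_LOCATION table are hypothetical.

```python
from typing import Iterable, Iterator

# Hypothetical static lookup table (device id -> location); not from the slides.
DEVICE_LOCATION = {"d1": "warehouse", "d2": "dock"}


def stateless_pipeline(records: Iterable[dict]) -> Iterator[dict]:
    """Map, filter, and enrich each record without remembering earlier ones."""
    for rec in records:
        # Map: convert Celsius to Fahrenheit.
        temp_f = rec["temp_c"] * 9 / 5 + 32
        # Filter: drop readings below a hypothetical threshold.
        if temp_f < 90:
            continue
        # Join with static data: attach the device location.
        yield {
            "device": rec["device"],
            "temp_f": temp_f,
            "location": DEVICE_LOCATION.get(rec["device"], "unknown"),
        }


readings = [
    {"device": "d1", "temp_c": 20.0},
    {"device": "d2", "temp_c": 40.0},
]
hot = list(stateless_pipeline(readings))
```

Because no record depends on another, the input could be split across machines in any way without changing the result.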
Stateful Streaming
 Computation combined with state
 E.g., counters, windows of past events, state machines, trained ML models
 Example: counts of each distinct word seen in the records
 The result depends on the history of the stream
 Processing an incoming record depends on the results of previously processed records
Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
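The word-count example from this slide can be sketched as follows. It is a minimal in-memory illustration that assumes the state fits in one process; a real stream processor would also checkpoint this state.

```python
from collections import Counter
from typing import Iterable, Iterator


def running_word_counts(records: Iterable[str]) -> Iterator[dict]:
    """Yield the word counts seen so far after each incoming record."""
    counts = Counter()  # state that survives across records
    for rec in records:
        counts.update(rec.lower().split())
        yield dict(counts)  # snapshot of the state after this record


snapshots = list(running_word_counts(["sensor on", "sensor off", "on"]))
```

Each output depends on every record processed before it, which is what separates this from the stateless case.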
Event-Time Streaming
 Data records associated with timestamps (time series data)
 Processing depends on timestamps
 An event-time stream processor should give you the tools to reason about time
 Handle streams that are out of order
Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
Event-Time Streaming
 Time matters, and there are two notions of it:
 Event time: the time at which events actually occurred
 Processing time: the time at which events are observed in the system
Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
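One way to picture event-time processing with out-of-order data is a tumbling window that counts events by their own timestamps and closes a window only once a watermark has passed its end. The sketch below is plain Python, not Flink code; the 10-second window, the 5-second allowed delay, and the policy of dropping late events are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Iterable, Iterator, Tuple


def tumbling_event_time_counts(
    events: Iterable[Tuple[int, str]],  # (event_time_seconds, payload)
    window_size: int = 10,
    max_delay: int = 5,
) -> Iterator[Tuple[int, int]]:
    """Yield (window_start, count) once a window can no longer receive events."""
    open_windows = defaultdict(int)
    watermark = float("-inf")
    for event_time, _payload in events:
        window_start = (event_time // window_size) * window_size
        if window_start + window_size <= watermark:
            continue  # too late: the window was already emitted (dropped here)
        open_windows[window_start] += 1
        # The watermark trails the largest event time seen by max_delay.
        watermark = max(watermark, event_time - max_delay)
        for start in sorted(open_windows):
            if start + window_size <= watermark:
                yield start, open_windows.pop(start)
    # End of input: flush the windows that are still open.
    for start in sorted(open_windows):
        yield start, open_windows[start]


# Arrival (processing) order differs from event-time order.
arrivals = [(1, "a"), (12, "b"), (3, "c"), (16, "d"), (2, "e"), (25, "f")]
results = list(tumbling_event_time_counts(arrivals))
```

The event at time 3 arrives after the one at time 12 but is still counted in the first window; the event at time 2 arrives after that window has closed and is dropped.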
Things are Producing Streaming Data
 Smart city
 Healthcare/Medical device
 Connected cars
 Logistics
 Home automation
 Airlines
 Farmers
 Smart Machinery
 Security system
IoT Big Data Architecture
Filtering
Analytics
Source: https://mapr.com/blog/ml-iot-connected-medical-devices/
Collect Data (high level architecture)
How to integrate? MQTT or Kafka
Copyright: https://thenewstack.io/mqtt-protocol-iot/
Messaging Systems: Publish/Subscribe
[Figure: producers call publish(topic, msg) on a publish/subscribe system holding Topic 1, Topic 2, and Topic 3; consumers subscribe to topics and receive each msg]
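The publish/subscribe pattern can be sketched with a tiny in-memory broker. The PubSub class and its method names are hypothetical and only mirror the publish(topic, msg) and subscribe labels of the figure; a real broker adds networking, persistence, and delivery guarantees.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PubSub:
    """Minimal in-memory topic broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, msg: str) -> int:
        """Deliver msg to every subscriber of topic; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(msg)
        return len(callbacks)


broker = PubSub()
inbox_a, inbox_b = [], []
broker.subscribe("topic1", inbox_a.append)
broker.subscribe("topic1", inbox_b.append)
broker.subscribe("topic2", inbox_b.append)
delivered = broker.publish("topic1", "21.5C")
undelivered = broker.publish("topic3", "nobody listening")
```

The producer never names a consumer: it only names a topic, and the broker decides who receives the message.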
Example
MQTT uses the pub/sub pattern to connect interested parties with each other
Arduino, Raspberry Pi
MQTT - Publish/subscribe messaging protocol
 MQTT is a machine-to-machine (M2M) protocol widely used in the Internet of Things.
 It uses the publish/subscribe paradigm, in contrast to HTTP, which is based on the request/response paradigm.
 Built on top of TCP/IP for constrained devices and unreliable networks
 Many (open source) broker implementations
 Many client libraries
MQTT Architecture (no scale)
MQTT Architecture (clustering depends on
broker implementation)
MQTT Trade-Offs
Pros
 Lightweight
 Simple API
 Built for poor-connectivity / high-latency scenarios
 Many client connections (tens of thousands per MQTT server)
Cons
 Queuing, not stream processing
 No buffering
 Not highly scalable
 Poor integration with the rest of the enterprise
 No reprocessing of events
Apache Kafka
A distributed streaming platform
Kafka Data Streams
Kafka is used to stream data into data lakes, applications and real-time stream analytics systems.
Kafka architecture: Brokers, Topics, Producers, and Consumers
A Kafka cluster is made up of multiple Kafka brokers
Apache Kafka - Architecture
[Figure: producers and consumers connected to the Kafka cluster]
Kafka Zookeeper Coordination
[Figure: producers and consumers connected to four brokers, coordinated by ZooKeeper (ZK)]
A few important characteristics
 Fast
 Kafka can handle hundreds of megabytes of reads and writes per second from a large number of clients.
 Designed for real-time activity streaming.
 Distributed and highly scalable
 Kafka's cluster-centric design offers strong durability and fault-tolerance guarantees.
 Message partitions are spread over a cluster of machines.
 Durable
 Messages are persisted to disk and replicated within the cluster to prevent data loss.
 Each broker can handle terabytes of messages without performance impact.
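How keyed messages end up spread over partitions can be sketched with a hash of the message key. This is an illustration of the idea only: CRC32 stands in for the hash that a real Kafka client uses, and the three-partition count and truck keys are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # CRC32 is a stand-in hash for illustration; it is not Kafka's own partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(records, num_partitions=3):
    """Group (key, value) records by partition, preserving arrival order."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)


records = [
    ("truck-1", 50), ("truck-2", 61), ("truck-1", 52),
    ("truck-3", 47), ("truck-1", 55),
]
layout = assign(records)
```

All records with the same key land in the same partition, so their relative order is kept while different keys can be handled by different machines.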
Streaming Platform Use Case
Use Case – Truck Sensors
Kafka Trade-Offs (from IoT perspective)
Pros
 Stream processing, not just queuing
 High throughput
 Large scale
 High availability
 Long term storage and buffering
 Reprocessing of events
 Good integration with the rest of the enterprise
Cons
 Not built for tens of thousands of connections
 Requires a stable network and good infrastructure
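The "long term storage" and "reprocessing of events" points can be pictured as an append-only log that consumers read by offset. The TopicLog and Consumer classes below are hypothetical stand-ins written for this sketch, not Kafka client classes.

```python
class TopicLog:
    """Append-only log; reading never removes records."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset: int, max_records: int = 100):
        return self._records[offset:offset + max_records]


class Consumer:
    """Tracks its own position in the log."""

    def __init__(self, log: TopicLog):
        self._log = log
        self.offset = 0

    def poll(self, max_records: int = 100):
        batch = self._log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset: int):
        self.offset = offset


log = TopicLog()
for reading in [18, 19, 35, 20]:
    log.append(reading)

consumer = Consumer(log)
first_pass = consumer.poll()
consumer.seek(0)  # rewind and reprocess the same events with new logic
second_pass = [r for r in consumer.poll() if r > 30]
```

Because consuming does not delete anything, a new or corrected application can replay the stream from any stored offset.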
Collect Data (high level architecture)
How to integrate? MQTT+Kafka
End-to-End Integration from MQTT to Apache Kafka
MQTT Source and Sink Connectors for Kafka Connect
https://www.confluent.io/hub/
https://www.confluent.io/connector/kafka-connect-mqtt/
IoT Data Ingestion through MQTT into Kafka
Ref: https://github.com/gschmutz/stream-processing-workshop/tree/master/06-iot-data-ingestion-over-mqtt
IoT Big Data Architecture
Filtering
Analytics
Ref: https://mapr.com/blog/ml-iot-connected-medical-devices/
What is stream processing?
 Technology that lets users query a continuous data stream and detect conditions quickly, within a short time of receiving the data.
 The detection time ranges from a few milliseconds to minutes.
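A continuous query that detects a condition can be sketched as a generator that inspects each reading as it arrives. The three-reading window, the limit of 80.0, and the (timestamp, value) record shape are hypothetical choices for this example.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def detect_overheating(
    readings: Iterable[Tuple[float, float]],  # (timestamp, value)
    window: int = 3,
    limit: float = 80.0,
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, average) whenever the mean of the last `window` readings exceeds `limit`."""
    recent = deque(maxlen=window)
    for ts, value in readings:
        recent.append(value)
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > limit:
                yield ts, avg


stream = [(0.0, 70.0), (1.0, 75.0), (2.0, 80.0), (3.0, 90.0), (4.0, 95.0)]
alerts = list(detect_overheating(stream))
```

The alert is produced at the moment the triggering reading is seen, rather than after the whole data set has been collected.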
Stream processing tools
Two Types of Stream Processing
Native Streaming
 Every incoming record is processed as soon as it arrives, without waiting for others.
 Continuously running processes run forever, and every record passes through them to be processed.
 Lets the framework achieve the minimum possible latency.
 But fault tolerance is hard to achieve.
Micro-batching
 Incoming records are batched together every few seconds and then processed as a single mini-batch, with a delay of a few seconds.
 This comes at the cost of latency, and it does not feel like natural streaming.
Ref: https://medium.com/@chandanbaranwal/spark-streaming-vs-flink-vs-storm-vs-kafka-streams-vs-samza-choose-your-stream-processing-91ea3f04675b
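The latency cost of micro-batching can be made concrete with a small sketch. It assumes, as a simplification, that a batch is processed exactly at the end of its interval and that processing itself takes no time; the 2-second interval and the arrival times are hypothetical.

```python
from typing import Iterable, List, Tuple

Record = Tuple[float, str]  # (arrival_time_seconds, payload)


def micro_batches(records: Iterable[Record], interval: float) -> List[Tuple[float, List[str]]]:
    """Group records by the end time of the batch interval they arrive in."""
    batches = {}
    for arrival, payload in records:
        batch_end = (int(arrival // interval) + 1) * interval
        batches.setdefault(batch_end, []).append(payload)
    return sorted(batches.items())


def waiting_times(records: Iterable[Record], interval: float) -> List[float]:
    """How long each record waits for its batch to close (zero under native streaming)."""
    return [(int(arrival // interval) + 1) * interval - arrival for arrival, _ in records]


arrivals = [(0.2, "a"), (1.7, "b"), (2.1, "c"), (4.9, "d")]
batches = micro_batches(arrivals, interval=2.0)
waits = waiting_times(arrivals, interval=2.0)
```

Under these assumptions every record waits up to one full interval before processing starts, whereas a native streaming engine would start on each record at its arrival time.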
Apache Storm
 Distributed dataflow abstraction (spouts and bolts) for large-scale stream processing
 True streaming; good for simple event-based use cases
 Very low latency and high throughput
 No state management
 A good fit for a simple IoT-style, event-based alerting system
[Figure: spouts are the source of streams; bolts do the processing: filtering, functions, aggregations, joins, etc.]
Apache Flink
[Figure labels: Queries, Applications, Devices, etc.; Database, Stream, File / Object Storage; Historic Data, Streams, Application]
 Stateful computations over streams
 First true streaming framework with advanced features such as event-time processing and watermarks
 Low latency with high throughput
 Good for complex event-time processing, aggregation, stream joins, etc.
Architecture and Process Model
Ref: https://ci.apache.org/projects/flink/flink-docs-release-1.1/internals/general_arch.html
Ref: https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/hadoop.html
Apache Spark
 Spark is widely seen as the successor to Hadoop MapReduce
 Unified batch and stream processing over a batch runtime
 High throughput; fault tolerant by default due to its micro-batch nature
 Not true streaming; not suitable for low-latency requirements
 Good for streaming machine learning
Use Case
Ref: Muhammad Syafrudin, “Performance Analysis of IoT-Based Sensor, Big Data Processing, and Machine Learning Model for Real-Time Monitoring System in Automotive Manufacturing”
Real-Time Monitoring System in Automotive Manufacturing
Detect and diagnose abnormal events in a process
System design
Ref: Muhammad Syafrudin, “Performance Analysis of IoT-Based Sensor, Big Data Processing, and Machine Learning Model for Real-Time Monitoring System in Automotive Manufacturing”
Sensor Data
Performance evaluation in terms of latency with different numbers of clients (a) and servers (b), and throughput with different numbers of clients (c) and servers (d)
Thank you

 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 

Recently uploaded (20)

HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 

IoT Data Streaming

  • 1. IoT Data Streaming รัฐศิลป์ รานอกภานุวัชร์, D.ENG
  • 2. WHO AM I? • Undergraduate lecturer in Computer Engineering, Dhurakij Pundit University • Master's-level lecturer in Big Data Engineering, Dhurakij Pundit University • Adjunct lecturer for the Data Streaming and Real Time Analytics course, National Institute of Development Administration (NIDA) • Amazon cloud instructor at 9expert • Consultant to private companies on Big Data and Blockchain • Research: Blockchain, IoT and Big Data
  • 3. Outline • Internet of Things (IoT) • IoT Data Streaming • Collect Data • MQTT • Kafka • Stream processing platforms • Flink • Storm • Spark • Use-Case Examples
  • 4. Internet of Things (IoT) • Things: sensors and actuators (generate data streams) • Software and platform (data stream processing) • Visualization. Credit: https://orzota.com/industrial-iot/
  • 5. IoT data characteristics • Large-scale streaming data • Heterogeneity • Time and space correlation • High-noise data. IoT applications must support • High-speed data streams • Real-time or near-real-time actions • Sometimes joins ○ with static data ○ with historical data. Reference: M. Chen, S. Mao, Y. Zhang, and V. C. Leung, Big Data: Related Technologies, Challenges and Future Prospects. Springer, 2014
  • 6. What is Data Streaming? • Streaming data is transmitted continuously from one system (the producer) to another (the consumer), which reacts to the incoming data immediately (no delay). Ref: https://www.cisco.com/c/dam/en/us/products/collateral/analytics-automation-software/data-virtualization/r20-consultancy-combining-datastreaming-wp.pdf
  • 7. Distributed Streaming • Streaming: computations on never-ending “streams” of data records (“events”) • Distributed: computation spread across many machines. Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
  • 8. Stateless streaming • Every incoming record is independent of other records • There is no relation between different records, so each can be processed and persisted independently • E.g., map, filter, join with static data. Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
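The stateless operations named on this slide (map, filter, join with static data) can be sketched in plain Python. This is only an illustration, not code from the deck: the record fields, the 30 °C cut-off, and the `STATIC_LOCATIONS` lookup table are all hypothetical.

```python
from typing import Iterable, Iterator

# Hypothetical static reference data (the "static data" side of the join).
STATIC_LOCATIONS = {"s1": "boiler room", "s2": "loading dock"}


def stateless_pipeline(records: Iterable[dict]) -> Iterator[dict]:
    """Process each record on its own; nothing is remembered between records."""
    for record in records:
        # Filter: drop readings below a hypothetical 30 degree threshold.
        if record["temp_c"] < 30.0:
            continue
        # Map: derive a new field from the record itself.
        enriched = dict(record)
        enriched["temp_f"] = record["temp_c"] * 9 / 5 + 32
        # Join with static data: look up the sensor's location.
        enriched["location"] = STATIC_LOCATIONS.get(record["sensor"], "unknown")
        yield enriched


readings = [
    {"sensor": "s1", "temp_c": 25.0},
    {"sensor": "s2", "temp_c": 35.0},
    {"sensor": "s3", "temp_c": 40.0},
]
hot = list(stateless_pipeline(readings))
```

Because no record depends on another, the same function could run on many machines in parallel without any coordination.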
  • 9. Stateful Streaming • Computation and state, e.g., counters, windows of past events, state machines, trained ML models, or counts of each distinct word seen in records • The result depends on the history of the stream • Processing of an incoming record depends on the results of previously processed records. Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
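The word-count example mentioned on this slide can be sketched as follows. It is a minimal single-process illustration with made-up input records; a real stream processor would also checkpoint and partition this state.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def running_word_count(records: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (word, count so far) for every word; the Counter is the state."""
    counts = Counter()  # state that survives across records
    for record in records:
        for word in record.split():
            counts[word] += 1
            # The emitted value depends on every earlier record, not just this one.
            yield word, counts[word]


updates = list(running_word_count(["pump on", "pump off", "valve on"]))
```

The second occurrence of "pump" yields a count of 2 only because the first record was already processed, which is the dependence on stream history that the slide describes.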
  • 10. Event-Time Streaming • Data records are associated with timestamps (time series data) • Processing depends on timestamps • An event-time stream processor should give you the tools to reason about time • It must handle streams that arrive out of order. Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
  • 11. Event-Time Streaming • Because time matters, two notions of time are distinguished • Event time: the time at which events actually occurred • Processing time: the time at which events are observed in the system. Ref: Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Analytics
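The difference between event time and processing time shows up as soon as records arrive out of order. The sketch below counts the same five events in 10-second windows under both notions of time. The timestamps and the window size are invented for illustration, and the function works on a finished list; a real event-time processor decides when a window is complete using mechanisms such as watermarks, which this sketch does not model.

```python
from collections import defaultdict

WINDOW_SECONDS = 10  # hypothetical tumbling-window size


def window_start(timestamp: int) -> int:
    """Start of the tumbling window that contains the timestamp."""
    return timestamp - (timestamp % WINDOW_SECONDS)


def count_per_window(timestamps):
    counts = defaultdict(int)
    for ts in timestamps:
        counts[window_start(ts)] += 1
    return dict(counts)


# (event_time, processing_time) pairs in arrival order.
# The events that occurred at t=4 and t=9 arrive late.
arrivals = [(1, 2), (12, 13), (4, 14), (15, 16), (9, 21)]

by_event_time = count_per_window(event for event, _ in arrivals)
by_processing_time = count_per_window(seen for _, seen in arrivals)
```

Grouping by event time puts the late records back into the window in which they actually occurred, whereas grouping by processing time attributes them to whenever the system happened to see them.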
  • 12. Things are Producing Streaming Data • Smart city • Healthcare/medical devices • Connected cars • Logistics • Home automation • Airlines • Farmers • Smart machinery • Security systems
  • 13. IoT Big Data Architecture • Filtering • Analytics. Source: https://mapr.com/blog/ml-iot-connected-medical-devices/
  • 14. Collect Data (high-level architecture) • How to integrate? MQTT or Kafka
  • 15. Copyright: https://thenewstack.io/mqtt-protocol-iot/
  • 16. Messaging Systems: Publish/Subscribe • Producers call publish(topic, msg) on the publish/subscribe system • Consumers subscribe to topics (Topic 1, Topic 2, Topic 3) and receive each msg published to them
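The publish/subscribe pattern on this slide can be sketched with a tiny in-memory broker. This is a teaching sketch only: the `Broker` class and the topic names are hypothetical, and it is not the API of MQTT, Kafka, or any real client library.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, str], None]


class Broker:
    """Minimal in-memory publish/subscribe system (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, msg: str) -> int:
        """Deliver msg to every subscriber of topic; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, msg)
        return len(callbacks)


broker = Broker()
received = []
broker.subscribe("topic1", lambda topic, msg: received.append((topic, msg)))

delivered = broker.publish("topic1", "temperature=21.5")
ignored = broker.publish("topic2", "nobody is listening")
```

Producers and consumers never reference each other directly; they share only the topic name, which is what lets either side be added or removed independently.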
  • 17. Example • MQTT uses the pub/sub pattern to connect interested parties with each other • Arduino, Raspberry Pi
  • 18. MQTT - Publish/subscribe messaging protocol • MQTT is a machine-to-machine (M2M) protocol widely used in the Internet of Things • It uses the publish/subscribe paradigm, in contrast to HTTP, which is based on the request/response paradigm • Built on top of TCP/IP for constrained devices and unreliable networks • Many (open source) broker implementations • Many client libraries
  • 20. MQTT Architecture (clustering depends on broker implementation)
  • 21. MQTT Architecture (clustering depends on broker implementation)
  • 22. MQTT Trade-Offs. Pros • Lightweight • Simple API • Built for poor-connectivity / high-latency scenarios • Many client connections (tens of thousands per MQTT server). Cons • Queuing, not stream processing • No buffering • Not highly scalable • No good integration with the rest of the enterprise • No reprocessing of events
  • 23. Apache Kafka: a distributed streaming platform
  • 24. Kafka Data Streams Kafka is used to stream data into data lakes, applications and real-time stream analytics systems.
  • 25. Kafka architecture: Broker, Topics, Producers, and Consumers • A Kafka cluster is made up of multiple Kafka brokers
  • 26. Apache Kafka - Architecture • Producer • Consumer
  • 27. Apache Kafka - Architecture • Producer • Consumer
  • 33. A few important characteristics • Fast: Kafka can handle hundreds of megabytes of reads and writes per second from a large number of clients; it is designed for real-time activity streaming • Distributed and highly scalable: Kafka's cluster-centric design offers strong durability and fault-tolerance guarantees; message partitions are spread over a cluster of machines • Durable: messages are persisted to disk and replicated within the cluster to prevent data loss; each broker can handle terabytes of messages without performance impact
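Spreading message partitions over a cluster usually starts with mapping each message key to a partition. The sketch below shows the general idea of key-based partitioning. Note the assumptions: it uses `zlib.crc32` as a stand-in hash purely for illustration (Kafka's own default partitioner uses a different hash function), and the truck keys and partition count are hypothetical.

```python
import zlib
from collections import defaultdict


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition; the same key always lands on the same partition."""
    # crc32 is an illustrative stand-in, not the hash Kafka actually uses.
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3  # hypothetical topic configuration
messages = [
    (b"truck-1", "speed=80"),
    (b"truck-2", "speed=65"),
    (b"truck-1", "speed=82"),
    (b"truck-3", "speed=70"),
]

partitions = defaultdict(list)
for key, value in messages:
    partitions[choose_partition(key, NUM_PARTITIONS)].append((key, value))
```

Because all messages with the same key go to one partition, their relative order is preserved there, while different keys can be handled by different machines.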
  • 35. Use Case – Truck Sensors
  • 36. Kafka Trade-Offs (from an IoT perspective). Pros • Stream processing, not just queuing • High throughput • Large scale • High availability • Long-term storage and buffering • Reprocessing of events • Good integration with the rest of the enterprise. Cons • Not built for tens of thousands of connections • Requires a stable network and good infrastructure
  • 37. Collect Data (high-level architecture) • How to integrate? MQTT + Kafka
  • 38. End-to-End Integration from MQTT to Apache Kafka
  • 39. MQTT Source and Sink Connectors for Kafka Connect. https://www.confluent.io/hub/ https://www.confluent.io/connector/kafka-connect-mqtt/
  • 40. IoT Data Ingestion through MQTT into Kafka. Ref: https://github.com/gschmutz/stream-processing-workshop/tree/master/06-iot-data-ingestion-over-mqtt
  • 41. IoT Big Data Architecture • Filtering • Analytics. Ref: https://mapr.com/blog/ml-iot-connected-medical-devices/
  • 42. What is stream processing? • A technology that lets users query a continuous data stream and detect conditions quickly, within a short time of receiving the data • The detection time ranges from a few milliseconds to minutes
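Detecting a condition on a continuous stream can be sketched as a query that is re-evaluated on every new record. The example below raises an alert when the average of the last few readings exceeds a limit. The window size, threshold, and readings are hypothetical values chosen for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def detect_overheat(
    readings: Iterable[float], window_size: int = 3, threshold: float = 80.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, average) whenever the sliding-window average exceeds threshold."""
    window = deque(maxlen=window_size)  # keeps only the most recent readings
    for index, value in enumerate(readings):
        window.append(value)
        if len(window) == window_size:
            average = sum(window) / window_size
            if average > threshold:
                yield index, average


alerts = list(detect_overheat([70, 75, 85, 90, 95, 60]))
```

The check runs as each reading arrives, so the alert is produced at the record that first pushes the average over the limit rather than after a later batch job.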
  • 44. Two Types of Stream Processing
  • 45. Native Streaming • Every incoming record is processed as soon as it arrives, without waiting for others • Continuously running processes run forever, and every record passes through them to be processed • This lets the framework achieve the minimum possible latency • But fault tolerance is hard to achieve
  • 46. Micro-batching • Incoming records are batched together every few seconds and then processed as a single mini-batch, with a delay of a few seconds • This comes at the cost of latency, and it does not feel like natural streaming. https://medium.com/@chandanbaranwal/spark-streaming-vs-flink-vs-storm-vs-kafka-streams-vs-samza-choose-your-stream-processing-91ea3f04675b
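The latency cost of micro-batching compared with native streaming can be illustrated with a small simulation. It assumes an idealized model in which processing itself takes no time, records are sorted by arrival time, and each micro-batch is processed at the end of its interval. The arrival times and the 2-second interval are made up.

```python
from itertools import groupby


def native_latencies(records):
    """Native streaming: each record is handled on arrival (idealized zero wait)."""
    return [(value, 0.0) for _arrival, value in records]


def micro_batch_latencies(records, interval):
    """Micro-batching: a record waits until the end of its batch interval."""
    latencies = []
    # groupby requires records to be sorted by arrival time.
    for batch_index, batch in groupby(records, key=lambda r: int(r[0] // interval)):
        emit_time = (batch_index + 1) * interval
        for arrival, value in batch:
            latencies.append((value, emit_time - arrival))
    return latencies


# (arrival_time_seconds, value) pairs, sorted by arrival time.
records = [(0.5, "a"), (1.5, "b"), (2.5, "c"), (4.0, "d")]

native = native_latencies(records)
batched = micro_batch_latencies(records, interval=2.0)
```

In this model a record can wait up to one full interval before it is processed, which is the delay of a few seconds that the slide attributes to micro-batching.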
  • 47. Apache Storm • Distributed dataflow abstraction (spouts and bolts) for large-scale stream processing: spouts are the sources of streams, and bolts do the processing (filtering, functions, aggregations, joins, etc.) • It is true streaming and is good for simple event-based use cases • Very low latency and high throughput • No state management • A good fit for a simple IoT-style event-based alerting system
  • 48. Apache Flink • Stateful computations over streams • First true streaming framework with all the advanced features, such as event-time processing and watermarks • Low latency with high throughput • Good for complex event-time processing, aggregations, stream joins, etc. • Diagram labels: queries, applications, devices, etc.; database, stream, file / object storage; historic data; streams; application
  • 49. Architecture and Process Model. Ref: https://ci.apache.org/projects/flink/flink-docs-release-1.1/internals/general_arch.html
  • 51. Apache Spark • Spark has emerged as the true successor of Hadoop • Unified batch and stream processing over a batch runtime • High throughput; fault tolerant by default because of its micro-batch nature • Not true streaming, so not suitable for low-latency requirements • Good for stream machine learning
  • 53. Real-Time Monitoring System in Automotive Manufacturing • Detect and diagnose abnormal events in a process. Ref: Muhammad Syafrudin, “Performance Analysis of IoT-Based Sensor, Big Data Processing, and Machine Learning Model for Real-Time Monitoring System in Automotive Manufacturing”
  • 54. System design. Ref: Muhammad Syafrudin, “Performance Analysis of IoT-Based Sensor, Big Data Processing, and Machine Learning Model for Real-Time Monitoring System in Automotive Manufacturing”
  • 59. Performance evaluation in terms of latency with different numbers of clients (a) and servers (b), and throughput with different numbers of clients (c) and servers (d).