Apache Pulsar with MQTT for
Edge Computing
Timothy Spann | Developer Advocate
Big Mountain Data and Dev Conference
Tim Spann
Developer Advocate
● https://www.datainmotion.dev/
● https://github.com/tspannhw/SpeakerProfile
● https://dev.to/tspannhw
● https://sessionize.com/tspann/
DZone Zone Leader and Big Data
MVB Data DJay
Founded by the original developers of
Apache Pulsar and Apache BookKeeper,
StreamNative builds a cloud-native event
streaming platform that enables
enterprises to easily access data as
real-time event streams.
Apache Pulsar
Apache is an open source, cloud-native
distributed messaging and streaming platform.
What are the Benefits of Pulsar?
Data Durability
Scalability Geo-Replication
Multi-Tenancy
Unified Messaging
Model
Apache Pulsar
A Unified Messaging Platform
Message Queuing
Data Streaming
Pulsar is built for easy scale-out.
*Illustrations by Jack
Vanlightly
Key Milestones
2012 2016 2017 2018 2019 2020
Originally developed
inside Yahoo! as “Cloud
Messaging Service”
Pulsar is
committed to
Open Source
Pulsar is accepted into
the Apache Software
Foundation
Pulsar
becomes a
Top-Level
Project
● StreamNative is founded and
seed round raised.
● Tencent adopts Pulsar for
payment processing platform.
● BestPay adopts Pulsar for
payment processing.
● Pulsar hits 200 contributors.
● 2 global Pulsar conferences, 80+ speakers, 1,500+ attendees
● Pulsar hits 340 contributors
● StreamNative and OVHCloud launch Kafka on Pulsar (KoP)
● StreamNative + China Mobile launch AMQP on Pulsar (AoP)
● Pulsar Ecosystem expands - StreamNative Hub launches
● StreamNative Cloud launches on GCP and Alibaba Cloud
● StreamNative customer adoption continues - new
customers include Flipkart and Applied Materials
● Pulsar 2.7 + Transactions
● Pulsar Flink Connector 2.7
Major increase in adoption following
TLP designation in 2018
2021
● 3 global Pulsar conferences
● StreamNative hits 400
contributors (June).
● Pulsar surpasses Kafka in
monthly active contributors.
● Pulsar 2.8 + Exactly-Once
semantics
● StreamNative Platform launches
Apache Pulsar Overview
Enable Geo-Replicated Messaging
● Pub-Sub
● Geo-Replication
● Pulsar Functions
● Horizontal Scalability
● Multi-tenancy
● Tiered Persistent Storage
● Pulsar Connectors
● REST API
● CLI
● Many clients available
● Four Different Subscription Types
● Multi-Protocol Support
○ MQTT
○ AMQP
○ JMS
○ Kafka
○ ...
Pulsar’s Publish-Subscribe model
Broker
Subscription
Consumer 1
Consumer 2
Consumer 3
Topic
Producer 1
Producer 2
● Producers send messages.
● Topics are an ordered, named
channel that producers use to
transmit messages to subscribed
consumers.
● Messages belong to a topic and
contain an arbitrary payload.
● Brokers handle connections and
routes messages between producers
/ consumers.
● Subscriptions are named
configuration rules that determine
how messages are delivered to
consumers.
● Consumers receive messages.
What is the Pulsar Ecosystem?
● Functions and Connectors
○ Functions: Lightweight stream processing
○ Connectors: Part of “Pulsar IO”, includes “Source” and “Sink”
APIs
■ Files, Databases, Data tools, Cloud Services, etc
● Protocol Handlers
○ Allows Pulsar to handle additional protocols by an extendable
API running in the broker
■ AoP (AMQP), KoP (Kafka), MoP (MQTT)
Topics
Tenants
(Compliance)
Tenants
(Data Services)
Namespace
(Microservices)
Topic-1
(Cust Auth)
Topic-1
(Location Resolution)
Topic-2
(Demographics)
Topic-1
(Budgeted Spend)
Topic-1
(Acct History)
Topic-1
(Risk Detection)
Namespace
(ETL)
Namespace
(Campaigns)
Namespace
(ETL)
Tenants
(Marketing)
Namespace
(Risk Assessment)
Pulsar Instance
Pulsar Cluster
Pulsar subscription modes
Different subscription modes
have different semantics:
Exclusive/Failover - guaranteed
order, single active consumer
Shared - multiple active
consumers, no order
Key_Shared - multiple active
consumers, order for given key
Producer 1
Producer 2
Pulsar Topic
Subscription D
Consumer D-1
Consumer D-2
Key-Shared
<
K
1,
V
10
>
<
K
1,
V
11
>
<
K
1,
V
12
>
<
K
2
,V
2
0
>
<
K
2
,V
2
1>
<
K
2
,V
2
2
>
Subscription C
Consumer C-1
Consumer C-2
Shared
<
K
1,
V
10
>
<
K
2,
V
21
>
<
K
1,
V
12
>
<
K
2
,V
2
0
>
<
K
1,
V
11
>
<
K
2
,V
2
2
>
Subscription A Consumer A
Exclusive
Subscription B
Consumer B-1
Consumer B-2
In case of failure in
Consumer B-1
Failover
Reader and Batch
API
Pulsar
IO/Connectors
Stream Processor
Applications
Prebuilt Connectors Custom Connectors
Microservices or
Event-Driven Architecture
Pub/Sub
API
Publisher
Subscriber
Admin
API
Operators &
Administrators
Teams
Tenant
Pulsar API
Design
What is the Pulsar Ecosystem? (cont’d)
● Processing Engines
○ Supports modern processing engines
■ Flink and Spark, as well as Pulsar SQL (Presto/Trino)
● Offloaders
○ Allows data to be offloaded to cloud storage and used with
existing Pulsar APIs
■ S3, GCP Cloud Storage, HDFS, File (NFS), Azure Blob Storage
(in Pulsar 2.7.0)
MQTT on Pulsar (MoP)
MQTT on Pulsar (MoP) Configuration
messagingProtocols=mqtt
# directory
protocolHandlerDirectory=./protocols
#mqtt 3.1.1 - port / ip
mqttListeners=mqtt://127.0.0.1:1883
advertisedAddress=127.0.0.1
Kafka-on-Pulsar (Kop)
Pulsar Functions
Provides a simple API to:
● Receive a message (consume)
● Process the message using your own code
● Send a message (produce)
Takes care of the boilerplate code so there is no need to create
producers and consumers.
Moving Data In and Out of Pulsar
IO/Connectors are a simple way to integrate with external systems and move data
in and out of Pulsar.
● Built on top of Pulsar Functions
● Built-in connectors - hub.streamnative.io
Source Sink
Use Azure BlobStore offloader with
Pulsar
https://pulsar.apache.org/docs/en/tiered-storage-azure/
24
streamnative.io
Apache Pulsar - Other Sinks
https://hub.streamnative.io/connectors/cloud-storage-sink/2.5.1/
mongoDB
AWS Lambda
redis
AWS S3
GCS
Ingesting IoT Data via Java Pulsar
https://github.com/tspannhw/StreamingAnalyticsUsingFlinkSQL/
Ingesting IoT Data via Java Pulsar
Pulsar SQL
Presto/Trino workers
can read segments
directly from bookies
(or offloaded storage)
in parallel.
Bookie
1
Segment 1
Producer Consumer
Broker 1
Topic1-Part1
Broker 2
Topic1-Part2
Broker 3
Topic1-Part3
Segment 2 Segment 3 Segment 4 Segment X
Segment 1
Segment 1 Segment 1
Segment 3 Segment 3
Segment 3
Segment 2
Segment 2
Segment 2
Segment 4
Segment 4
Segment 4
Segment X
Segment X
Segment X
Bookie
2
Bookie
3
Query
Coordinator
...
...
SQL Worker SQL Worker SQL Worker
SQL Worker
Query
Topic
Metadata
Query Your Topics with Pulsar SQL (Trino)
StreamNative
Cloud
Powered by Apache Pulsar, StreamNative provides a cloud-native,
real-time messaging and streaming platform to support multi-cloud
and hybrid cloud strategies.
Built for Containers
Cloud Native
StreamNative Cloud
Flink SQL
Best Practice
Architectures
StreamNative Hub
StreamNative Cloud
Unified Batch and Stream COMPUTING
Batch
(Batch + Stream)
Unified Batch and Stream STORAGE
Offload
(Queuing + Streaming)
Apache Flink - Apache Pulsar - Apache NiFi <-> Events <-> Azure Data Stores
Tiered Storage
Pulsar
---
KoP
---
MoP
---
Websocket
---
HTTP
Pulsar
Sink
Pulsar
Sink
Streaming
Edge Gateway
Protocols
End-to-End Streaming FLiP(N) Apps
Demo
NVIDIA Device
MQTT from Python
pip3 install paho-mqtt
import paho.mqtt.client as mqtt
client = mqtt.Client("rpi4-iot")
row = { }
row['gasKO'] = str(readings)
json_string = json.dumps(row)
json_string = json_string.strip()
client.connect("pulsar-server.com", 1883, 180)
client.publish("persistent://public/default/mqtt-2", payload=json_string,
qos=0, retain=True)
37
Using NVIDIA Jetson Devices With Pulsar
https://dev.to/tspannhw/unboxing-the-most-amazing-edge-ai-devic
e-part-1-of-3-nvidia-jetson-xavier-nx-595k
https://github.com/tspannhw/minifi-xaviernx/
https://github.com/tspannhw/minifi-jetson-nano
https://github.com/tspannhw/Flip-iot
https://www.datainmotion.dev/2020/10/flank-streaming-edgeai-on-
new-nvidia.html
https://github.com/tspannhw/FLiP-Mobile/blob/30bcc1ec98fc31e0
39b51a06180d98545c1e0542/python3/enviro.py
Demo Walkthrough
{"entriesAddedCounter":1,"numberOfEntrie
s":1,"totalSize":651,"currentLedgerEntries":1,"
currentLedgerSize":651,"lastLedgerCreated
Timestamp":"2021-09-13T16:13:06.6-04:00","
waitingCursorsCount":0,"pendingAddEntri
esCount":0,"lastConfirmedEntry":"7076:0","
state":"LedgerOpened","ledgers":[{"ledgerId
":7076,"entries":0,"size":0,"offloaded":false,"u
nderReplicated":false}],"cursors":{},"schema
Ledgers":[],"compactedLedger":{"ledgerId":
-1,"entries":-1,"size":-1,"offloaded":false,"unde
rReplicated":false}}
Wrap-Up
Connect with the Community & Stay Up-To-Date
● Join the Pulsar Slack channel - Apache-Pulsar.slack.com
● Follow @streamnativeio and @apache_pulsar on Twitter
● Subscribe to Monthly Pulsar Newsletter for major news, events,
project updates, and resources in the Pulsar community
● https://github.com/tspannhw/FLiP-Energy/
● https://github.com/tspannhw/Flip-iot
● https://github.com/tspannhw/Flip-jetson
● https://github.com/streamnative/pulsar-flink
● https://github.com/streamnative/mop
● https://www.linkedin.com/pulse/2021-schedule-tim-spann/
● https://github.com/tspannhw/FLiP-InfluxDB/blob/main/README.md
● https://streamnative.io/en/blog/release/2021-04-20-flink-sql-on-streamnative-cloud
● https://docs.streamnative.io/cloud/stable/compute/flink-sql
Deeper Content
@PaasDev
https://www.pulsardeveloper.com/
timothyspann
Interested In Learning More?
Flink SQL Cookbook
The Github Source for Flink
SQL Demo
The GitHub Source for Demo
Manning's Apache Pulsar in
Action
O’Reilly Book
[10/21] Trino Summit
Resources Free eBooks Upcoming Events
43
Pulsar Summit Asia
November 20-21, 2021
Contact us at partners@pulsar-summit.org to become a sponsor or partner
Let’s Keep
in Touch!
Speaker Name
Speaker title
@PassDev
https://www.linkedin.com/in/timothyspann
https://github.com/tspannhw
Questions
Event Partners
Thank you to our Sponsors
Premier Sponsors

Big mountain data and dev conference apache pulsar with mqtt for edge computing

  • 1.
    Apache Pulsar withMQTT for Edge Computing Timothy Spann | Developer Advocate Big Mountain Data and Dev Conference
  • 2.
    Tim Spann Developer Advocate ●https://www.datainmotion.dev/ ● https://github.com/tspannhw/SpeakerProfile ● https://dev.to/tspannhw ● https://sessionize.com/tspann/ DZone Zone Leader and Big Data MVB Data DJay
  • 3.
    Founded by theoriginal developers of Apache Pulsar and Apache BookKeeper, StreamNative builds a cloud-native event streaming platform that enables enterprises to easily access data as real-time event streams.
  • 4.
  • 5.
    Apache is anopen source, cloud-native distributed messaging and streaming platform.
  • 6.
    What are theBenefits of Pulsar? Data Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model
  • 7.
  • 8.
    A Unified MessagingPlatform Message Queuing Data Streaming
  • 9.
    Pulsar is builtfor easy scale-out. *Illustrations by Jack Vanlightly
  • 10.
    Key Milestones 2012 20162017 2018 2019 2020 Originally developed inside Yahoo! as “Cloud Messaging Service” Pulsar is committed to Open Source Pulsar is accepted into the Apache Software Foundation Pulsar becomes a Top-Level Project ● StreamNative is founded and seed round raised. ● Tencent adopts Pulsar for payment processing platform. ● BestPay adopts Pulsar for payment processing. ● Pulsar hits 200 contributors. ● 2 global Pulsar conferences, 80+ speakers, 1,500+ attendees ● Pulsar hits 340 contributors ● StreamNative and OVHCloud launch Kafka on Pulsar (KoP) ● StreamNative + China Mobile launch AMQP on Pulsar (AoP) ● Pulsar Ecosystem expands - StreamNative Hub launches ● StreamNative Cloud launches on GCP and Alibaba Cloud ● StreamNative customer adoption continues - new customers include Flipkart and Applied Materials ● Pulsar 2.7 + Transactions ● Pulsar Flink Connector 2.7 Major increase in adoption following TLP designation in 2018 2021 ● 3 global Pulsar conferences ● StreamNative hits 400 contributors (June). ● Pulsar surpasses Kafka in monthly active contributors. ● Pulsar 2.8 + Exactly-Once semantics ● StreamNative Platform launches
  • 11.
    Apache Pulsar Overview EnableGeo-Replicated Messaging ● Pub-Sub ● Geo-Replication ● Pulsar Functions ● Horizontal Scalability ● Multi-tenancy ● Tiered Persistent Storage ● Pulsar Connectors ● REST API ● CLI ● Many clients available ● Four Different Subscription Types ● Multi-Protocol Support ○ MQTT ○ AMQP ○ JMS ○ Kafka ○ ...
  • 12.
    Pulsar’s Publish-Subscribe model Broker Subscription Consumer1 Consumer 2 Consumer 3 Topic Producer 1 Producer 2 ● Producers send messages. ● Topics are an ordered, named channel that producers use to transmit messages to subscribed consumers. ● Messages belong to a topic and contain an arbitrary payload. ● Brokers handle connections and routes messages between producers / consumers. ● Subscriptions are named configuration rules that determine how messages are delivered to consumers. ● Consumers receive messages.
  • 13.
    What is thePulsar Ecosystem? ● Functions and Connectors ○ Functions: Lightweight stream processing ○ Connectors: Part of “Pulsar IO”, includes “Source” and “Sink” APIs ■ Files, Databases, Data tools, Cloud Services, etc ● Protocol Handlers ○ Allows Pulsar to handle additional protocols by an extendable API running in the broker ■ AoP (AMQP), KoP (Kafka), MoP (MQTT)
  • 14.
    Topics Tenants (Compliance) Tenants (Data Services) Namespace (Microservices) Topic-1 (Cust Auth) Topic-1 (LocationResolution) Topic-2 (Demographics) Topic-1 (Budgeted Spend) Topic-1 (Acct History) Topic-1 (Risk Detection) Namespace (ETL) Namespace (Campaigns) Namespace (ETL) Tenants (Marketing) Namespace (Risk Assessment) Pulsar Instance Pulsar Cluster
  • 15.
    Pulsar subscription modes Differentsubscription modes have different semantics: Exclusive/Failover - guaranteed order, single active consumer Shared - multiple active consumers, no order Key_Shared - multiple active consumers, order for given key Producer 1 Producer 2 Pulsar Topic Subscription D Consumer D-1 Consumer D-2 Key-Shared < K 1, V 10 > < K 1, V 11 > < K 1, V 12 > < K 2 ,V 2 0 > < K 2 ,V 2 1> < K 2 ,V 2 2 > Subscription C Consumer C-1 Consumer C-2 Shared < K 1, V 10 > < K 2, V 21 > < K 1, V 12 > < K 2 ,V 2 0 > < K 1, V 11 > < K 2 ,V 2 2 > Subscription A Consumer A Exclusive Subscription B Consumer B-1 Consumer B-2 In case of failure in Consumer B-1 Failover
  • 16.
    Reader and Batch API Pulsar IO/Connectors StreamProcessor Applications Prebuilt Connectors Custom Connectors Microservices or Event-Driven Architecture Pub/Sub API Publisher Subscriber Admin API Operators & Administrators Teams Tenant Pulsar API Design
  • 17.
    What is thePulsar Ecosystem? (cont’d) ● Processing Engines ○ Supports modern processing engines ■ Flink and Spark, as well as Pulsar SQL (Presto/Trino) ● Offloaders ○ Allows data to be offloaded to cloud storage and used with existing Pulsar APIs ■ S3, GCP Cloud Storage, HDFS, File (NFS), Azure Blob Storage (in Pulsar 2.7.0)
  • 18.
  • 19.
    MQTT on Pulsar(MoP) Configuration messagingProtocols=mqtt # directory protocolHandlerDirectory=./protocols #mqtt 3.1.1 - port / ip mqttListeners=mqtt://127.0.0.1:1883 advertisedAddress=127.0.0.1
  • 20.
  • 21.
    Pulsar Functions Provides asimple API to: ● Receive a message (consume) ● Process the message using your own code ● Send a message (produce) Takes care of the boilerplate code so there is no need to create producers and consumers.
  • 22.
    Moving Data Inand Out of Pulsar IO/Connectors are a simple way to integrate with external systems and move data in and out of Pulsar. ● Built on top of Pulsar Functions ● Built-in connectors - hub.streamnative.io Source Sink
  • 23.
    Use Azure BlobStoreoffloader with Pulsar https://pulsar.apache.org/docs/en/tiered-storage-azure/
  • 24.
    24 streamnative.io Apache Pulsar -Other Sinks https://hub.streamnative.io/connectors/cloud-storage-sink/2.5.1/ mongoDB AWS Lambda redis AWS S3 GCS
  • 25.
    Ingesting IoT Datavia Java Pulsar https://github.com/tspannhw/StreamingAnalyticsUsingFlinkSQL/
  • 26.
    Ingesting IoT Datavia Java Pulsar
  • 27.
    Pulsar SQL Presto/Trino workers canread segments directly from bookies (or offloaded storage) in parallel. Bookie 1 Segment 1 Producer Consumer Broker 1 Topic1-Part1 Broker 2 Topic1-Part2 Broker 3 Topic1-Part3 Segment 2 Segment 3 Segment 4 Segment X Segment 1 Segment 1 Segment 1 Segment 3 Segment 3 Segment 3 Segment 2 Segment 2 Segment 2 Segment 4 Segment 4 Segment 4 Segment X Segment X Segment X Bookie 2 Bookie 3 Query Coordinator ... ... SQL Worker SQL Worker SQL Worker SQL Worker Query Topic Metadata
  • 28.
    Query Your Topicswith Pulsar SQL (Trino)
  • 29.
  • 30.
    Powered by ApachePulsar, StreamNative provides a cloud-native, real-time messaging and streaming platform to support multi-cloud and hybrid cloud strategies. Built for Containers Cloud Native StreamNative Cloud Flink SQL
  • 32.
  • 33.
    StreamNative Hub StreamNative Cloud UnifiedBatch and Stream COMPUTING Batch (Batch + Stream) Unified Batch and Stream STORAGE Offload (Queuing + Streaming) Apache Flink - Apache Pulsar - Apache NiFi <-> Events <-> Azure Data Stores Tiered Storage Pulsar --- KoP --- MoP --- Websocket --- HTTP Pulsar Sink Pulsar Sink Streaming Edge Gateway Protocols End-to-End Streaming FLiP(N) Apps
  • 34.
  • 35.
  • 36.
    MQTT from Python pip3install paho-mqtt import paho.mqtt.client as mqtt client = mqtt.Client("rpi4-iot") row = { } row['gasKO'] = str(readings) json_string = json.dumps(row) json_string = json_string.strip() client.connect("pulsar-server.com", 1883, 180) client.publish("persistent://public/default/mqtt-2", payload=json_string, qos=0, retain=True)
  • 37.
    37 Using NVIDIA JetsonDevices With Pulsar https://dev.to/tspannhw/unboxing-the-most-amazing-edge-ai-devic e-part-1-of-3-nvidia-jetson-xavier-nx-595k https://github.com/tspannhw/minifi-xaviernx/ https://github.com/tspannhw/minifi-jetson-nano https://github.com/tspannhw/Flip-iot https://www.datainmotion.dev/2020/10/flank-streaming-edgeai-on- new-nvidia.html https://github.com/tspannhw/FLiP-Mobile/blob/30bcc1ec98fc31e0 39b51a06180d98545c1e0542/python3/enviro.py
  • 38.
  • 39.
  • 40.
    Connect with theCommunity & Stay Up-To-Date ● Join the Pulsar Slack channel - Apache-Pulsar.slack.com ● Follow @streamnativeio and @apache_pulsar on Twitter ● Subscribe to Monthly Pulsar Newsletter for major news, events, project updates, and resources in the Pulsar community
  • 41.
    ● https://github.com/tspannhw/FLiP-Energy/ ● https://github.com/tspannhw/Flip-iot ●https://github.com/tspannhw/Flip-jetson ● https://github.com/streamnative/pulsar-flink ● https://github.com/streamnative/mop ● https://www.linkedin.com/pulse/2021-schedule-tim-spann/ ● https://github.com/tspannhw/FLiP-InfluxDB/blob/main/README.md ● https://streamnative.io/en/blog/release/2021-04-20-flink-sql-on-streamnative-cloud ● https://docs.streamnative.io/cloud/stable/compute/flink-sql Deeper Content @PaasDev https://www.pulsardeveloper.com/ timothyspann
  • 42.
    Interested In LearningMore? Flink SQL Cookbook The Github Source for Flink SQL Demo The GitHub Source for Demo Manning's Apache Pulsar in Action O’Reilly Book [10/21] Trino Summit Resources Free eBooks Upcoming Events
  • 43.
    43 Pulsar Summit Asia November20-21, 2021 Contact us at partners@pulsar-summit.org to become a sponsor or partner
  • 45.
    Let’s Keep in Touch! SpeakerName Speaker title @PassDev https://www.linkedin.com/in/timothyspann https://github.com/tspannhw
  • 46.
  • 47.
    Event Partners Thank youto our Sponsors Premier Sponsors