Building Real-Time Pulsar Apps
on K8
Tim Spann | Developer Advocate
Tim Spann
Developer Advocate
Tim Spann, Developer Advocate at StreamNative
● FLiP(N) Stack = Flink, Pulsar and NiFi Stack
● Streaming Systems & Data Architecture Expert
● Experience:
○ 15+ years of experience with streaming technologies including Pulsar,
Flink, Spark, NiFi, Big Data, Cloud, MXNet, IoT, Python and more.
○ Today, he helps to grow the Pulsar community sharing rich technical
knowledge and experience at both global conferences and through
individual conversations.
FLiP Stack Weekly
This week in Apache Flink, Apache Pulsar, Apache
NiFi, Apache Spark and open source friends.
https://bit.ly/32dAJft
streamnative.io
Passionate and dedicated team.
Founded by the original developers of
Apache Pulsar.
StreamNative helps teams to capture,
manage, and leverage data using
Pulsar’s unified messaging and
streaming platform.
Apache Pulsar adoption is
being driven by
organizations seeking
cloud-native architectures
and new uses cases.
Microservices
Apache Pulsar - Built for Containers / Modern Cloud
Cloud Native
Hybrid & Multi-Cloud Containers
Apache Pulsar + Kafka K8
https://docs.streamnative.io/platform/v1.3.0/quickstart
● “Bookies”
● Stores messages and cursors
● Messages are grouped in
segments/ledgers
● A group of bookies form an “ensemble” to
store a ledger
● “Brokers”
● Handles message routing and
connections
● Stateless, but with caches
● Automatic load-balancing
● Topics are composed of
multiple segments
●
● Stores metadata for both Pulsar
and BookKeeper
● Service discovery
Store
Messages
Metadata & Service
Discovery
Metadata & Service
Discovery
Metadata Store
(ZK, RocksDB, etcd, …)
Pulsar Cluster
Pulsar Cluster
CLUSTER
Global MetaData
Configuration Store
ZK3
ZK2
ZK1
Local
MetaData
Quorum
Bookie 0 Bookie 1 Bookie 2
BookKeeper
Bookie
Ensemble
Pulsar Broker 0 Pulsar Broker 1 Pulsar Broker 2
Apache Pulsar
Apache BookKeeper
Offloader & Tiered Storage
Offloading
Broker 0
Producer Consumer
Broker 1 Broker 2
Bookie 0 Bookie 1 Bookie 2 Bookie 3 Bookie 4
S S S S
T
1
T
2
T
3
T
4
T
0
S
● Consume messages from one
or more Pulsar topics.
● Apply user-supplied
processing logic to each
message.
● Publish the results of the
computation to another topic.
● Support multiple
programming languages
(Java, Python, Go)
● Can leverage 3rd-party
libraries
Pulsar Functions
from pulsar import Function
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import json
class Chat(Function):
def __init__(self):
pass
def process(self, input, context):
fields = json.loads(input)
sid = SentimentIntensityAnalyzer()
ss = sid.polarity_scores(fields["comment"])
row = { }
row['id'] = str(msg_id)
if ss['compound'] < 0.00:
row['sentiment'] = 'Negative'
else:
row['sentiment'] = 'Positive'
row['comment'] = str(fields["comment"])
json_string = json.dumps(row)
return json_string
Entire Function
Pulsar Python
NLP Function
https://github.com/tspannhw/pulsar-pychat-function
Pulsar Functions, along with Pulsar IO/Connectors, provide a powerful API for ingesting, transforming,
and outputting data.
Function Mesh, another StreamNative project, makes it easier for developers to create entire
applications built from sources, functions, and sinks all through a declarative API.
K8 Deploy
Apache Pulsar Resources
https://github.com/tspannhw/FLiPN-Conf42-KubeNative-2022

[Conf42-KubeNative] Building Real-time Pulsar Apps on K8

  • 1.
    Building Real-Time PulsarApps on K8 Tim Spann | Developer Advocate
  • 2.
    Tim Spann Developer Advocate TimSpann, Developer Advocate at StreamNative ● FLiP(N) Stack = Flink, Pulsar and NiFi Stack ● Streaming Systems & Data Architecture Expert ● Experience: ○ 15+ years of experience with streaming technologies including Pulsar, Flink, Spark, NiFi, Big Data, Cloud, MXNet, IoT, Python and more. ○ Today, he helps to grow the Pulsar community sharing rich technical knowledge and experience at both global conferences and through individual conversations.
  • 3.
    FLiP Stack Weekly Thisweek in Apache Flink, Apache Pulsar, Apache NiFi, Apache Spark and open source friends. https://bit.ly/32dAJft
  • 4.
    streamnative.io Passionate and dedicatedteam. Founded by the original developers of Apache Pulsar. StreamNative helps teams to capture, manage, and leverage data using Pulsar’s unified messaging and streaming platform.
  • 5.
    Apache Pulsar adoptionis being driven by organizations seeking cloud-native architectures and new uses cases. Microservices Apache Pulsar - Built for Containers / Modern Cloud Cloud Native Hybrid & Multi-Cloud Containers
  • 6.
    Apache Pulsar +Kafka K8 https://docs.streamnative.io/platform/v1.3.0/quickstart
  • 7.
    ● “Bookies” ● Storesmessages and cursors ● Messages are grouped in segments/ledgers ● A group of bookies form an “ensemble” to store a ledger ● “Brokers” ● Handles message routing and connections ● Stateless, but with caches ● Automatic load-balancing ● Topics are composed of multiple segments ● ● Stores metadata for both Pulsar and BookKeeper ● Service discovery Store Messages Metadata & Service Discovery Metadata & Service Discovery Metadata Store (ZK, RocksDB, etcd, …) Pulsar Cluster
  • 8.
    Pulsar Cluster CLUSTER Global MetaData ConfigurationStore ZK3 ZK2 ZK1 Local MetaData Quorum Bookie 0 Bookie 1 Bookie 2 BookKeeper Bookie Ensemble Pulsar Broker 0 Pulsar Broker 1 Pulsar Broker 2
  • 9.
    Apache Pulsar Apache BookKeeper Offloader& Tiered Storage Offloading Broker 0 Producer Consumer Broker 1 Broker 2 Bookie 0 Bookie 1 Bookie 2 Bookie 3 Bookie 4 S S S S T 1 T 2 T 3 T 4 T 0 S
  • 10.
    ● Consume messagesfrom one or more Pulsar topics. ● Apply user-supplied processing logic to each message. ● Publish the results of the computation to another topic. ● Support multiple programming languages (Java, Python, Go) ● Can leverage 3rd-party libraries Pulsar Functions
  • 11.
    from pulsar importFunction from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer import json class Chat(Function): def __init__(self): pass def process(self, input, context): fields = json.loads(input) sid = SentimentIntensityAnalyzer() ss = sid.polarity_scores(fields["comment"]) row = { } row['id'] = str(msg_id) if ss['compound'] < 0.00: row['sentiment'] = 'Negative' else: row['sentiment'] = 'Positive' row['comment'] = str(fields["comment"]) json_string = json.dumps(row) return json_string Entire Function Pulsar Python NLP Function https://github.com/tspannhw/pulsar-pychat-function
  • 12.
    Pulsar Functions, alongwith Pulsar IO/Connectors, provide a powerful API for ingesting, transforming, and outputting data. Function Mesh, another StreamNative project, makes it easier for developers to create entire applications built from sources, functions, and sinks all through a declarative API.
  • 13.
  • 14.