Living the Stream Dream with Pulsar and Spring Boot
https://events.geekle.us/java23
Feb 21, 2023
https://geekle.us/schedule/java23
For building Java applications, Spring is the universal answer as it supplies all the connectors and integrations one could want. The same is true of Apache Pulsar, which provides connectors, integrations and flexibility for any use case. Apache Pulsar has a robust native Java client library to use with Spring, as well as other protocol options.
Apache Pulsar provides a cloud-native, geo-replicated, unified messaging platform that supports many messaging paradigms. This lends itself well to upgrading existing applications, because Pulsar works with client libraries for WebSockets, MQTT, Kafka, JMS, AMQP and RocketMQ. In this talk I will build example applications that use several of these protocols, ranging from IoT to microservices to log analytics.
We will build Spring Boot microservices that use Apache Pulsar as a central data hub for communications and enrichment, using the new Spring for Pulsar framework.
https://github.com/spring-projects-experimental/spring-pulsar
We talked about it on Josh Long's podcast https://spring.io/blog/2022/09/15/a-bootiful-podcast-big-data-legend-former-pivot-and-friend-to-the-spring-community-tim-spann
Programming real-time applications with Java 17, Spring Boot and Apache Pulsar 2.11
1. Living the Stream Dream with Pulsar and Spring Boot
Tim Spann
Developer Advocate
2. Tim Spann
Developer Advocate
● FLiP(N) Stack = Flink, Pulsar and NiFi Stack
● Streaming Systems / Data Architect
● Experience:
○ 15+ years of experience with batch and streaming technologies including Pulsar, Flink, Spark, NiFi, Spring, Java, Big Data, Cloud, MXNet, Hadoop, Data Lakes, IoT and more.
6. Agenda
● Introduction
● What is Apache Pulsar?
● Spring Apps
● Pulsar
● AMQP
● MQTT
● Kafka
● Demo
7. Apache Pulsar has a vibrant community
● 560+ contributors
● 10,000+ commits
● 7,000+ Slack members
● 1,000+ organizations using Pulsar
8. [Diagram: a Pulsar topic/partition serves both streaming and messaging consumers through four subscription types: Exclusive, Failover, Shared and Key-Shared. The Failover subscription is annotated "In case of failure in Consumer B-0".]
9. Messages - the basic unit of Pulsar
Component: Description
● Value / data payload: The data carried by the message. All Pulsar messages contain raw bytes, although message data can also conform to data schemas.
● Key: Messages are optionally tagged with keys, which are used in partitioning and are also useful for things like topic compaction.
● Properties: An optional key/value map of user-defined properties.
● Producer name: The name of the producer that produced the message. If you do not specify a producer name, the default name is used.
● Sequence ID: Each Pulsar message belongs to an ordered sequence on its topic. The sequence ID of the message is its order in that sequence.
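The components above can be pictured as a small data type. The following is a minimal sketch in plain Java 17, not the Pulsar client API: SketchMessage, SketchProducer and the "default-producer" name are hypothetical names chosen for illustration, and a simple counter stands in for the sequence ID.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of the message components listed above; not the Pulsar client API. */
record SketchMessage(byte[] value, Optional<String> key, Map<String, String> properties,
                     String producerName, long sequenceId) {}

/** Hypothetical producer that fills in the producer name and a sequence ID. */
final class SketchProducer {
    private final String name;
    private long nextSequenceId = 0;

    SketchProducer(String name) {
        // Fall back to a default name when none is given (the default used here is made up).
        this.name = (name == null || name.isBlank()) ? "default-producer" : name;
    }

    /** Builds a message: raw bytes as the value, an optional key, and user-defined properties. */
    SketchMessage send(String key, String payload, Map<String, String> properties) {
        return new SketchMessage(
                payload.getBytes(StandardCharsets.UTF_8), // the value is always raw bytes
                Optional.ofNullable(key),                  // the key is optional
                Map.copyOf(properties),                    // immutable copy of the properties
                name,
                nextSequenceId++);                         // position in this producer's sequence
    }
}
```

For example, new SketchProducer(null).send("sensor-1", "21.5", Map.of("unit", "C")) yields a message with the default producer name, a present key and sequence ID 0; the next message sent by the same producer gets sequence ID 1.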
10. Apache Pulsar: Messaging vs Streaming
● Message Queueing - Queueing systems are ideal for work queues that do not require tasks to be performed in a particular order.
● Streaming - Streaming works best in situations where the order of messages is important.
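To make the distinction concrete, here is a small sketch in plain Java with no Pulsar involved. The class, the method names and the "set"/"add" event format are hypothetical. Independent tasks give the same result in any order, which is the work-queue case, while events that update running state give different results when reordered, which is the streaming case.

```java
import java.util.List;

final class QueueVsStreamSketch {
    /** Work-queue style: each task is independent, so any processing order gives the same total. */
    static long processTasks(List<Integer> tasks) {
        long total = 0;
        for (int t : tasks) {
            total += (long) t * t; // each task is just "square this number"
        }
        return total;
    }

    /** Stream style: each event updates running state, so the order changes the outcome. */
    static long applyEvents(List<String> events) {
        long balance = 0;
        for (String e : events) {
            String[] parts = e.split(" ");
            long amount = Long.parseLong(parts[1]);
            switch (parts[0]) {
                case "set": balance = amount; break;  // overwrite the state
                case "add": balance += amount; break; // build on the previous state
                default: throw new IllegalArgumentException("unknown event: " + e);
            }
        }
        return balance;
    }
}
```

processTasks returns 14 for both [1, 2, 3] and [3, 1, 2], whereas applyEvents returns 15 for ["set 10", "add 5"] and 10 for ["add 5", "set 10"].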
11. Pulsar Subscription Modes
Different subscription modes have different semantics:
● Exclusive/Failover - guaranteed order, single active consumer
● Shared - multiple active consumers, no order
● Key_Shared - multiple active consumers, order for given key
[Diagram: Producer 1 and Producer 2 publish to a Pulsar topic that has four subscriptions.]
● Subscription A (Exclusive): Consumer A
● Subscription B (Failover): Consumer B-1 and Consumer B-2, annotated "In case of failure in Consumer B-1"
● Subscription C (Shared): Consumer C-1 and Consumer C-2; messages shown in two groups, <K1,V10> <K2,V21> <K1,V12> and <K2,V20> <K1,V11> <K2,V22>
● Subscription D (Key-Shared): Consumer D-1 and Consumer D-2; messages shown in two groups, <K1,V10> <K1,V11> <K1,V12> and <K2,V20> <K2,V21> <K2,V22>
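The subscription semantics above can be simulated in a few lines of plain Java. This is a sketch under simplifying assumptions, not how the broker is implemented: Shared is modeled as round-robin, Key_Shared as a hash of the key modulo the number of consumers, and Exclusive/Failover as "everything goes to the first, active consumer". SubscriptionSketch, Mode, Msg and dispatch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubscriptionSketch {
    enum Mode { EXCLUSIVE, FAILOVER, SHARED, KEY_SHARED }

    /** A keyed message such as <K1,V10>. */
    record Msg(String key, int value) {}

    /** Returns, per consumer name, the values that consumer receives, in dispatch order. */
    static Map<String, List<Integer>> dispatch(Mode mode, List<String> consumers, List<Msg> messages) {
        Map<String, List<Integer>> received = new LinkedHashMap<>();
        for (String c : consumers) {
            received.put(c, new ArrayList<>());
        }
        int next = 0;
        for (Msg m : messages) {
            int index;
            switch (mode) {
                case SHARED:
                    // Assumption: round-robin across all active consumers, so no per-key order.
                    index = next++ % consumers.size();
                    break;
                case KEY_SHARED:
                    // Assumption: simple hash-modulo, so one key always maps to one consumer.
                    index = Math.floorMod(m.key().hashCode(), consumers.size());
                    break;
                default:
                    // Exclusive/Failover: a single active consumer receives everything in order.
                    index = 0;
            }
            received.get(consumers.get(index)).add(m.value());
        }
        return received;
    }
}
```

Assuming the publish order <K1,V10>, <K2,V20>, <K2,V21>, <K1,V11>, <K1,V12>, <K2,V22>, SHARED with two consumers splits the values into [10, 21, 12] and [20, 11, 22], while KEY_SHARED keeps each key together as [10, 11, 12] and [20, 21, 22]. FAILOVER delivers all six values, in order, to the first consumer.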
16. Schema Registry
[Diagram: producers and consumers exchange data through a schema registry that holds schema-1, schema-2 and schema-3 (each with value=Avro/Protobuf/JSON). Both sides keep a local cache for schemas, mapping schema data to a schema ID.]
● Producers: send (register) the schema if it is not in the local cache, then send schema-1 (value=Avro/Protobuf/JSON) data serialized per schema ID.
● Consumers: get the schema by ID if it is not in the local cache, then read schema-1 (value=Avro/Protobuf/JSON) data deserialized per schema ID.
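The register-once and look-up-once flow in the diagram can be sketched with in-memory maps. This is an illustration only, not Pulsar's schema registry API: the class names, the "id:payload" wire format and the calls counter are assumptions, and schemas are plain strings rather than real Avro, Protobuf or JSON definitions.

```java
import java.util.HashMap;
import java.util.Map;

final class SchemaRegistrySketch {
    /** Hypothetical in-memory registry standing in for the broker-side schema registry. */
    static final class Registry {
        private final Map<String, Integer> idsByDefinition = new HashMap<>();
        private final Map<Integer, String> definitionsById = new HashMap<>();
        int calls = 0; // counts round trips so that cache hits are observable

        int register(String definition) {
            calls++;
            return idsByDefinition.computeIfAbsent(definition, d -> {
                int id = definitionsById.size() + 1;
                definitionsById.put(id, d);
                return id;
            });
        }

        String getById(int id) {
            calls++;
            return definitionsById.get(id);
        }
    }

    static final class Producer {
        private final Registry registry;
        private final Map<String, Integer> localCache = new HashMap<>();

        Producer(Registry registry) { this.registry = registry; }

        /** Registers the schema only on a cache miss, then tags the payload with the schema ID. */
        String send(String schemaDefinition, String payload) {
            Integer id = localCache.get(schemaDefinition);
            if (id == null) {
                id = registry.register(schemaDefinition);
                localCache.put(schemaDefinition, id);
            }
            return id + ":" + payload;
        }
    }

    static final class Consumer {
        private final Registry registry;
        private final Map<Integer, String> localCache = new HashMap<>();

        Consumer(Registry registry) { this.registry = registry; }

        /** Fetches the schema by ID only on a cache miss, then "decodes" the payload with it. */
        String read(String wireMessage) {
            int sep = wireMessage.indexOf(':');
            int id = Integer.parseInt(wireMessage.substring(0, sep));
            String definition = localCache.get(id);
            if (definition == null) {
                definition = registry.getById(id);
                localCache.put(id, definition);
            }
            return definition + " -> " + wireMessage.substring(sep + 1);
        }
    }
}
```

Sending two messages with the same schema costs one registry call, and reading both costs one more, because the second send and the second read are served from the local caches.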