SlideShare a Scribd company logo
1 of 65
Apache Pulsar for ML
Tim Spann | Developer Advocate
AGENDA
● Apache Pulsar
● ML Functions
● Demo
Tim Spann
Developer Advocate
Tim Spann
Developer Advocate at StreamNative
● FLiP(N) Stack = Flink, Pulsar and NiFi Stack
● Streaming Systems & Data Architecture Expert
● Experience:
○ 15+ years of experience with streaming technologies including Pulsar, Flink, Spark, NiFi, Big
Data, Cloud, MXNet, IoT, Python and more.
○ Today, he helps to grow the Pulsar community sharing rich technical knowledge and experience
at both global conferences and through individual conversations.
This week in Apache Flink, Apache
Pulsar, Apache NiFi, Apache Spark
and open source friends.
https://bit.ly/32dAJft
FLiP Stack Weekly
Apache Pulsar
https://pulsar.apache.org/docs/en/reference-terminology/
What are the Benefits of Pulsar?
Data Durability
Scalability Geo-Replication
Multi-Tenancy
Unified Messaging
Model
Apache Pulsar is a Cloud-Native
Messaging and Event-Streaming Platform.
CREATED
Originally
developed inside
Yahoo! as Cloud
Messaging
Service
GROWTH
10x Contributors
10MM+ Downloads
Ecosystem Expands
Kafka on Pulsar
AMQP on Pulsar
Functions
. . .
2012 2016 2018 TODAY
APACHE TLP
Pulsar
becomes
Apache top
level project.
OPEN SOURCE
Pulsar
committed
to open source.
Apache Pulsar Timeline
Evolution of Pulsar Growth
Pulsar Has a Built-in Super Set of OSS
Features
Durability
Scalability Geo-Replication
Multi-Tenancy
Unified Messaging
Model
Reduced Vendor Dependency
Functions
Open-Source Features
Apache Pulsar is built to support legacy applications, handle the
needs of modern apps, and supports NextGen applications
Support legacy workloads.
Compatible with popular
messaging and streaming tools.
Legacy
Built for today's real-time
event driven applications.
Modern
Scalable, adaptive architecture
ready for the future of real-time
streaming.
NextGen
Apache Pulsar has a vibrant community
560+
Contributors
10,000+
Commits
7,000+
Slack Members
1,000+
Organizations
Using Pulsar
It is often assumed that Pulsar and Kafka have equal capabilities. In reality,
Pulsar offers a superset of Kafka.
● Pulsar is streaming and queuing together
● Pulsar is cloud-native with stateless brokers
● Natively includes geo-replication, multi-tenancy, and end-to-end
security out of the box
● Pulsar provides automated rebalancing
● Pulsar offers 100X lower latency w/ 2.5 greater throughput than Kafka
Advantages of Apache Pulsar
Apache Pulsar features
Cloud native with decoupled
storage and compute layers.
Built-in compatibility with your
existing code and messaging
infrastructure.
Geographic redundancy and high
availability included.
Centralized cluster management
and oversight.
Elastic horizontal and vertical
scalability.
Seamless and instant partitioning
rebalancing with no downtime.
Flexible subscription model
supports a wide array of use cases.
Compatible with the tools you use
to store, analyze, and process data.
● “Bookies”
● Stores messages and cursors
● Messages are grouped in
segments/ledgers
● A group of bookies form an
“ensemble” to store a ledger
● “Brokers”
● Handles message routing and
connections
● Stateless, but with caches
● Automatic load-balancing
● Topics are composed of
multiple segments
●
● Stores metadata for both
Pulsar and BookKeeper
● Service discovery
Store
Messages
Metadata &
Service Discovery
Metadata &
Service Discovery
Pulsar Cluster
Metadata
Storage
Pulsar Cluster
Component Description
Value /
Data payload
The data carried by the message. All Pulsar messages contain raw bytes,
although message data can also conform to data schemas.
Key Messages are optionally tagged with keys, used in partitioning and also is useful
for things like topic compaction.
Properties An optional key/value map of user-defined properties.
Producer name The name of the producer who produces the message. If you do not specify a
producer name, the default name is used.
Sequence ID Each Pulsar message belongs to an ordered sequence on its topic. The
sequence ID of the message is its order in that sequence.
Messages - the Basic Unit of Apache Pulsar
Different subscription modes
have different semantics:
Exclusive/Failover -
guaranteed order, single active
consumer
Shared - multiple active
consumers, no order
Key_Shared - multiple active
consumers, order for given key
Producer 1
Producer 2
Pulsar Topic
Subscription D
Consumer D-1
Consumer D-2
Key-Shared
<
K
1,
V
10
>
<
K
1,
V
11
>
<
K
1,
V
12
>
<
K
2
,V
2
0
>
<
K
2
,V
2
1>
<
K
2
,V
2
2
>
Subscription C
Consumer C-1
Consumer C-2
Shared
<
K
1,
V
10
>
<
K
2
,V
2
1>
<
K
1,
V
12
>
<
K
2
,V
2
0
>
<
K
1,
V
11
>
<
K
2
,V
2
2
>
Subscription A Consumer A
Exclusive
Subscription B
Consumer B-1
Consumer B-2
In case of failure in
Consumer B-1
Failover
Apache Pulsar Subscription Modes
Streaming
Consumer
Consumer
Consumer
Subscription
Shared
Failover
Consumer
Consumer
Subscription
In case of failure in
Consumer B-0
Consumer
Consumer
Subscription
Exclusive
X
Consumer
Consumer
Key-Shared
Subscription
Pulsar
Topic/Partition
Messaging
Unified Messaging Model
Simplify your data infrastructure and
enable new use cases with queuing and
streaming capabilities in one platform.
Multi-tenancy
Enable multiple user groups to share the
same cluster, either via access control, or
in entirely different namespaces.
Scalability
Decoupled data computing and storage
enable horizontal scaling to handle data
scale and management complexity.
Geo-replication
Support for multi-datacenter replication
with both asynchronous and
synchronous replication for built-in
disaster recovery.
Tiered storage
Enable historical data to be offloaded to
cloud-native storage and store event
streams for indefinite periods of time.
Apache Pulsar Benefits
Messaging Use Cases Streaming Use Cases
Service x commands service y to make some
change.
Example: order service removing item from
inventory service
Moving large amounts of data to another service
(real-time ETL).
Example: logs to elasticsearch
Distributing messages that represent work
among n workers.
Example: order processing not in main “thread”
Periodic jobs moving large amounts of data and
aggregating to more traditional stores.
Example: logs to s3
Sending “scheduled” messages.
Example: notification service for marketing emails
or push notifications
Computing a near real-time aggregate of a message
stream, split among n workers, with order being
important.
Example: real-time analytics over page views
Messaging vs Streaming
Messaging Use Case Streaming Use Case
Retention The amount of data retained is
relatively small - typically only a day
or two of data at most.
Large amounts of data are retained,
with higher ingest volumes and
longer retention periods.
Throughput Messaging systems are not designed
to manage big “catch-up” reads.
Streaming systems are designed to
scale and can handle use cases
such as catch-up reads.
Differences in Consumption
byte[] msgIdBytes = // Some byte
array
MessageId id =
MessageId.fromByteArray(msgIdBytes);
Reader<byte[]> reader =
pulsarClient.newReader()
.topic(topic)
.startMessageId(id)
.create();
Create a reader that will read from
some message between earliest and
latest.
Reader
Apache Pulsar Reader Interface
● New Consumer type added in Pulsar 2.10 that provides a
continuously updated key-value map view of compacted topic data.
● An abstraction of a changelog stream from a primary-keyed table,
where each record in the changelog stream is an update on the
primary-keyed table with the record key as the primary key.
● READ ONLY DATA STRUCTURE!
Apache Pulsar TableView
Schema Registry
schema-1 (value=Avro/Protobuf/JSON) schema-2 (value=Avro/Protobuf/JSON) schema-3
(value=Avro/Protobuf/JSON)
Schema
Data
ID
Local Cache
for Schemas
+
Schema
Data
ID +
Local Cache
for Schemas
Send schema-1
(value=Avro/Protobuf/JSON) data
serialized per schema ID
Send (register)
schema (if not in
local cache)
Read schema-1
(value=Avro/Protobuf/JSON) data
deserialized per schema ID
Get schema by ID (if
not in local cache)
Producers Consumers
Schema Registry
● Utilizing JSON Data with a JSON Schema
● Consistency, Contracts, Clean Data
● This enables easy SQL:
○ Pulsar SQL (Presto SQL)
○ Flink SQL
○ Spark Structured Streaming
Use Schemas
• Functions - Lightweight Stream
Processing (Java, Python, Go)
• Connectors - Sources & Sinks
(Cassandra, Kafka, …)
• Protocol Handlers - AoP (AMQP), KoP
(Kafka), MoP (MQTT)
• Processing Engines - Flink, Spark,
Presto/Trino via Pulsar SQL
• Data Offloaders - Tiered Storage - (S3)
Sources, Sinks and Processing
Kafka on Pulsar (KoP)
MQTT on Pulsar (MoP)
AMQP on Pulsar (AoP)
Use Apache Pulsar For Ingest
Use Apache Pulsar To Stream to Lakehouses
Pulsar SQL
Presto/Trino workers can
read segments directly
from bookies (or
offloaded storage) in
parallel.
Bookie
1
Segment 1
Producer Consumer
Broker 1
Topic1-Part1
Broker 2
Topic1-Part2
Broker 3
Topic1-Part3
Segment 2 Segment 3 Segment 4 Segment X
Segment 1
Segment 1 Segment 1
Segment 3 Segment 3
Segment 3
Segment 2
Segment 2
Segment 2
Segment 4
Segment 4
Segment 4
Segment X
Segment X
Segment X
Bookie
2
Bookie
3
Query
Coordinator
...
...
SQL Worker SQL Worker SQL Worker
SQL Worker
Query
Topic
Metadata
● Event-Driven services that are interested in events from other
services can use Pulsar subscriptions to receive new events in
the stream.
● Pulsar supports both linear (exclusive) and filtered subscriptions
(key-shared).
● You can also use StreamNative’s filtering & routing service to
provide filtered subscriptions.
Subscriptions
Functions
● Visual Question and Answer
● Natural Language Processing
● Sentiment Analysis
● Text Classification
● Named Entity Recognition
● Content-based
Recommendations
• Predictive
Maintenance
• Fault Detection
• Fraud Detection
• Time-Series
Predictions
• Naive Bayes
Apache Pulsar Functions for ML Models
Benefits of Pulsar Functions
• Allows you to focus on the business logic.
• Sets up producer/consumer for you, eliminating the need for
boilerplate code.
• Handles message consumption and publication for you.
• Run as threads, processes or Kubernetes Pods inside Pulsar.
No need for another processing framework. Can scale up
automatically.
Event-Based Microservices Application
• A basic order entry use case
for a food delivery service
will involve several different
microservices.
• The communicate via
messages sent to Pulsar
topics.
• A service can also subscribe
to the events published by
another service.
Function Mesh
Pulsar Functions, along with Pulsar
IO/Connectors, provide a powerful API for
ingesting, transforming, and outputting data.
Function Mesh, another StreamNative
project, makes it easier for developers to
create entire applications built from sources,
functions, and sinks all through a declarative
API.
ML
• Visual Question and Answer
• NLP (Natural Language Processing)
• Sentiment Analysis
• Text Classification
• Named Entity Recognition
• Content-based Recommendations
• Predictive Maintenance
• Fault Detection
• Fraud Detection
• Time-Series Predictions
• Naive Bayes
Why Pulsar Functions for Microservices?
Desired Characteristic Pulsar Functions…
Highly maintainable and testable Are small pieces of code written in popular
languages such as Java, Python, or Go. They can be
easily maintained in source control repositories and
tested with existing frameworks automatically.
Loosely coupled with other
services
Are not directly linked to one another and
communicate via messages.
Independently deployable Are designed to be deployed independently
Can be developed by a small team Are often developed by a single developer.
Inter-service Communication Support all message patterns using Pulsar as the
underlying message bus.
Deployment & Composition Can run as individual threads, processes, or K8s
pods. The Function Mesh allows you to deploy
multiple Pulsar Functions as a single unit.
Built-in
Back
Pressure
Producer<String> producer = client.newProducer(Schema.STRING)
.topic("hellotopic")
.blockIfQueueFull(true) // enable blocking
// when queue is full
.maxPendingMessages(10) // max queue size
.create();
// During Back Pressure: the sendAsync call blocks
// with no room in queues
producer.newMessage()
.key("mykey")
.value("myvalue")
.sendAsync(); // can be a blocking call
ML Java Coding (Deep Java Library)
● Lightweight computation similar
to AWS Lambda.
● Specifically designed to use
Apache Pulsar as a message
bus.
● Function runtime can be
located within Pulsar Broker.
● Java Functions
A serverless event
streaming framework
Pulsar Functions
● Consume messages from one or
more Pulsar topics.
● Apply user-supplied processing
logic to each message.
● Publish the results of the
computation to another topic.
● Support multiple programming
languages (Java, Python, Go)
● Can leverage 3rd-party libraries
to support the execution of ML
models on the edge.
Pulsar Functions
Simple Native Java Function
Simple Function with Pulsar SDK
from pulsar import Function
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import json
class Chat(Function):
def __init__(self):
pass
def process(self, input, context):
fields = json.loads(input)
sid = SentimentIntensityAnalyzer()
ss = sid.polarity_scores(fields["comment"])
row = { }
row['id'] = str(msg_id)
if ss['compound'] < 0.00:
row['sentiment'] = 'Negative'
else:
row['sentiment'] = 'Positive'
row['comment'] = str(fields["comment"])
json_string = json.dumps(row)
return json_string
Entire Function
Pulsar Python NLP Function
Starting a Function - Distributed Cluster
Once compiled into a JAR, start a Pulsar Function in a distributed cluster:
Pulsar Functions
● Route
● Enrich
● Convert
● Lookups
● Run
Machine Learning
● Logging
● Auditing
● Parse
● Split
● Convert
Pulsar Functions
● Consume messages from one
or more Pulsar topics.
● Apply user-supplied
processing logic to each
message.
● Publish the results of the
computation to another topic.
● Support multiple programming
languages (Java, Python, Go)
● Can leverage 3rd-party
libraries to support the
execution of ML models on the
edge.
Building Real-Time Apps Requires a Team
NLP Streaming Architecture
Learn More
Deploying AI With an
Event-Driven
Platform
https://dzone.com/trendreports/enterprise-ai-1
Apache Pulsar in Action
http://tinyurl.com/bdha5p4r
Please enjoy David’s complete book which is the ultimate guide to Pulsar.
Now Available
On-Demand
Pulsar Training
Academy.StreamNative.io
https://streamnative.io/blog/engineering/2021-11-17-building-edge-applications-with-apache-pulsar/
Github
● https://github.com/tspannhw/FLiP-Py-ADS-B
● https://github.com/tspannhw/pulsar-adsb-function
● https://github.com/tspannhw/pulsar-weather-function
● https://github.com/tspannhw/spring-pulsar-airquality
● https://github.com/tspannhw/pulsar-airquality-function
● https://github.com/tspannhw/FLiP-Current22-LetsMonitorAllTheThings
● https://github.com/tspannhw/FLiP-Transit
● https://github.com/tspannhw/SmartTransit
● https://github.com/tspannhw/airquality-kafka-consumer
Scan the QR code
to learn more about
Apache Pulsar and
StreamNative.
Scan the QR code
to build your own
apps today.
Tim Spann
Developer Advocate
@PaaSDev
https://www.linkedin.com/in/timothyspann
https://github.com/tspannhw
Let’s Keep in Touch

More Related Content

Similar to Timothy Spann: Apache Pulsar for ML

Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...Timothy Spann
 
Princeton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Princeton Dec 2022 Meetup_ NiFi + Flink + PulsarPrinceton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Princeton Dec 2022 Meetup_ NiFi + Flink + PulsarTimothy Spann
 
Let's keep it simple and streaming
Let's keep it simple and streamingLet's keep it simple and streaming
Let's keep it simple and streamingTimothy Spann
 
Let's keep it simple and streaming.pdf
Let's keep it simple and streaming.pdfLet's keep it simple and streaming.pdf
Let's keep it simple and streaming.pdfVMware Tanzu
 
Fast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache PulsarFast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache PulsarTimothy Spann
 
Apache Pulsar Development 101 with Python
Apache Pulsar Development 101 with PythonApache Pulsar Development 101 with Python
Apache Pulsar Development 101 with PythonTimothy Spann
 
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)Timothy Spann
 
Python web conference 2022 apache pulsar development 101 with python (f li-...
Python web conference 2022   apache pulsar development 101 with python (f li-...Python web conference 2022   apache pulsar development 101 with python (f li-...
Python web conference 2022 apache pulsar development 101 with python (f li-...Timothy Spann
 
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)Timothy Spann
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingNear Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingDibyendu Bhattacharya
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsTimothy Spann
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPCMax Alexejev
 
Streaming Data with Apache Kafka
Streaming Data with Apache KafkaStreaming Data with Apache Kafka
Streaming Data with Apache KafkaMarkus Günther
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureTimothy Spann
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeTimothy Spann
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...Timothy Spann
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaRicardo Bravo
 

Similar to Timothy Spann: Apache Pulsar for ML (20)

Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
 
Princeton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Princeton Dec 2022 Meetup_ NiFi + Flink + PulsarPrinceton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Princeton Dec 2022 Meetup_ NiFi + Flink + Pulsar
 
Let's keep it simple and streaming
Let's keep it simple and streamingLet's keep it simple and streaming
Let's keep it simple and streaming
 
Let's keep it simple and streaming.pdf
Let's keep it simple and streaming.pdfLet's keep it simple and streaming.pdf
Let's keep it simple and streaming.pdf
 
Fast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache PulsarFast Streaming into Clickhouse with Apache Pulsar
Fast Streaming into Clickhouse with Apache Pulsar
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache Pulsar Development 101 with Python
Apache Pulsar Development 101 with PythonApache Pulsar Development 101 with Python
Apache Pulsar Development 101 with Python
 
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Python web conference 2022 apache pulsar development 101 with python (f li-...
Python web conference 2022   apache pulsar development 101 with python (f li-...Python web conference 2022   apache pulsar development 101 with python (f li-...
Python web conference 2022 apache pulsar development 101 with python (f li-...
 
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingNear Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPC
 
Streaming Data with Apache Kafka
Streaming Data with Apache KafkaStreaming Data with Apache Kafka
Streaming Data with Apache Kafka
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 

More from Edunomica

Daniel Samaan: ChatGPT and the Future of Work
Daniel Samaan: ChatGPT and the Future of WorkDaniel Samaan: ChatGPT and the Future of Work
Daniel Samaan: ChatGPT and the Future of WorkEdunomica
 
Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...
Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...
Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...Edunomica
 
Zack Johnson: Session title: People Analytics: the epicenter of management an...
Zack Johnson: Session title: People Analytics: the epicenter of management an...Zack Johnson: Session title: People Analytics: the epicenter of management an...
Zack Johnson: Session title: People Analytics: the epicenter of management an...Edunomica
 
Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...
Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...
Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...Edunomica
 
Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...
Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...
Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...Edunomica
 
Kevin Martin: The New Corporate Currency
Kevin Martin: The New Corporate CurrencyKevin Martin: The New Corporate Currency
Kevin Martin: The New Corporate CurrencyEdunomica
 
Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...
Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...
Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...Edunomica
 
Kevin Martin: Empowering Your Board with the People Analytics That Matter
Kevin Martin: Empowering Your Board with the People Analytics That MatterKevin Martin: Empowering Your Board with the People Analytics That Matter
Kevin Martin: Empowering Your Board with the People Analytics That MatterEdunomica
 
Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...
Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...
Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...Edunomica
 
Alan Susi: Organizational Health: A People Team’s path to Minimum Viable ‘Wo...
Alan Susi: Organizational Health:  A People Team’s path to Minimum Viable ‘Wo...Alan Susi: Organizational Health:  A People Team’s path to Minimum Viable ‘Wo...
Alan Susi: Organizational Health: A People Team’s path to Minimum Viable ‘Wo...Edunomica
 
Cole Napper: Are you ready for generative AI in people analytics?
Cole Napper: Are you ready for generative AI in people analytics?Cole Napper: Are you ready for generative AI in people analytics?
Cole Napper: Are you ready for generative AI in people analytics?Edunomica
 
Fahim Karim: Attrition Prevention
Fahim Karim: Attrition PreventionFahim Karim: Attrition Prevention
Fahim Karim: Attrition PreventionEdunomica
 
Taras Filatov: Building your own metaverse & NFT app
Taras Filatov: Building your own metaverse & NFT appTaras Filatov: Building your own metaverse & NFT app
Taras Filatov: Building your own metaverse & NFT appEdunomica
 
Alex Poon: Should you gamify community contributions?
Alex Poon: Should you gamify community contributions?Alex Poon: Should you gamify community contributions?
Alex Poon: Should you gamify community contributions?Edunomica
 
Julio Holon: Decentralised colaboration
Julio Holon: Decentralised colaborationJulio Holon: Decentralised colaboration
Julio Holon: Decentralised colaborationEdunomica
 
Startup Presentation: Gaianet
Startup Presentation: GaianetStartup Presentation: Gaianet
Startup Presentation: GaianetEdunomica
 
Shawn Grubb: Minnows v. whales: Quadratic Governance to the rescue
Shawn Grubb: Minnows v. whales: Quadratic Governance to the rescueShawn Grubb: Minnows v. whales: Quadratic Governance to the rescue
Shawn Grubb: Minnows v. whales: Quadratic Governance to the rescueEdunomica
 
Joachim Stroh: Hypha DAO, the 3rd generation of DAOs
Joachim Stroh: Hypha DAO, the 3rd generation of DAOsJoachim Stroh: Hypha DAO, the 3rd generation of DAOs
Joachim Stroh: Hypha DAO, the 3rd generation of DAOsEdunomica
 
Vikram Aditya: Biggest Opportunity Areas in the DAOverse
Vikram Aditya: Biggest Opportunity Areas in the DAOverseVikram Aditya: Biggest Opportunity Areas in the DAOverse
Vikram Aditya: Biggest Opportunity Areas in the DAOverseEdunomica
 
Tamara Helenius: The Commons are Coming
Tamara Helenius: The Commons are ComingTamara Helenius: The Commons are Coming
Tamara Helenius: The Commons are ComingEdunomica
 

More from Edunomica (20)

Daniel Samaan: ChatGPT and the Future of Work
Daniel Samaan: ChatGPT and the Future of WorkDaniel Samaan: ChatGPT and the Future of Work
Daniel Samaan: ChatGPT and the Future of Work
 
Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...
Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...
Fanni Kadocsa: Unlocking the power of capability building: Maximizing the imp...
 
Zack Johnson: Session title: People Analytics: the epicenter of management an...
Zack Johnson: Session title: People Analytics: the epicenter of management an...Zack Johnson: Session title: People Analytics: the epicenter of management an...
Zack Johnson: Session title: People Analytics: the epicenter of management an...
 
Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...
Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...
Anita Zbieg: How to make data actionable? Lessons from the teams on how to tu...
 
Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...
Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...
Aizhan Tursunbayeva: The ethics of people analytics: risks, opportunities and...
 
Kevin Martin: The New Corporate Currency
Kevin Martin: The New Corporate CurrencyKevin Martin: The New Corporate Currency
Kevin Martin: The New Corporate Currency
 
Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...
Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...
Catherine Coppinger: Why Anchor Days Are Sinking Productivity & What to Do Ab...
 
Kevin Martin: Empowering Your Board with the People Analytics That Matter
Kevin Martin: Empowering Your Board with the People Analytics That MatterKevin Martin: Empowering Your Board with the People Analytics That Matter
Kevin Martin: Empowering Your Board with the People Analytics That Matter
 
Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...
Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...
Aizhan Tursunbayeva: Navigating Opportunities and Risks: A Responsible Approa...
 
Alan Susi: Organizational Health: A People Team’s path to Minimum Viable ‘Wo...
Alan Susi: Organizational Health:  A People Team’s path to Minimum Viable ‘Wo...Alan Susi: Organizational Health:  A People Team’s path to Minimum Viable ‘Wo...
Alan Susi: Organizational Health: A People Team’s path to Minimum Viable ‘Wo...
 
Cole Napper: Are you ready for generative AI in people analytics?
Cole Napper: Are you ready for generative AI in people analytics?Cole Napper: Are you ready for generative AI in people analytics?
Cole Napper: Are you ready for generative AI in people analytics?
 
Fahim Karim: Attrition Prevention
Fahim Karim: Attrition PreventionFahim Karim: Attrition Prevention
Fahim Karim: Attrition Prevention
 
Taras Filatov: Building your own metaverse & NFT app
Taras Filatov: Building your own metaverse & NFT appTaras Filatov: Building your own metaverse & NFT app
Taras Filatov: Building your own metaverse & NFT app
 
Alex Poon: Should you gamify community contributions?
Alex Poon: Should you gamify community contributions?Alex Poon: Should you gamify community contributions?
Alex Poon: Should you gamify community contributions?
 
Julio Holon: Decentralised colaboration
Julio Holon: Decentralised colaborationJulio Holon: Decentralised colaboration
Julio Holon: Decentralised colaboration
 
Startup Presentation: Gaianet
Startup Presentation: GaianetStartup Presentation: Gaianet
Startup Presentation: Gaianet
 
Shawn Grubb: Minnows v. whales: Quadratic Governance to the rescue
Shawn Grubb: Minnows v. whales: Quadratic Governance to the rescueShawn Grubb: Minnows v. whales: Quadratic Governance to the rescue
Shawn Grubb: Minnows v. whales: Quadratic Governance to the rescue
 
Joachim Stroh: Hypha DAO, the 3rd generation of DAOs
Joachim Stroh: Hypha DAO, the 3rd generation of DAOsJoachim Stroh: Hypha DAO, the 3rd generation of DAOs
Joachim Stroh: Hypha DAO, the 3rd generation of DAOs
 
Vikram Aditya: Biggest Opportunity Areas in the DAOverse
Vikram Aditya: Biggest Opportunity Areas in the DAOverseVikram Aditya: Biggest Opportunity Areas in the DAOverse
Vikram Aditya: Biggest Opportunity Areas in the DAOverse
 
Tamara Helenius: The Commons are Coming
Tamara Helenius: The Commons are ComingTamara Helenius: The Commons are Coming
Tamara Helenius: The Commons are Coming
 

Recently uploaded

Moradia Isolada com Logradouro; Detached house with patio in Penacova
Moradia Isolada com Logradouro; Detached house with patio in PenacovaMoradia Isolada com Logradouro; Detached house with patio in Penacova
Moradia Isolada com Logradouro; Detached house with patio in Penacovaimostorept
 
Pitch Deck Teardown: Goodcarbon's $5.5m Seed deck
Pitch Deck Teardown: Goodcarbon's $5.5m Seed deckPitch Deck Teardown: Goodcarbon's $5.5m Seed deck
Pitch Deck Teardown: Goodcarbon's $5.5m Seed deckHajeJanKamps
 
Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...
Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...
Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...BabaJohn3
 
MichaelStarkes_UncutGemsProjectSummary.pdf
MichaelStarkes_UncutGemsProjectSummary.pdfMichaelStarkes_UncutGemsProjectSummary.pdf
MichaelStarkes_UncutGemsProjectSummary.pdfmstarkes24
 
Stages of Startup Funding - An Explainer
Stages of Startup Funding - An ExplainerStages of Startup Funding - An Explainer
Stages of Startup Funding - An ExplainerAlejandro Cremades
 
How Do Venture Capitalists Make Decisions?
How Do Venture Capitalists Make Decisions?How Do Venture Capitalists Make Decisions?
How Do Venture Capitalists Make Decisions?Alejandro Cremades
 
Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...
Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...
Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...CIO Look Magazine
 
Hyundai capital 2024 1q Earnings release
Hyundai capital 2024 1q Earnings releaseHyundai capital 2024 1q Earnings release
Hyundai capital 2024 1q Earnings releaseirhcs
 
Presentation4 (2) survey responses clearly labelled
Presentation4 (2) survey responses clearly labelledPresentation4 (2) survey responses clearly labelled
Presentation4 (2) survey responses clearly labelledCaitlinCummins3
 
Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.
Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.
Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.daisycvs
 
Top^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In Harare
Top^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In HarareTop^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In Harare
Top^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In Hararedoctorjoe1984
 
HAL Financial Performance Analysis and Future Prospects
HAL Financial Performance Analysis and Future ProspectsHAL Financial Performance Analysis and Future Prospects
HAL Financial Performance Analysis and Future ProspectsRajesh Gupta
 
Future of Trade 2024 - Decoupled and Reconfigured - Snapshot Report
Future of Trade 2024 - Decoupled and Reconfigured - Snapshot ReportFuture of Trade 2024 - Decoupled and Reconfigured - Snapshot Report
Future of Trade 2024 - Decoupled and Reconfigured - Snapshot ReportDubai Multi Commodity Centre
 
Navigating Tax Season with Confidence Streamlines CPA Firms
Navigating Tax Season with Confidence Streamlines CPA FirmsNavigating Tax Season with Confidence Streamlines CPA Firms
Navigating Tax Season with Confidence Streamlines CPA FirmsYourLegal Accounting
 
A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...
A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...
A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...prakheeshc
 
PEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTAR
PEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTARPEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTAR
PEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTARdoktercalysta
 
hyundai capital 2023 consolidated financial statements
hyundai capital 2023 consolidated financial statementshyundai capital 2023 consolidated financial statements
hyundai capital 2023 consolidated financial statementsirhcs
 
Exploring-Pipe-Flanges-Applications-Types-and-Benefits.pptx
Exploring-Pipe-Flanges-Applications-Types-and-Benefits.pptxExploring-Pipe-Flanges-Applications-Types-and-Benefits.pptx
Exploring-Pipe-Flanges-Applications-Types-and-Benefits.pptxTexas Flange
 
00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![© ر
00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![©  ر00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![©  ر
00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![© رnafizanafzal
 
Jual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg Pfizer
Jual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg PfizerJual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg Pfizer
Jual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg PfizerPusat Herbal Resmi BPOM
 

Recently uploaded (20)

Moradia Isolada com Logradouro; Detached house with patio in Penacova
Moradia Isolada com Logradouro; Detached house with patio in PenacovaMoradia Isolada com Logradouro; Detached house with patio in Penacova
Moradia Isolada com Logradouro; Detached house with patio in Penacova
 
Pitch Deck Teardown: Goodcarbon's $5.5m Seed deck
Pitch Deck Teardown: Goodcarbon's $5.5m Seed deckPitch Deck Teardown: Goodcarbon's $5.5m Seed deck
Pitch Deck Teardown: Goodcarbon's $5.5m Seed deck
 
Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...
Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...
Pay after result spell caster (,$+27834335081)@ bring back lost lover same da...
 
MichaelStarkes_UncutGemsProjectSummary.pdf
MichaelStarkes_UncutGemsProjectSummary.pdfMichaelStarkes_UncutGemsProjectSummary.pdf
MichaelStarkes_UncutGemsProjectSummary.pdf
 
Stages of Startup Funding - An Explainer
Stages of Startup Funding - An ExplainerStages of Startup Funding - An Explainer
Stages of Startup Funding - An Explainer
 
How Do Venture Capitalists Make Decisions?
How Do Venture Capitalists Make Decisions?How Do Venture Capitalists Make Decisions?
How Do Venture Capitalists Make Decisions?
 
Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...
Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...
Most Visionary Leaders in Cloud Revolution, Shaping Tech’s Next Era - 2024 (2...
 
Hyundai capital 2024 1q Earnings release
Hyundai capital 2024 1q Earnings releaseHyundai capital 2024 1q Earnings release
Hyundai capital 2024 1q Earnings release
 
Presentation4 (2) survey responses clearly labelled
Presentation4 (2) survey responses clearly labelledPresentation4 (2) survey responses clearly labelled
Presentation4 (2) survey responses clearly labelled
 
Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.
Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.
Abortion pills in Muscut<Oman(+27737758557) Cytotec available.inn Kuwait City.
 
Top^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In Harare
Top^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In HarareTop^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In Harare
Top^Clinic ^%[+27785538335__Safe*Women's clinic//Abortion Pills In Harare
 
HAL Financial Performance Analysis and Future Prospects
HAL Financial Performance Analysis and Future ProspectsHAL Financial Performance Analysis and Future Prospects
HAL Financial Performance Analysis and Future Prospects
 
Future of Trade 2024 - Decoupled and Reconfigured - Snapshot Report
Future of Trade 2024 - Decoupled and Reconfigured - Snapshot ReportFuture of Trade 2024 - Decoupled and Reconfigured - Snapshot Report
Future of Trade 2024 - Decoupled and Reconfigured - Snapshot Report
 
Navigating Tax Season with Confidence Streamlines CPA Firms
Navigating Tax Season with Confidence Streamlines CPA FirmsNavigating Tax Season with Confidence Streamlines CPA Firms
Navigating Tax Season with Confidence Streamlines CPA Firms
 
A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...
A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...
A BUSINESS PROPOSAL FOR SLAUGHTER HOUSE WASTE MANAGEMENT IN MYSORE MUNICIPAL ...
 
PEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTAR
PEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTARPEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTAR
PEMATANG SIANTAR 0851/8063/4797 JUAL OBAT ABORSI CYTOTEC PEMATANG SIANTAR
 
hyundai capital 2023 consolidated financial statements
hyundai capital 2023 consolidated financial statementshyundai capital 2023 consolidated financial statements
hyundai capital 2023 consolidated financial statements
 
Exploring-Pipe-Flanges-Applications-Types-and-Benefits.pptx
Exploring-Pipe-Flanges-Applications-Types-and-Benefits.pptxExploring-Pipe-Flanges-Applications-Types-and-Benefits.pptx
Exploring-Pipe-Flanges-Applications-Types-and-Benefits.pptx
 
00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![© ر
00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![©  ر00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![©  ر
00971508021841 حبوب الإجهاض في دبي | أبوظبي | الشارقة | السطوة |❇ ❈ ((![© ر
 
Jual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg Pfizer
Jual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg PfizerJual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg Pfizer
Jual Obat Aborsi Di Sibolga wa 0851/7541/5434 Cytotec Misoprostol 200mcg Pfizer
 

Timothy Spann: Apache Pulsar for ML

  • 1. Apache Pulsar for ML Tim Spann | Developer Advocate
  • 2. AGENDA ● Apache Pulsar ● ML Functions ● Demo
  • 3. Tim Spann Developer Advocate Tim Spann Developer Advocate at StreamNative ● FLiP(N) Stack = Flink, Pulsar and NiFi Stack ● Streaming Systems & Data Architecture Expert ● Experience: ○ 15+ years of experience with streaming technologies including Pulsar, Flink, Spark, NiFi, Big Data, Cloud, MXNet, IoT, Python and more. ○ Today, he helps to grow the Pulsar community sharing rich technical knowledge and experience at both global conferences and through individual conversations.
  • 4. This week in Apache Flink, Apache Pulsar, Apache NiFi, Apache Spark and open source friends. https://bit.ly/32dAJft FLiP Stack Weekly
  • 6. What are the Benefits of Pulsar? Data Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model
  • 7. Apache Pulsar is a Cloud-Native Messaging and Event-Streaming Platform.
  • 8. CREATED Originally developed inside Yahoo! as Cloud Messaging Service GROWTH 10x Contributors 10MM+ Downloads Ecosystem Expands Kafka on Pulsar AMQP on Pulsar Functions . . . 2012 2016 2018 TODAY APACHE TLP Pulsar becomes Apache top level project. OPEN SOURCE Pulsar committed to open source. Apache Pulsar Timeline
  • 10. Pulsar Has a Built-in Super Set of OSS Features Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model Reduced Vendor Dependency Functions Open-Source Features
  • 11. Apache Pulsar is built to support legacy applications, handle the needs of modern apps, and supports NextGen applications Support legacy workloads. Compatible with popular messaging and streaming tools. Legacy Built for today's real-time event driven applications. Modern Scalable, adaptive architecture ready for the future of real-time streaming. NextGen
  • 12.
  • 13. Apache Pulsar has a vibrant community 560+ Contributors 10,000+ Commits 7,000+ Slack Members 1,000+ Organizations Using Pulsar
  • 14. It is often assumed that Pulsar and Kafka have equal capabilities. In reality, Pulsar offers a superset of Kafka. ● Pulsar is streaming and queuing together ● Pulsar is cloud-native with stateless brokers ● Natively includes geo-replication, multi-tenancy, and end-to-end security out of the box ● Pulsar provides automated rebalancing ● Pulsar offers 100X lower latency w/ 2.5 greater throughput than Kafka Advantages of Apache Pulsar
  • 15. Apache Pulsar features Cloud native with decoupled storage and compute layers. Built-in compatibility with your existing code and messaging infrastructure. Geographic redundancy and high availability included. Centralized cluster management and oversight. Elastic horizontal and vertical scalability. Seamless and instant partitioning rebalancing with no downtime. Flexible subscription model supports a wide array of use cases. Compatible with the tools you use to store, analyze, and process data.
  • 16. ● “Bookies” ● Stores messages and cursors ● Messages are grouped in segments/ledgers ● A group of bookies form an “ensemble” to store a ledger ● “Brokers” ● Handles message routing and connections ● Stateless, but with caches ● Automatic load-balancing ● Topics are composed of multiple segments ● ● Stores metadata for both Pulsar and BookKeeper ● Service discovery Store Messages Metadata & Service Discovery Metadata & Service Discovery Pulsar Cluster Metadata Storage Pulsar Cluster
  • 17.
  • 18.
  • 19. Component Description Value / Data payload The data carried by the message. All Pulsar messages contain raw bytes, although message data can also conform to data schemas. Key Messages are optionally tagged with keys, used in partitioning and also is useful for things like topic compaction. Properties An optional key/value map of user-defined properties. Producer name The name of the producer who produces the message. If you do not specify a producer name, the default name is used. Sequence ID Each Pulsar message belongs to an ordered sequence on its topic. The sequence ID of the message is its order in that sequence. Messages - the Basic Unit of Apache Pulsar
  • 20. Different subscription modes have different semantics: Exclusive/Failover - guaranteed order, single active consumer Shared - multiple active consumers, no order Key_Shared - multiple active consumers, order for given key Producer 1 Producer 2 Pulsar Topic Subscription D Consumer D-1 Consumer D-2 Key-Shared < K 1, V 10 > < K 1, V 11 > < K 1, V 12 > < K 2 ,V 2 0 > < K 2 ,V 2 1> < K 2 ,V 2 2 > Subscription C Consumer C-1 Consumer C-2 Shared < K 1, V 10 > < K 2 ,V 2 1> < K 1, V 12 > < K 2 ,V 2 0 > < K 1, V 11 > < K 2 ,V 2 2 > Subscription A Consumer A Exclusive Subscription B Consumer B-1 Consumer B-2 In case of failure in Consumer B-1 Failover Apache Pulsar Subscription Modes
  • 21. Streaming Consumer Consumer Consumer Subscription Shared Failover Consumer Consumer Subscription In case of failure in Consumer B-0 Consumer Consumer Subscription Exclusive X Consumer Consumer Key-Shared Subscription Pulsar Topic/Partition Messaging
  • 22. Unified Messaging Model Simplify your data infrastructure and enable new use cases with queuing and streaming capabilities in one platform. Multi-tenancy Enable multiple user groups to share the same cluster, either via access control, or in entirely different namespaces. Scalability Decoupled data computing and storage enable horizontal scaling to handle data scale and management complexity. Geo-replication Support for multi-datacenter replication with both asynchronous and synchronous replication for built-in disaster recovery. Tiered storage Enable historical data to be offloaded to cloud-native storage and store event streams for indefinite periods of time. Apache Pulsar Benefits
  • 23. Messaging Use Cases Streaming Use Cases Service x commands service y to make some change. Example: order service removing item from inventory service Moving large amounts of data to another service (real-time ETL). Example: logs to elasticsearch Distributing messages that represent work among n workers. Example: order processing not in main “thread” Periodic jobs moving large amounts of data and aggregating to more traditional stores. Example: logs to s3 Sending “scheduled” messages. Example: notification service for marketing emails or push notifications Computing a near real-time aggregate of a message stream, split among n workers, with order being important. Example: real-time analytics over page views Messaging vs Streaming
  • 24. Messaging Use Case Streaming Use Case Retention The amount of data retained is relatively small - typically only a day or two of data at most. Large amounts of data are retained, with higher ingest volumes and longer retention periods. Throughput Messaging systems are not designed to manage big “catch-up” reads. Streaming systems are designed to scale and can handle use cases such as catch-up reads. Differences in Consumption
  • 25. byte[] msgIdBytes = // Some byte array MessageId id = MessageId.fromByteArray(msgIdBytes); Reader<byte[]> reader = pulsarClient.newReader() .topic(topic) .startMessageId(id) .create(); Create a reader that will read from some message between earliest and latest. Reader Apache Pulsar Reader Interface
  • 26. ● New Consumer type added in Pulsar 2.10 that provides a continuously updated key-value map view of compacted topic data. ● An abstraction of a changelog stream from a primary-keyed table, where each record in the changelog stream is an update on the primary-keyed table with the record key as the primary key. ● READ ONLY DATA STRUCTURE! Apache Pulsar TableView
  • 27. Schema Registry schema-1 (value=Avro/Protobuf/JSON) schema-2 (value=Avro/Protobuf/JSON) schema-3 (value=Avro/Protobuf/JSON) Schema Data ID Local Cache for Schemas + Schema Data ID + Local Cache for Schemas Send schema-1 (value=Avro/Protobuf/JSON) data serialized per schema ID Send (register) schema (if not in local cache) Read schema-1 (value=Avro/Protobuf/JSON) data deserialized per schema ID Get schema by ID (if not in local cache) Producers Consumers Schema Registry
  • 28. ● Utilizing JSON Data with a JSON Schema ● Consistency, Contracts, Clean Data ● This enables easy SQL: ○ Pulsar SQL (Presto SQL) ○ Flink SQL ○ Spark Structured Streaming Use Schemas
  • 29. • Functions - Lightweight Stream Processing (Java, Python, Go) • Connectors - Sources & Sinks (Cassandra, Kafka, …) • Protocol Handlers - AoP (AMQP), KoP (Kafka), MoP (MQTT) • Processing Engines - Flink, Spark, Presto/Trino via Pulsar SQL • Data Offloaders - Tiered Storage - (S3) Sources, Sinks and Processing
  • 31. MQTT on Pulsar (MoP)
  • 32. AMQP on Pulsar (AoP)
  • 33. Use Apache Pulsar For Ingest
  • 34. Use Apache Pulsar To Stream to Lakehouses
  • 35. Pulsar SQL Presto/Trino workers can read segments directly from bookies (or offloaded storage) in parallel. Bookie 1 Segment 1 Producer Consumer Broker 1 Topic1-Part1 Broker 2 Topic1-Part2 Broker 3 Topic1-Part3 Segment 2 Segment 3 Segment 4 Segment X Segment 1 Segment 1 Segment 1 Segment 3 Segment 3 Segment 3 Segment 2 Segment 2 Segment 2 Segment 4 Segment 4 Segment 4 Segment X Segment X Segment X Bookie 2 Bookie 3 Query Coordinator ... ... SQL Worker SQL Worker SQL Worker SQL Worker Query Topic Metadata
  • 36. ● Event-Driven services that are interested in events from other services can use Pulsar subscriptions to receive new events in the stream. ● Pulsar supports both linear (exclusive) and filtered subscriptions (key-shared). ● You can also use StreamNative’s filtering & routing service to provide filtered subscriptions. Subscriptions
  • 38. ● Visual Question and Answer ● Natural Language Processing ● Sentiment Analysis ● Text Classification ● Named Entity Recognition ● Content-based Recommendations • Predictive Maintenance • Fault Detection • Fraud Detection • Time-Series Predictions • Naive Bayes Apache Pulsar Functions for ML Models
  • 39. Benefits of Pulsar Functions • Allows you to focus on the business logic. • Sets up producer/consumer for you, eliminating the need for boilerplate code. • Handles message consumption and publication for you. • Run as threads, processes or Kubernetes Pods inside Pulsar. No need for another processing framework. Can scale up automatically.
  • 40. Event-Based Microservices Application • A basic order entry use case for a food delivery service will involve several different microservices. • The communicate via messages sent to Pulsar topics. • A service can also subscribe to the events published by another service.
  • 41. Function Mesh Pulsar Functions, along with Pulsar IO/Connectors, provide a powerful API for ingesting, transforming, and outputting data. Function Mesh, another StreamNative project, makes it easier for developers to create entire applications built from sources, functions, and sinks all through a declarative API.
  • 42. ML • Visual Question and Answer • NLP (Natural Language Processing) • Sentiment Analysis • Text Classification • Named Entity Recognition • Content-based Recommendations • Predictive Maintenance • Fault Detection • Fraud Detection • Time-Series Predictions • Naive Bayes
  • 43. Why Pulsar Functions for Microservices? Desired Characteristic Pulsar Functions… Highly maintainable and testable Are small pieces of code written in popular languages such as Java, Python, or Go. They can be easily maintained in source control repositories and tested with existing frameworks automatically. Loosely coupled with other services Are not directly linked to one another and communicate via messages. Independently deployable Are designed to be deployed independently Can be developed by a small team Are often developed by a single developer. Inter-service Communication Support all message patterns using Pulsar as the underlying message bus. Deployment & Composition Can run as individual threads, processes, or K8s pods. The Function Mesh allows you to deploy multiple Pulsar Functions as a single unit.
  • 44. Built-in Back Pressure Producer<String> producer = client.newProducer(Schema.STRING) .topic("hellotopic") .blockIfQueueFull(true) // enable blocking // when queue is full .maxPendingMessages(10) // max queue size .create(); // During Back Pressure: the sendAsync call blocks // with no room in queues producer.newMessage() .key("mykey") .value("myvalue") .sendAsync(); // can be a blocking call
  • 45. ML Java Coding (Deep Java Library)
  • 46. ● Lightweight computation similar to AWS Lambda. ● Specifically designed to use Apache Pulsar as a message bus. ● Function runtime can be located within Pulsar Broker. ● Java Functions A serverless event streaming framework Pulsar Functions
  • 47. ● Consume messages from one or more Pulsar topics. ● Apply user-supplied processing logic to each message. ● Publish the results of the computation to another topic. ● Support multiple programming languages (Java, Python, Go) ● Can leverage 3rd-party libraries to support the execution of ML models on the edge. Pulsar Functions
  • 48. Simple Native Java Function
  • 49. Simple Function with Pulsar SDK
  • 50. from pulsar import Function from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer import json class Chat(Function): def __init__(self): pass def process(self, input, context): fields = json.loads(input) sid = SentimentIntensityAnalyzer() ss = sid.polarity_scores(fields["comment"]) row = { } row['id'] = str(msg_id) if ss['compound'] < 0.00: row['sentiment'] = 'Negative' else: row['sentiment'] = 'Positive' row['comment'] = str(fields["comment"]) json_string = json.dumps(row) return json_string Entire Function Pulsar Python NLP Function
  • 51. Starting a Function - Distributed Cluster Once compiled into a JAR, start a Pulsar Function in a distributed cluster:
  • 52. Pulsar Functions ● Route ● Enrich ● Convert ● Lookups ● Run Machine Learning ● Logging ● Auditing ● Parse ● Split ● Convert
  • 53. Pulsar Functions ● Consume messages from one or more Pulsar topics. ● Apply user-supplied processing logic to each message. ● Publish the results of the computation to another topic. ● Support multiple programming languages (Java, Python, Go) ● Can leverage 3rd-party libraries to support the execution of ML models on the edge.
  • 54. Building Real-Time Apps Requires a Team
  • 55.
  • 58. Deploying AI With an Event-Driven Platform https://dzone.com/trendreports/enterprise-ai-1
  • 59. Apache Pulsar in Action http://tinyurl.com/bdha5p4r Please enjoy David’s complete book which is the ultimate guide to Pulsar.
  • 62. Github ● https://github.com/tspannhw/FLiP-Py-ADS-B ● https://github.com/tspannhw/pulsar-adsb-function ● https://github.com/tspannhw/pulsar-weather-function ● https://github.com/tspannhw/spring-pulsar-airquality ● https://github.com/tspannhw/pulsar-airquality-function ● https://github.com/tspannhw/FLiP-Current22-LetsMonitorAllTheThings ● https://github.com/tspannhw/FLiP-Transit ● https://github.com/tspannhw/SmartTransit ● https://github.com/tspannhw/airquality-kafka-consumer
  • 63. Scan the QR code to learn more about Apache Pulsar and StreamNative.
  • 64. Scan the QR code to build your own apps today.