SlideShare a Scribd company logo
Matteo Merli & Sijie Guo
fast, durable, flexible pub/sub messaging
Overview
What is Apache Pulsar?
3
Ordering
Guaranteed ordering
Multi-tenancy
A single cluster can
support many tenants
and use cases
High throughput
Can reach 1.8 M
messages/s in a
single partition
Durability
Data replicated and
synced to disk
Geo-replication
Out of box support for
geographically
distributed
applications
Unified messaging
model
Support both Topic &
Queue semantic in a
single model
Delivery Guarantees
At least once, at most
once and effectively once
Low Latency
Low publish latency of
5ms at 99pct
Highly scalable
Can support millions of
topics
Usage of Pulsar
• In production for 3+ years at Yahoo
• Powering critical products like:
• Yahoo Mail, Yahoo Finance, Gemini Ads,
Flickr and Sherpa (NoSQL database)
• 80+ tenants
• 2.3 Million topics
• 100 B messages / day
• Full-mesh replication in 8 data-centers
4
Why build a new system?
• No existing solution to satisfy requirements
• Multi tenant — 1M topics — Low latency — Durability — Geo replication
• Kafka doesn’t scale well with many topics:
• Storage model based on individual directory per topic partition
• Enabling durability kills the performance
• Many other choking points: getting stats, access to metadata, flow-control
• Operations are not very convenient
• eg: replacing a server, manual commands to copy the data and involves clients
• clients access to ZK clusters not desirable
• Ability to manage large backlogs
• No scalable support to keep consumer position
5
Architecture view
Separate layers between
brokers bookies
• Broker and bookies can
be added
independently
• Traffic can be shifted
very quickly across
brokers
• New bookies will ramp
up on traffic quickly
6
Pulsar Broker 1 Pulsar Broker 1 Pulsar Broker 1
Bookie 1 Bookie 2 Bookie 3 Bookie 4 Bookie 5
Apache BookKeeper
Apache Pulsar
Producer Consumer
Apache BookKeeper
• A replicated log storage
• Low-latency durable writes
• Simple repeatable read
consistency
• Highly available
• Store many logs per node
• I/O Isolation
7
Kafka vs. Pulsar Segment
Distribution
BookKeeper - Storage
• A single bookie can serve and
store thousands of ledgers
• Write and read paths are
separated:
• Avoid read activity to impact
write latency
• Writes are added to in-
memory write-cache and
committed to journal
• Write cache is flushed in
background to separated
device
• Entries are sorted to allow for
mostly sequential reads
9
Concepts
Pulsar Concepts
Support both topic and queue semantics in a unified topic
concept
11
Pulsar Namespace
12
Pulsar Subscription
13
Partitioned Topic
14
Pulsar client library
• Java — C++ — Python — WebSocket APIs
• Partitioned topics
• Apache Kafka compatibility wrapper API
• Transparent batching of messages
• Compression
• TLS encryption and authentication
• End-to-end encryption
15
Demo time
16
Demo time
17
Demo time
18
Lunch Break
Use cases - Message Queue
• Decouple online / background
• Provide high-availability
• Reliable data transport
20
Online
events
Pulsar
topic 1
Worker 1
Worker 2
Worker 3
Pulsar
topic 2
Low latency
publish
Long running task
Notification
Use cases - Message Queue
• Async processing of time intensive background tasks
• Publishes and complete HTTP request
• Examples: Image / Video transcoding, bulk operations (large folder deletions)
21
Use cases - Message Queue
22
• Delegate the processing to multiple compute jobs which interact with micro-services
• Wait until the response is pushed back to the HTTP server
• Examples: breaking monolithic applications into micro-services
Use cases - Notifications
• Change data capture
• Listeners are frequently different tenants
• Quotas needs to ensure producer is not affected
23
Event
Pulsar
topic
Component 1
Component 2
Component 3
Listeners
Use cases - Notifications
• Replicating DB transactions across regions
• Notification to multiple interested tenants (example: new user
account, …)
24
Use cases - Feedback system
• Coordinate a large number of machines
• Propagate state
25
External
inputs
Pulsar
topic 1
Serving
system
Serving
system
Serving
system
Pulsar
topic 2
Controller
Updates
Feedback
Multi Tenancy
Multi-Tenancy
• Authentication / Authorization / Namespaces / Admin APIs
• I/O Isolations between writes and reads
• Provided by BookKeeper - Ensure readers draining backlog won’t
affect publishers
• Soft isolation
• Storage quotas — flow-control — back-pressure — rate limiting
• Hardware isolation
• Constrain some tenants on a subset of brokers or bookies
27
Demo time
28
Geo-Replication
Geo-Replication
• Scalable
asynchronous
replication
• Integrated in the
broker message
flow
• Simple
configuration to
add/remove
regions
30
Topic (T1) Topic (T1)
Topic (T1)
Subscrip@on (S1) Subscrip@on (S1)
Producer
(P1)
Consumer
(C1)
Producer
(P3)
Producer
(P2)
Consumer
(C2)
Data Center A Data Center B
Data Center C
Group Exercise
31
Pulsar — Conclusion
• A fast durable, distributed pub/sub messaging
• Flexible Traditional Messaging: Queuing and Pub/Sub
• Focus on message dispatch and consumption
• Remove data as soon as possible if they are not needed
• It is backed by a scalable log store
• durable message store, zero data loss
• allow rewinding / reprocessing messages for backfill, bootstrap systems and
stream computing
• It carries all the fantastic features from the log store.
32
Real-Time Solution
33
Curious to Learn More?
• Apache Pulsar : http://pulsar.incubator.apache.org
• Apache DistributedLog : http://bookkeeper.apache.org/
distributedlog
• Apache BookKeeper : http://bookkeeper.apache.org
• Follow Us @apache_pulsar @asfbookkeeper
@distributedlog
34
Curious to Learn More?
• Messaging, Storage, or Both: https://streaml.io/blog/
messaging-storage-or-both/
• Why BookKeeper: Consistency, Durability and Availability:
https://streaml.io/blog/why-apache-bookkeeper/
• Introduction to Apache Pulsar: https://streaml.io/blog/intro-
to-pulsar/
• Kafka vs DistributedLog: https://bookkeeper.apache.org/
distributedlog/technical-review/2016/09/19/kafka-vs-
distributedlog.html
35
Curious to learn more about
Streamlio?
• Streamlio: https://streaml.io
• Sandbox Preview: https://streaml.io/docs/getting-started
• Learn slack channel: https://learn-streamlio.slack.com

36

More Related Content

What's hot

Apache Pulsar First Overview
Apache PulsarFirst OverviewApache PulsarFirst Overview
Apache Pulsar First Overview
Ricardo Paiva
 
Pulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformPulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platform
Matteo Merli
 
Devoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaDevoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with Kafka
László-Róbert Albert
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
Martin Podval
 
I Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache KafkaI Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache Kafka
Jay Kreps
 
Design Patterns for working with Fast Data
Design Patterns for working with Fast DataDesign Patterns for working with Fast Data
Design Patterns for working with Fast Data
MapR Technologies
 
kafka
kafkakafka
Reducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive StreamsReducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive Streams
jimriecken
 
Apache con2016final
Apache con2016final Apache con2016final
Apache con2016final
Salesforce
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
Angelo Cesaro
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
Clement Demonchy
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
StreamNative
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
confluent
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
Srikrishna k
 
Kafka internals
Kafka internalsKafka internals
Kafka internals
David Groozman
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
Jeff Holoman
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
Timothy Spann
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
confluent
 
kafka for db as postgres
kafka for db as postgreskafka for db as postgres
kafka for db as postgres
PivotalOpenSourceHub
 
Apache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanApache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! Japan
StreamNative
 

What's hot (20)

Apache Pulsar First Overview
Apache PulsarFirst OverviewApache PulsarFirst Overview
Apache Pulsar First Overview
 
Pulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformPulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platform
 
Devoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaDevoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with Kafka
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
I Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache KafkaI Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache Kafka
 
Design Patterns for working with Fast Data
Design Patterns for working with Fast DataDesign Patterns for working with Fast Data
Design Patterns for working with Fast Data
 
kafka
kafkakafka
kafka
 
Reducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive StreamsReducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive Streams
 
Apache con2016final
Apache con2016final Apache con2016final
Apache con2016final
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Kafka internals
Kafka internalsKafka internals
Kafka internals
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
 
kafka for db as postgres
kafka for db as postgreskafka for db as postgres
kafka for db as postgres
 
Apache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanApache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! Japan
 

Similar to Hands-on Workshop: Apache Pulsar

AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
Lucas Jellema
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
Chhavi Parasher
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPC
Max Alexejev
 
messaging.pptx
messaging.pptxmessaging.pptx
messaging.pptx
NParakh1
 
Kafka overview v0.1
Kafka overview v0.1Kafka overview v0.1
Kafka overview v0.1
Mahendran Ponnusamy
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
jimriecken
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
Yahoo Developer Network
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptx
Knoldus Inc.
 
Kafka - Messaging System
Kafka - Messaging SystemKafka - Messaging System
Kafka - Messaging SystemTanuj Mehta
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
Amita Mirajkar
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
emreakis
 
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
NETWAYS
 
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
NETWAYS
 
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
HostedbyConfluent
 
(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference
Timothy Spann
 
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio, Inc.
 
Taking the open cloud to 11
Taking the open cloud to 11Taking the open cloud to 11
Taking the open cloud to 11
Joe Brockmeier
 
Evaluating Streaming Data Solutions
Evaluating Streaming Data SolutionsEvaluating Streaming Data Solutions
Evaluating Streaming Data Solutions
Streamlio
 
Timothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for MLTimothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for ML
Edunomica
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using Kafka
Levon Avakyan
 

Similar to Hands-on Workshop: Apache Pulsar (20)

AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPC
 
messaging.pptx
messaging.pptxmessaging.pptx
messaging.pptx
 
Kafka overview v0.1
Kafka overview v0.1Kafka overview v0.1
Kafka overview v0.1
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptx
 
Kafka - Messaging System
Kafka - Messaging SystemKafka - Messaging System
Kafka - Messaging System
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
 
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
 
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
 
(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference
 
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
 
Taking the open cloud to 11
Taking the open cloud to 11Taking the open cloud to 11
Taking the open cloud to 11
 
Evaluating Streaming Data Solutions
Evaluating Streaming Data SolutionsEvaluating Streaming Data Solutions
Evaluating Streaming Data Solutions
 
Timothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for MLTimothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for ML
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using Kafka
 

Recently uploaded

1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
Gen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needsGen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needs
Laura Szabó
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Brad Spiegel Macon GA
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
zyfovom
 
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
vmemo1
 
2.Cellular Networks_The final stage of connectivity is achieved by segmenting...
2.Cellular Networks_The final stage of connectivity is achieved by segmenting...2.Cellular Networks_The final stage of connectivity is achieved by segmenting...
2.Cellular Networks_The final stage of connectivity is achieved by segmenting...
JeyaPerumal1
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
Understanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdfUnderstanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdf
SEO Article Boost
 
Bài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docxBài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docx
nhiyenphan2005
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
harveenkaur52
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
7 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 20247 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 2024
Danica Gill
 
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
cuobya
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
CIOWomenMagazine
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
ufdana
 
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
cuobya
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
keoku
 
可查真实(Monash毕业证)西澳大学毕业证成绩单退学买
可查真实(Monash毕业证)西澳大学毕业证成绩单退学买可查真实(Monash毕业证)西澳大学毕业证成绩单退学买
可查真实(Monash毕业证)西澳大学毕业证成绩单退学买
cuobya
 

Recently uploaded (20)

1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
Gen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needsGen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needs
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
 
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
 
2.Cellular Networks_The final stage of connectivity is achieved by segmenting...
2.Cellular Networks_The final stage of connectivity is achieved by segmenting...2.Cellular Networks_The final stage of connectivity is achieved by segmenting...
2.Cellular Networks_The final stage of connectivity is achieved by segmenting...
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
Understanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdfUnderstanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdf
 
Bài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docxBài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docx
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
7 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 20247 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 2024
 
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
假文凭国外(Adelaide毕业证)澳大利亚国立大学毕业证成绩单办理
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
 
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
制作毕业证书(ANU毕业证)莫纳什大学毕业证成绩单官方原版办理
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
 
可查真实(Monash毕业证)西澳大学毕业证成绩单退学买
可查真实(Monash毕业证)西澳大学毕业证成绩单退学买可查真实(Monash毕业证)西澳大学毕业证成绩单退学买
可查真实(Monash毕业证)西澳大学毕业证成绩单退学买
 

Hands-on Workshop: Apache Pulsar

  • 1. Matteo Merli & Sijie Guo fast, durable, flexible pub/sub messaging
  • 3. What is Apache Pulsar? 3 Ordering Guaranteed ordering Multi-tenancy A single cluster can support many tenants and use cases High throughput Can reach 1.8 M messages/s in a single partition Durability Data replicated and synced to disk Geo-replication Out of box support for geographically distributed applications Unified messaging model Support both Topic & Queue semantic in a single model Delivery Guarantees At least once, at most once and effectively once Low Latency Low publish latency of 5ms at 99pct Highly scalable Can support millions of topics
  • 4. Usage of Pulsar • In production for 3+ years at Yahoo • Powering critical products like: • Yahoo Mail, Yahoo Finance, Gemini Ads, Flickr and Sherpa (NoSQL database) • 80+ tenants • 2.3 Million topics • 100 B messages / day • Full-mesh replication in 8 data-centers 4
  • 5. Why build a new system? • No existing solution to satisfy requirements • Multi tenant — 1M topics — Low latency — Durability — Geo replication • Kafka doesn’t scale well with many topics: • Storage model based on individual directory per topic partition • Enabling durability kills the performance • Many other choking points: getting stats, access to metadata, flow-control • Operations are not very convenient • eg: replacing a server, manual commands to copy the data and involves clients • clients access to ZK clusters not desirable • Ability to manage large backlogs • No scalable support to keep consumer position 5
  • 6. Architecture view Separate layers between brokers bookies • Broker and bookies can be added independently • Traffic can be shifted very quickly across brokers • New bookies will ramp up on traffic quickly 6 Pulsar Broker 1 Pulsar Broker 1 Pulsar Broker 1 Bookie 1 Bookie 2 Bookie 3 Bookie 4 Bookie 5 Apache BookKeeper Apache Pulsar Producer Consumer
  • 7. Apache BookKeeper • A replicated log storage • Low-latency durable writes • Simple repeatable read consistency • Highly available • Store many logs per node • I/O Isolation 7
  • 8. Kafka vs. Pulsar Segment Distribution
  • 9. BookKeeper - Storage • A single bookie can serve and store thousands of ledgers • Write and read paths are separated: • Avoid read activity to impact write latency • Writes are added to in- memory write-cache and committed to journal • Write cache is flushed in background to separated device • Entries are sorted to allow for mostly sequential reads 9
  • 11. Pulsar Concepts Support both topic and queue semantics in a unified topic concept 11
  • 15. Pulsar client library • Java — C++ — Python — WebSocket APIs • Partitioned topics • Apache Kafka compatibility wrapper API • Transparent batching of messages • Compression • TLS encryption and authentication • End-to-end encryption 15
  • 20. Use cases - Message Queue • Decouple online / background • Provide high-availability • Reliable data transport 20 Online events Pulsar topic 1 Worker 1 Worker 2 Worker 3 Pulsar topic 2 Low latency publish Long running task Notification
  • 21. Use cases - Message Queue • Async processing of time intensive background tasks • Publishes and complete HTTP request • Examples: Image / Video transcoding, bulk operations (large folder deletions) 21
  • 22. Use cases - Message Queue 22 • Delegate the processing to multiple compute jobs which interact with micro-services • Wait until the response is pushed back to the HTTP server • Examples: breaking monolithic applications into micro-services
  • 23. Use cases - Notifications • Change data capture • Listeners are frequently different tenants • Quotas needs to ensure producer is not affected 23 Event Pulsar topic Component 1 Component 2 Component 3 Listeners
  • 24. Use cases - Notifications • Replicating DB transactions across regions • Notification to multiple interested tenants (example: new user account, …) 24
  • 25. Use cases - Feedback system • Coordinate a large number of machines • Propagate state 25 External inputs Pulsar topic 1 Serving system Serving system Serving system Pulsar topic 2 Controller Updates Feedback
  • 27. Multi-Tenancy • Authentication / Authorization / Namespaces / Admin APIs • I/O Isolations between writes and reads • Provided by BookKeeper - Ensure readers draining backlog won’t affect publishers • Soft isolation • Storage quotas — flow-control — back-pressure — rate limiting • Hardware isolation • Constrain some tenants on a subset of brokers or bookies 27
  • 30. Geo-Replication • Scalable asynchronous replication • Integrated in the broker message flow • Simple configuration to add/remove regions 30 Topic (T1) Topic (T1) Topic (T1) Subscrip@on (S1) Subscrip@on (S1) Producer (P1) Consumer (C1) Producer (P3) Producer (P2) Consumer (C2) Data Center A Data Center B Data Center C
  • 32. Pulsar — Conclusion • A fast durable, distributed pub/sub messaging • Flexible Traditional Messaging: Queuing and Pub/Sub • Focus on message dispatch and consumption • Remove data as soon as possible if they are not needed • It is backed by a scalable log store • durable message store, zero data loss • allow rewinding / reprocessing messages for backfill, bootstrap systems and stream computing • It carries all the fantastic features from the log store. 32
  • 34. Curious to Learn More? • Apache Pulsar : http://pulsar.incubator.apache.org • Apache DistributedLog : http://bookkeeper.apache.org/ distributedlog • Apache BookKeeper : http://bookkeeper.apache.org • Follow Us @apache_pulsar @asfbookkeeper @distributedlog 34
  • 35. Curious to Learn More? • Messaging, Storage, or Both: https://streaml.io/blog/ messaging-storage-or-both/ • Why BookKeeper: Consistency, Durability and Availability: https://streaml.io/blog/why-apache-bookkeeper/ • Introduction to Apache Pulsar: https://streaml.io/blog/intro- to-pulsar/ • Kafka vs DistributedLog: https://bookkeeper.apache.org/ distributedlog/technical-review/2016/09/19/kafka-vs- distributedlog.html 35
  • 36. Curious to learn more about Streamlio? • Streamlio: https://streaml.io • Sandbox Preview: https://streaml.io/docs/getting-started • Learn slack channel: https://learn-streamlio.slack.com
 36