Who am I?
● SME for Pulsar at Nutanix
● Love
○ Distributed systems
○ Open source
● Hands-on developer, aspiring architect
● Love spending time with data (stores, streams, analytics etc.)
● Contributions to Pulsar & MySQL
https://www.linkedin.com/in/shivjijha/
https://twitter.com/ShivjiJha
Catalogue
• Pulsar modules
• Data stores
• Distributed storage
• Storage internals (Read / Write)
• Tying together (Metadata layer)
Pulsar Modules
Pulsar Modules
SERVING LAYER (BROKER)
METADATA LAYER (ZOOKEEPER)
STORAGE LAYER (BOOKKEEPER)
CLIENT
Pulsar Data Stores
Pulsar Data Stores
SERVING LAYER (BROKER)
METADATA LAYER (ZOOKEEPER)
BOOKKEEPER
CLIENT
OBJECT STORE
CACHE
Pulsar Data Stores : Overview
• Broker cache: single owner broker per topic
• Primary store: BookKeeper
• Cold store: object store
• Metadata layer: ZooKeeper
Pulsar @ scale
scale compute
scale storage
Pulsar Data Stores : Bookkeeper
LEDGER
Ledger Metadata
Status : Open
Last Entry Id : -1
Ensemble Size
Write Quorum Size
Ack Quorum Size
Ensembles : [ [], [] ]
Pulsar Data Stores : Bookkeeper
LEDGER
entry 0
Ledger Metadata
Status : Open
Last Entry Id : 0
Ensemble Size
Write Quorum Size
Ack Quorum Size
Ensembles : [ [], [] ]
Pulsar Data Stores : Bookkeeper
LEDGER
entry 0
entry 1
entry 2
Ledger Metadata
Status : Closed
Last Entry Id : 2
Ensemble Size
Write Quorum Size
Ack Quorum Size
Ensembles : [ [], [] ]
Pulsar Data Stores : Bookkeeper
LEDGER
entry 0
entry 1
entry 2
Entry Data
Metadata:
1. LedgerId
2. EntryId
3. Last Add Confirmed
4. Digest (CRC32 / CRC32C)
Data: byte[]
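The ledger metadata and entry layout above can be sketched as two small records. This is a minimal illustration, not BookKeeper's actual wire format: the class names, the helper functions, and the choice to checksum the three ids followed by the payload are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class LedgerMetadata:
    """Per-ledger metadata, mirroring the fields listed on the slides."""
    ensemble_size: int
    write_quorum_size: int
    ack_quorum_size: int
    status: str = "Open"       # "Open" or "Closed"
    last_entry_id: int = -1    # -1 means no entry has been recorded yet
    ensembles: List[List[str]] = field(default_factory=list)


@dataclass
class Entry:
    """One entry: metadata plus an opaque payload."""
    ledger_id: int
    entry_id: int
    last_add_confirmed: int
    data: bytes
    digest: int = 0

    def compute_digest(self) -> int:
        # Hypothetical layout: CRC32 over the three ids followed by the payload.
        header = b"".join(
            n.to_bytes(8, "big", signed=True)
            for n in (self.ledger_id, self.entry_id, self.last_add_confirmed)
        )
        return zlib.crc32(header + self.data)


def make_entry(ledger_id: int, entry_id: int, lac: int, data: bytes) -> Entry:
    """Build an entry and stamp it with its digest."""
    entry = Entry(ledger_id, entry_id, lac, data)
    entry.digest = entry.compute_digest()
    return entry


def is_intact(entry: Entry) -> bool:
    """True if the stored digest still matches the entry contents."""
    return entry.digest == entry.compute_digest()
```

A reader would call `is_intact` on every entry it receives and fall back to another replica when the check fails.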
Pulsar Data Stores : Write Path
1. Client sends a write on the topic.
2. The owner broker (acting as bookkeeper client) writes to the current ensemble of bookies.
3. The broker waits for #ackQuorum acks from the bookies.
4. The broker acknowledges the write to the client.
When a new ledger is created, it is registered in zookeeper against topicName.
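The quorum step of the write path can be sketched as follows. This is a toy in-memory simulation under stated assumptions: `write_set` and `add_entry` are hypothetical names, bookies are plain dictionaries, and entries are striped round-robin across the ensemble. A real BookKeeper client would also replace a failed bookie in the ensemble rather than simply fail the write.

```python
from typing import Dict, List, Set


def write_set(entry_id: int, ensemble: List[str], write_quorum: int) -> List[str]:
    """Bookies that should store this entry (round-robin striping, assumed)."""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(write_quorum)]


def add_entry(entry_id: int, payload: bytes, ensemble: List[str],
              write_quorum: int, ack_quorum: int,
              bookies: Dict[str, Dict[int, bytes]], down: Set[str]) -> bool:
    """Send the entry to its write set; succeed once ack_quorum bookies ack."""
    acks = 0
    for name in write_set(entry_id, ensemble, write_quorum):
        if name in down:
            continue                       # an unreachable bookie sends no ack
        bookies[name][entry_id] = payload  # the bookie persists the entry
        acks += 1
    return acks >= ack_quorum


ensemble = ["b1", "b2", "b3"]
store = {name: {} for name in ensemble}
ok = add_entry(0, b"msg-0", ensemble, write_quorum=2, ack_quorum=2,
               bookies=store, down=set())
```

With an ensemble of three and a write quorum of two, consecutive entries land on different pairs of bookies, which spreads the write load across the ensemble.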
Pulsar Data Stores : Read Path
1. Client sends a read on the topic.
2. The owner broker searches its local cache.
3. On a cache miss, the broker reads from the closest bookie.
a. On failure, try another bookie.
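The read path above amounts to a cache lookup followed by an ordered fallback over bookies. The sketch below is illustrative only: `FakeBookie`, `BookieUnavailable`, and `read_entry` are hypothetical names, and the bookie list is assumed to be already sorted closest first.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, int]  # (ledger_id, entry_id)


class BookieUnavailable(Exception):
    """Raised by a bookie that cannot be reached."""


class FakeBookie:
    """In-memory stand-in for a bookie holding some entries."""

    def __init__(self, entries: Dict[Key, bytes], up: bool = True) -> None:
        self.entries = entries
        self.up = up

    def read(self, key: Key) -> bytes:
        if not self.up:
            raise BookieUnavailable()
        return self.entries[key]  # KeyError if this bookie lacks the entry


def read_entry(key: Key, broker_cache: Dict[Key, bytes],
               bookies: List[FakeBookie]) -> Optional[bytes]:
    """Serve from the broker cache, else try bookies in order."""
    if key in broker_cache:
        return broker_cache[key]      # no network trip, no disk access
    for bookie in bookies:            # assumed ordered closest first
        try:
            return bookie.read(key)
        except (BookieUnavailable, KeyError):
            continue                  # on failure, try the next bookie
    return None
```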
Pulsar Data Stores : Bookkeeper
• Internally, a journal serves as a short-lived write queue
• Facilitates quick writes (appends)
• Write cache organizes writes before they are flushed
• Read cache for quick reads
• if data not in broker cache
• if data not in write cache
• Ledger storage holds the organized data, interleaving topics.
• RocksDB indexes the data ledger by ledger
• Topic to ledger(s) mapping in zookeeper.
Pulsar Data Stores : Bookkeeper
● Consistency over availability
● Durability
○ Ensemble
○ Write Quorum
○ Ack Quorum
● Configure how much / how long to store in bookkeeper
● Configure replication factor
● Choose durability vs latency / throughput
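The three durability knobs relate to each other in a fixed order: ensemble size >= write quorum >= ack quorum. A minimal sketch of that relationship, with a hypothetical `ReplicationSettings` class that is not part of any Pulsar or BookKeeper API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationSettings:
    ensemble_size: int  # E: bookies a ledger is spread over
    write_quorum: int   # Qw: bookies each entry is sent to
    ack_quorum: int     # Qa: acks needed before the write is confirmed

    def validate(self) -> None:
        if not (self.ensemble_size >= self.write_quorum >= self.ack_quorum >= 1):
            raise ValueError("require E >= Qw >= Qa >= 1")

    def copies_when_acked(self) -> int:
        """Copies guaranteed to exist at the moment the write is acknowledged."""
        return self.ack_quorum

    def target_copies(self) -> int:
        """Copies each entry is sent to (the replication factor)."""
        return self.write_quorum

    def is_striped(self) -> bool:
        """True when entries are spread over more bookies than each one is written to."""
        return self.ensemble_size > self.write_quorum


settings = ReplicationSettings(ensemble_size=3, write_quorum=3, ack_quorum=2)
settings.validate()
```

Lowering the ack quorum below the write quorum trades durability at acknowledgement time for lower write latency, since the slowest bookie no longer gates the ack.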
Pulsar Data Stores : Cold Store
• Apache jclouds library
• aws-s3
• google-cloud-storage
• filesystem
• Broker writes to cold store, not bookkeeper.
• bookkeeper => broker => object store
• too much bandwidth?
• Why not offload directly from bookie to object store?
• Schema not stored with data in object store.
• Consumers like Flink, Presto etc. can work with just bookkeeper
Distributed Storage
Pulsar Data Stores : Distribution
tenant
namespace
bundle1 bundle2 bundle3
topic1 topic2 topic3 topic4 topic5 topic6
• Shards (aka namespace Bundles) for load balancing
Pulsar Data Stores : Distribution
tenant
namespace
bundles
• Shards (aka namespace Bundles) for load balancing
topic: ledgers[], schemaLedgers[], compactedLedgers[]
ledgers: ledgerId, entries range, ledger size, offloaded?
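One way to picture bundles as shards is a hash range split into contiguous slices, with each topic name hashed into exactly one slice. The sketch below assumes a 32-bit hash space, equal-sized bundles, and CRC32 as the hash; the function names are hypothetical and the details may differ from what a given Pulsar version does.

```python
import bisect
import zlib
from typing import List

HASH_SPACE = 2 ** 32


def bundle_boundaries(num_bundles: int) -> List[int]:
    """Split the hash space into equal ranges; returns num_bundles + 1 boundaries."""
    return [i * HASH_SPACE // num_bundles for i in range(num_bundles)] + [HASH_SPACE]


def bundle_for_topic(topic: str, boundaries: List[int]) -> int:
    """Index of the bundle whose hash range contains this topic."""
    h = zlib.crc32(topic.encode("utf-8"))  # value in [0, 2**32)
    return bisect.bisect_right(boundaries, h) - 1


boundaries = bundle_boundaries(3)
# Hypothetical topic names, used only for illustration.
placement = {
    name: bundle_for_topic(name, boundaries)
    for name in ("persistent://tenant/namespace/topic1",
                 "persistent://tenant/namespace/topic2")
}
```

Because ownership is assigned per bundle rather than per topic, moving load between brokers means reassigning a bundle, and splitting a hot bundle means inserting a new boundary.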
Bookkeeper Internals
Bookkeeper: Client & Server
• Bookkeeper has no leader / follower.
• All bookies have same responsibilities.
• Thick bookie client implements replication, consistency etc
• Simple bookie APIs
• Resources
• Ledger
• Entry
Bookkeeper: Client & Server
Diagram: the broker (bookkeeper client) sends entries from several ledgers, interleaved (<L1 E0>, <L1 E1>, <L1 E2>, <L2 E0>, <L2 E1>, <L1 E3>, <L2 E2>, <L3 E0>), to a bookie, which appends them to journal 0 on disk.
• First step in pulsar write path
• Distributed WAL
• like databases (MySQL, Postgres etc.)
• sequential writes
• no reads from journal
• write cache
• write isolation with separate disks for journal and ledger
Bookkeeper: Client & Server
Diagram: the same interleaved entries (<L1 E0> through <L3 E0>) go from the broker (bookkeeper client) to the bookie, where they are written both to journal 0 on disk and to the write cache in memory.
Bookkeeper: Read Cache
Diagram labels: write cache, read cache, entry log, L1 index, L2 index, flush. Each entry is a message batch.
Pulsar Data Stores : Read Write Internals
Bookkeeper: Ledgers
• Sequential reads
• Still interleaved across topics
• indexed
• RocksDB
• (ledger, entry id => log file, offset)
• Read path:
• Broker cache
• no network trip,
• no disk access
• Bookkeeper
• Write Cache
• Read Cache
• RocksDB index
• Access from disk
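The bookie-side storage path (journal, write cache, entry log, index, read cache) can be sketched in memory. `MiniBookie` is a hypothetical toy: a dictionary stands in for the RocksDB index, a `bytearray` stands in for a single entry log file, and trimming the journal at flush time is a simplification.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, int]  # (ledger_id, entry_id)


class MiniBookie:
    def __init__(self) -> None:
        self.journal: List[Tuple[Key, bytes]] = []   # append-only WAL, not read on the normal path
        self.write_cache: Dict[Key, bytes] = {}
        self.read_cache: Dict[Key, bytes] = {}
        self.entry_log = bytearray()                  # stand-in for one entry log file
        self.index: Dict[Key, Tuple[int, int]] = {}   # key -> (offset, length)

    def add(self, key: Key, data: bytes) -> None:
        """Append to the journal first, then make the entry readable from memory."""
        self.journal.append((key, data))
        self.write_cache[key] = data

    def flush(self) -> None:
        """Write cached entries to the entry log, sorted so each ledger is contiguous."""
        for key in sorted(self.write_cache):
            data = self.write_cache[key]
            self.index[key] = (len(self.entry_log), len(data))
            self.entry_log += data
        self.write_cache.clear()
        self.journal.clear()  # simplification: flushed entries no longer need the WAL

    def read(self, key: Key) -> Optional[Tuple[bytes, str]]:
        """Return (data, where it was found), or None if unknown."""
        if key in self.write_cache:
            return self.write_cache[key], "write cache"
        if key in self.read_cache:
            return self.read_cache[key], "read cache"
        location = self.index.get(key)
        if location is None:
            return None
        offset, length = location
        data = bytes(self.entry_log[offset:offset + length])
        self.read_cache[key] = data
        return data, "disk"
```

Sorting at flush time is what turns interleaved arrivals into mostly sequential reads per ledger, while the index makes any single entry addressable by (ledger, entry id).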
Bookkeeper: LAC & LAP
• LAC: Last add confirmed
• In response to write().
• This entry and all previous entries are written (cumulative ack).
• A result of the sequential write.
• LAP: Last add pushed
• Readers can read up to LAC
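The difference between LAP and LAC can be shown with a small tracker on the writer side. `LedgerWriter` is a hypothetical class for illustration: LAP advances when an entry is sent, and LAC advances only over a contiguous run of acknowledged entries, which is what makes it a cumulative ack.

```python
class LedgerWriter:
    """Tracks last-add-pushed (LAP) and last-add-confirmed (LAC) for one ledger."""

    def __init__(self) -> None:
        self.last_add_pushed = -1
        self.last_add_confirmed = -1
        self._acked = set()  # acked entry ids that are still ahead of LAC

    def push(self) -> int:
        """Assign the next entry id and record it as sent."""
        self.last_add_pushed += 1
        return self.last_add_pushed

    def on_ack(self, entry_id: int) -> int:
        """Record an ack; LAC moves only while the next entry id is also acked."""
        self._acked.add(entry_id)
        while self.last_add_confirmed + 1 in self._acked:
            self.last_add_confirmed += 1
            self._acked.discard(self.last_add_confirmed)
        return self.last_add_confirmed

    def readable(self, entry_id: int) -> bool:
        """Readers may read only up to and including LAC."""
        return 0 <= entry_id <= self.last_add_confirmed
```

If entry 1 is acknowledged before entry 0, LAC stays at -1 until the ack for entry 0 arrives, then jumps straight to 1.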
Bookkeeper: Fencing
• Recovery after
• bookie failure
• network partition b/w broker and bookkeeper
• New owner (broker)
• Puts ledger state in recovery,
• Fences the ledger with consensus
• Writes to new ledger.
• Old owner can’t write to ledger anymore.
• Consistency
• No split brain
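Fencing can be sketched as a flag on each bookie that makes later writes to that ledger fail. This is a deliberately simplified model: the `Bookie` class, `LedgerFencedError`, and `recover_and_close` are hypothetical names, and taking the highest entry id seen on any bookie is a shortcut for the real recovery procedure.

```python
from typing import Dict, List, Set


class LedgerFencedError(Exception):
    """Raised when a writer tries to add to a fenced ledger."""


class Bookie:
    def __init__(self) -> None:
        self.entries: Dict[int, Dict[int, bytes]] = {}  # ledger_id -> {entry_id: data}
        self.fenced: Set[int] = set()

    def add(self, ledger_id: int, entry_id: int, data: bytes) -> None:
        if ledger_id in self.fenced:
            raise LedgerFencedError(ledger_id)
        self.entries.setdefault(ledger_id, {})[entry_id] = data

    def fence(self, ledger_id: int) -> int:
        """Mark the ledger fenced; return the highest entry id stored here (-1 if none)."""
        self.fenced.add(ledger_id)
        return max(self.entries.get(ledger_id, {}), default=-1)


def recover_and_close(ledger_id: int, bookies: List[Bookie]) -> int:
    """New owner: fence the ledger everywhere, then return the last entry id to record."""
    return max(bookie.fence(ledger_id) for bookie in bookies)
```

After `recover_and_close` runs, the old owner's next `add` on that ledger raises `LedgerFencedError`, so only the new owner, writing to a new ledger, can make progress.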
Metadata Layer
Metadata Layer
1. Pointers to data
a. Topic ledgers mapping
b. Ledger topics mapping
c. Topic schema mapping
2. Service Discovery
a. List of available bookies (read / write / both?)
b. List of available brokers
c. Which broker owns which topic
d. How much load on which topic etc.
3. Distributed coordination
a. Locks
b. Leader election
Metadata Layer
4. System Configuration
a. Dynamic configurations for hot reload
b. Feature flags
5. Provisioning Configuration
a. Metadata for tenants, namespaces etc.
b. Namespace policies
I want to know more..
References:
• https://jack-vanlightly.com/blog/2018/10/2/understanding-how-apache-pulsar-works
• Pulsar without Zookeeper: Introducing the Metadata Access Layer in Pulsar
• TGI Pulsar 009: Introduction of Apache BookKeeper
Q & A time
Drop me a hello at:
https://www.linkedin.com/in/shivjijha/
https://twitter.com/ShivjiJha
Pulsar community
https://apache-pulsar.slack.com/
users@pulsar.apache.org
dev@pulsar.apache.org

More Related Content

What's hot

Working with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDBWorking with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDB
ScaleGrid.io
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
botsplash.com
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
Cloudera, Inc.
 
Learning postgresql
Learning postgresqlLearning postgresql
Learning postgresql
DAVID RAUDALES
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouse
Altinity Ltd
 
PostgreSQLの範囲型と排他制約
PostgreSQLの範囲型と排他制約PostgreSQLの範囲型と排他制約
PostgreSQLの範囲型と排他制約Akio Ishida
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
PGConf APAC
 
祝!PostgreSQLレプリケーション10周年!徹底紹介!!
祝!PostgreSQLレプリケーション10周年!徹底紹介!!祝!PostgreSQLレプリケーション10周年!徹底紹介!!
祝!PostgreSQLレプリケーション10周年!徹底紹介!!
NTT DATA Technology & Innovation
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_Cheatsheet
Lucian Oprea
 
マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)
マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)
マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)
NTT DATA Technology & Innovation
 
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming Patterns
Hao Chen
 
MySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comMySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.com
Jean-François Gagné
 
Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...
Ontico
 
InnoDB Performance Optimisation
InnoDB Performance OptimisationInnoDB Performance Optimisation
InnoDB Performance Optimisation
Mydbops
 
PostgreSQL失敗談
PostgreSQL失敗談PostgreSQL失敗談
PostgreSQL失敗談
Takashi Meguro
 
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016Derek Downey
 
超実践 Cloud Spanner 設計講座
超実践 Cloud Spanner 設計講座超実践 Cloud Spanner 設計講座
超実践 Cloud Spanner 設計講座
Samir Hammoudi
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
Norberto Leite
 
MongoDBの監視
MongoDBの監視MongoDBの監視
MongoDBの監視
Tetsutaro Watanabe
 

What's hot (20)

Working with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDBWorking with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDB
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Learning postgresql
Learning postgresqlLearning postgresql
Learning postgresql
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouse
 
PostgreSQLの範囲型と排他制約
PostgreSQLの範囲型と排他制約PostgreSQLの範囲型と排他制約
PostgreSQLの範囲型と排他制約
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
 
祝!PostgreSQLレプリケーション10周年!徹底紹介!!
祝!PostgreSQLレプリケーション10周年!徹底紹介!!祝!PostgreSQLレプリケーション10周年!徹底紹介!!
祝!PostgreSQLレプリケーション10周年!徹底紹介!!
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_Cheatsheet
 
マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)
マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)
マネージドPostgreSQLの実現に向けたPostgreSQL機能向上(PostgreSQL Conference Japan 2023 発表資料)
 
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming Patterns
 
MySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comMySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.com
 
Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...
 
InnoDB Performance Optimisation
InnoDB Performance OptimisationInnoDB Performance Optimisation
InnoDB Performance Optimisation
 
PostgreSQL失敗談
PostgreSQL失敗談PostgreSQL失敗談
PostgreSQL失敗談
 
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
 
ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016
 
超実践 Cloud Spanner 設計講座
超実践 Cloud Spanner 設計講座超実践 Cloud Spanner 設計講座
超実践 Cloud Spanner 設計講座
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
 
MongoDBの監視
MongoDBの監視MongoDBの監視
MongoDBの監視
 

Similar to How Pulsar Stores Your Data - Pulsar Summit NA 2021

How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
Shivji Kumar Jha
 
Apache Con 2021 : Apache Bookkeeper Key Value Store and use cases
Apache Con 2021 : Apache Bookkeeper Key Value Store and use casesApache Con 2021 : Apache Bookkeeper Key Value Store and use cases
Apache Con 2021 : Apache Bookkeeper Key Value Store and use cases
Shivji Kumar Jha
 
Apache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseApache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use case
Salesforce Engineering
 
KeyValue Stores
KeyValue StoresKeyValue Stores
KeyValue Stores
Mauro Pompilio
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
DataWorks Summit/Hadoop Summit
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
larsgeorge
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
confluent
 
Apache con2016final
Apache con2016final Apache con2016final
Apache con2016final
Salesforce
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
StreamNative
 
London devops logging
London devops loggingLondon devops logging
London devops loggingTomas Doran
 
Object Relational Database Management System
Object Relational Database Management SystemObject Relational Database Management System
Object Relational Database Management System
Amar Myana
 
Introduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed StorageIntroduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed Storage
Streamlio
 
SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.
Julian Hyde
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
confluent
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
Joe Alex
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
MongoDB
 
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
Rainforest QA
 
Pulsar - flexible pub-sub for internet scale
Pulsar - flexible pub-sub for internet scalePulsar - flexible pub-sub for internet scale
Pulsar - flexible pub-sub for internet scale
Matteo Merli
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQL
Richard Schneeman
 
An introduction to Pincaster
An introduction to PincasterAn introduction to Pincaster
An introduction to Pincaster
Frank Denis
 

Similar to How Pulsar Stores Your Data - Pulsar Summit NA 2021 (20)

How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
 
Apache Con 2021 : Apache Bookkeeper Key Value Store and use cases
Apache Con 2021 : Apache Bookkeeper Key Value Store and use casesApache Con 2021 : Apache Bookkeeper Key Value Store and use cases
Apache Con 2021 : Apache Bookkeeper Key Value Store and use cases
 
Apache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseApache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use case
 
KeyValue Stores
KeyValue StoresKeyValue Stores
KeyValue Stores
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
 
Apache con2016final
Apache con2016final Apache con2016final
Apache con2016final
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Object Relational Database Management System
Object Relational Database Management SystemObject Relational Database Management System
Object Relational Database Management System
 
Introduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed StorageIntroduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed Storage
 
SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
 
Pulsar - flexible pub-sub for internet scale
Pulsar - flexible pub-sub for internet scalePulsar - flexible pub-sub for internet scale
Pulsar - flexible pub-sub for internet scale
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQL
 
An introduction to Pincaster
An introduction to PincasterAn introduction to Pincaster
An introduction to Pincaster
 

More from StreamNative

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
StreamNative
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
StreamNative
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
StreamNative
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
StreamNative
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
StreamNative
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
StreamNative
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
StreamNative
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
StreamNative
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
StreamNative
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
StreamNative
 

More from StreamNative (20)

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...



How Pulsar Stores Your Data - Pulsar Summit NA 2021

  • 2. ● SME for Pulsar at Nutanix ● Love ○ Distributed systems ○ Open source ● Hands-on developer, aspiring architect ● Love spending time with data (stores, streams, analytics etc) ● Contributions to Pulsar & MySQL Who am I ? https://www.linkedin.com/in/shivjijha/ https://twitter.com/ShivjiJha
  • 3. Catalogue • Pulsar modules • Data stores • Distributed storage • Storage internals (Read / Write) • Tying together (Metadata layer)
  • 5. Pulsar Modules SERVING LAYER METADATA LAYER STORAGE LAYER CLIENT
  • 6. Pulsar Modules SERVING LAYER (BROKER) METADATA LAYER (ZOOKEEPER) STORAGE LAYER (BOOKKEEPER) CLIENT
  • 8. Pulsar Data Stores SERVING LAYER (BROKER) METADATA LAYER (ZOOKEEPER) STORAGE LAYER (BOOKKEEPER) CLIENT
  • 9. Pulsar Data Stores SERVING LAYER (BROKER) METADATA LAYER (ZOOKEEPER) BOOKKEEPER CLIENT OBJECT STORE
  • 10. Pulsar Data Stores SERVING LAYER (BROKER) METADATA LAYER (ZOOKEEPER) BOOKKEEPER CLIENT OBJECT STORE CACHE
  • 11. Pulsar Data Stores : Overview • Broker Cache • Single broker owner for topic • Primary Store • Bookkeeper • Cold Store • Object Store • Metadata Layer • Zookeeper
  • 15. Pulsar Data Stores : Bookkeeper LEDGER Ledger Metadata Status : Open Last Entry Id : -1 Ensemble Size Write Quorum Size Read Quorum Size Ensembles : [ [], [] ]
  • 16. Pulsar Data Stores : Bookkeeper LEDGER entry 0 Ledger Metadata Status : Open Last Entry Id : 0 Ensemble Size Write Quorum Size Read Quorum Size Ensembles : [ [], [] ]
  • 17. Pulsar Data Stores : Bookkeeper LEDGER entry 0 entry 1 entry 2 Ledger Metadata Status : Closed Last Entry Id : 2 Ensemble Size Write Quorum Size Read Quorum Size Ensembles : [ [], [] ]
  • 18. Pulsar Data Stores : Bookkeeper LEDGER entry 0 entry 1 entry 2 Entry Data Metadata: 1.LedgerId 2.EntryId 3.Last Add Confirmed 4.Digest (CRC32 / CRC32C) Data : byte []
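The entry layout on the slide above (ledger id, entry id, last add confirmed, digest, payload) can be sketched as follows. This is a minimal illustration, not BookKeeper's actual wire format: the field order, the 8-byte big-endian integers, and the names `build_entry` and `verify_entry` are assumptions made for the example.

```python
import struct
import zlib

# Hypothetical layout: three signed 64-bit ints, then a CRC32 digest, then the payload.
HEADER = struct.Struct(">qqq")   # ledger_id, entry_id, last_add_confirmed
DIGEST = struct.Struct(">I")     # CRC32 over header + payload


def build_entry(ledger_id: int, entry_id: int, lac: int, data: bytes) -> bytes:
    """Serialize an entry and attach a CRC32 digest covering metadata and data."""
    header = HEADER.pack(ledger_id, entry_id, lac)
    digest = zlib.crc32(header + data)
    return header + DIGEST.pack(digest) + data


def verify_entry(raw: bytes) -> bytes:
    """Recompute the digest and return the payload, or raise if it was corrupted."""
    header = raw[:HEADER.size]
    (digest,) = DIGEST.unpack(raw[HEADER.size:HEADER.size + DIGEST.size])
    data = raw[HEADER.size + DIGEST.size:]
    if zlib.crc32(header + data) != digest:
        raise ValueError("digest mismatch")
    return data
```

Because the digest covers the metadata as well as the payload, a reader can detect an entry whose ids or data were damaged on disk or in transit.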
  • 19. Pulsar Data Stores : Bookkeeper
  • 20. Pulsar Data Stores : Write Path 1.Client sends write on topic 2.The owner broker (bookkeeper client) writes to current ensemble of bookies 3.Broker waits for #ackQuorum acks from bookies 4.Broker acknowledges to client.
  • 21. Pulsar Data Stores : Write Path 1.Client sends write on topic 2.The owner broker (bookkeeper client) writes to current ensemble of bookies 3.Broker waits for #ackQuorum acks from bookies 4.Broker acknowledges to client. New ledger registered in zookeeper against topicName.
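The write path above can be sketched as a small in-memory simulation. The `Bookie` class and `write_entry` function are hypothetical names for illustration only. A real bookkeeper client sends the writes in parallel and returns as soon as the ack quorum is reached; this sketch sends them one after another and counts the acks at the end.

```python
from dataclasses import dataclass, field


@dataclass
class Bookie:
    """Toy stand-in for a bookie: stores entries in a dict, may be unhealthy."""
    name: str
    healthy: bool = True
    entries: dict = field(default_factory=dict)

    def add_entry(self, ledger_id: int, entry_id: int, data: bytes) -> bool:
        if not self.healthy:
            return False  # no ack from a failed bookie
        self.entries[(ledger_id, entry_id)] = data
        return True


def write_entry(write_set, ledger_id, entry_id, data, ack_quorum) -> bool:
    """Send the entry to every bookie in the write set; succeed on >= ack_quorum acks."""
    acks = sum(b.add_entry(ledger_id, entry_id, data) for b in write_set)
    return acks >= ack_quorum
```

With a write set of three bookies and an ack quorum of two, the write still succeeds when one bookie is down, which is the durability versus latency trade-off the quorum sizes control.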
  • 22. Pulsar Data Stores : Read Path 1.Client sends read on topic 2.The owner broker searches local cache. 3.If cache miss, read from closest bookie. a. On failure try other bookie
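A minimal sketch of the read path above, assuming the broker cache and each bookie can be modelled as plain dicts keyed by (ledger id, entry id). The function name `read_entry` and the "closest first" ordering of the `bookies` list are assumptions for the example.

```python
def read_entry(cache: dict, bookies: list, ledger_id: int, entry_id: int) -> bytes:
    """Serve from the broker cache if possible, otherwise try bookies in order."""
    key = (ledger_id, entry_id)
    if key in cache:
        return cache[key]          # cache hit: no network trip, no disk access
    for bookie in bookies:         # assumed to be ordered closest first
        try:
            return bookie[key]
        except KeyError:
            continue               # on failure, try the next bookie
    raise LookupError(f"entry {key} not found on any bookie")
```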
  • 23. Pulsar Data Stores : Bookkeeper • Internally, journal as a short-lived write queue • Facilitates quick writes (appends) • Write cache to organize writes before flush • Read cache for quick reads • if data not in broker cache • if data not in write cache • Ledgers for organized data interleaving topics. • RocksDB to index data ledger by ledger • Topic to ledger(s) mapping in zookeeper.
  • 24. Pulsar Data Stores : Bookkeeper ● Consistency over availability ● Durability ○ Ensemble ○ Write Quorum ○ Ack Quorum ● Configure how much / how long to store in bookkeeper ● Configure replication factor ● Choose durability vs latency / throughput
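How ensemble size and write quorum interact can be sketched with round-robin striping: each entry goes to a write-quorum-sized window of the ensemble, and the window shifts by one bookie per entry. The function name `write_set` and the bookie names are hypothetical, and the code is a simplified illustration rather than BookKeeper's implementation.

```python
def write_set(entry_id: int, ensemble: list, write_quorum: int) -> list:
    """Pick the bookies that should store this entry, striping across the ensemble."""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(write_quorum)]


ensemble = ["bookie-0", "bookie-1", "bookie-2"]  # ensemble size 3
for entry_id in range(3):
    print(entry_id, write_set(entry_id, ensemble, write_quorum=2))
```

When the write quorum equals the ensemble size, every bookie stores every entry. A smaller write quorum spreads entries across bookies, which trades some redundancy for throughput.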
  • 25. Pulsar Data Stores : Cold Store • jclouds library • aws-s3 • google-cloud-storage • filesystem • Broker writes to cold store, not bookkeeper. • bookkeeper => broker => object store • too much bandwidth? • Why not offload directly from bookie to object store? • Schema not stored with data in object store. • consumers like flink, presto etc can work with just bookkeeper
  • 27. Pulsar Data Stores : Distribution tenant namespace bundle1 bundle2 bundle3 T O P I C 1 T O P I C 2 T O P I C 3 T O P I C 4 T O P I C 5 T O P I C 6 • Shards (aka namespace Bundles) for load balancing
  • 28. Pulsar Data Stores : Distribution tenant namespace bundles • Shards (aka namespace Bundles) for load balancing topic ledgers
  • 29. Pulsar Data Stores : Distribution tenant namespace bundles • Shards (aka namespace Bundles) for load balancing topic ledgers ledgers[], schemaLedgers[], compactedLedgers[]
  • 30. Pulsar Data Stores : Distribution tenant namespace bundles • Shards (aka namespace Bundles) for load balancing topic ledgers ledgerId, entries range, ledger size, offloaded? ledgers[], schemaLedgers[], compactedLedgers[]
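The topic-to-bundle sharding above can be sketched as a hash range lookup: a namespace's hash space is split into contiguous ranges (bundles), and a topic belongs to the bundle whose range contains the hash of its name. The function names, the even split, and the use of CRC32 as the hash are assumptions for this example; Pulsar's actual hash function and bundle boundaries may differ.

```python
import bisect
import zlib

HASH_SPACE = 2 ** 32  # assumed 32-bit hash space


def bundle_boundaries(num_bundles: int) -> list:
    """Split the hash space into num_bundles contiguous ranges."""
    step = HASH_SPACE // num_bundles
    return [i * step for i in range(num_bundles)] + [HASH_SPACE]


def bundle_for(topic: str, boundaries: list) -> tuple:
    """Return the (start, end) range of the bundle that owns this topic."""
    h = zlib.crc32(topic.encode("utf-8"))  # stand-in hash function
    i = bisect.bisect_right(boundaries, h) - 1
    return boundaries[i], boundaries[i + 1]
```

Since a broker owns whole bundles rather than single topics, load balancing can move or split a bundle without tracking every topic individually.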
  • 32. Bookkeeper: Client & Server • Bookkeeper has no leader / follower. • All bookies have same responsibilities. • Thick bookie client implements replication, consistency etc • Simple bookie APIs • Resources • Ledger • Entry
  • 33. Bookkeeper: Client & Server ENTRY <L1 E0> ENTRY <L1 E1> ENTRY <L1 E2> ENTRY <L2 E0> ENTRY <L2 E1> ENTRY <L1 E3> ENTRY <L2 E2> ENTRY <L3 E0> BOOKKEEPER CLIENT BROKER JOURNAL 0 DISK BOOKKEEPER
  • 34. Bookkeeper: Client & Server ENTRY <L1 E0> ENTRY <L1 E1> ENTRY <L1 E2> ENTRY <L2 E0> ENTRY <L2 E1> ENTRY <L1 E3> ENTRY <L2 E2> ENTRY <L3 E0> BOOKKEEPER CLIENT BROKER JOURNAL 0 DISK BOOKKEEPER • First step in pulsar write path • Distributed WAL • like databases (MySQL, postgres etc) • sequential writes • no reads from journal • write cache • write isolation with separate disks for journal and ledger
  • 35. Bookkeeper: Client & Server ENTRY <L1 E0> ENTRY <L1 E1> ENTRY <L1 E2> ENTRY <L2 E0> ENTRY <L2 E1> ENTRY <L1 E3> ENTRY <L2 E2> ENTRY <L3 E0> BOOKKEEPER CLIENT BROKER JOURNAL 0 WRITE CACHE DISK MEMORY BOOKKEEPER
  • 36. Bookkeeper: Read Cache write cache read cache entry log L1 index L2 index flush
  • 37. Bookkeeper: Read Cache write cache read cache entry log L1 index L2 index flush entry => message batch
  • 38. Pulsar Data Stores : Read Write Internals
  • 39. Bookkeeper: Ledgers • Sequential reads • Still interleaved across topics • indexed • RocksDB • (ledger, entry id => log file, offset) • Read path: • Broker cache • no n/w trip, • no disk access • Bookkeeper • Write Cache • Read Cache • RocksDB index • Access from disk
  • 40. Bookkeeper: Ledgers • Sequential reads • Still interleaved across topics • indexed • RocksDB • (ledger, entry id => log file, offset) • Read path: • Broker cache • no n/w trip, • no disk access • Bookkeeper • Write Cache • Read Cache • RocksDB index • Access from disk
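The index described above, (ledger, entry id) => (log file, offset), can be sketched with an in-memory entry log. In this illustration a dict stands in for the RocksDB index and a single bytearray stands in for one entry log file; the class name `EntryLog` and its methods are hypothetical.

```python
class EntryLog:
    """Toy entry log: entries from many ledgers are appended to one interleaved log."""

    def __init__(self):
        self.log = bytearray()   # stand-in for an entry log file on disk
        self.index = {}          # stand-in for RocksDB: (ledger, entry) -> (offset, length)

    def append(self, ledger_id: int, entry_id: int, data: bytes) -> None:
        # Record where the entry starts before appending it to the shared log.
        self.index[(ledger_id, entry_id)] = (len(self.log), len(data))
        self.log += data

    def read(self, ledger_id: int, entry_id: int) -> bytes:
        # One index lookup, then one positioned read from the log.
        offset, length = self.index[(ledger_id, entry_id)]
        return bytes(self.log[offset:offset + length])
```

Writes stay sequential because entries are appended in arrival order regardless of ledger, and the index is what makes a read of a specific ledger entry cheap despite the interleaving.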
  • 41. Bookkeeper: LAC & LAP • LAC : Last add confirmed • In response to write(). • This entry and all previous written (cumulative ack). • As a result of the sequential write. • LAP : Last add pushed • Readers can read until LAC
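The relationship between LAP and LAC above can be sketched as two counters: LAP advances when an entry is sent, and LAC advances only once every entry up to that id has been acknowledged. The class `LedgerWriter` and its method names are hypothetical, and the sketch ignores quorums and failures.

```python
class LedgerWriter:
    """Tracks last add pushed (LAP) and last add confirmed (LAC) for one ledger."""

    def __init__(self):
        self.last_add_pushed = -1
        self.last_add_confirmed = -1
        self._acked = set()   # entry ids acked but not yet covered by LAC

    def push(self) -> int:
        """Assign the next entry id and mark it as sent."""
        self.last_add_pushed += 1
        return self.last_add_pushed

    def ack(self, entry_id: int) -> None:
        """Record an ack; advance LAC over any contiguous run of acked entries."""
        self._acked.add(entry_id)
        while self.last_add_confirmed + 1 in self._acked:
            self.last_add_confirmed += 1
            self._acked.discard(self.last_add_confirmed)
```

If entry 1 is acknowledged before entry 0, LAC stays at -1 until entry 0 is acknowledged and then jumps to 1, which is why readers that stop at LAC never see a gap.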
  • 42. Bookkeeper: Fencing • Recovery after • bookie failure • network partition b/w broker and bookkeeper • New owner (broker) • Puts ledger state in recovery, • Fences the ledger with consensus • Writes to new ledger. • Old owner can’t write to ledger anymore. • Consistency • No split brain
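The effect of fencing above can be sketched on a single bookie: once a ledger is fenced, further adds to it are rejected, so the old owner cannot keep writing. The class `FencingBookie` and its methods are hypothetical, and the sketch leaves out the quorum of bookies that a real fence must reach.

```python
class FencingBookie:
    """Toy bookie that rejects writes to ledgers that have been fenced."""

    def __init__(self):
        self.fenced = set()
        self.entries = {}

    def fence(self, ledger_id: int) -> None:
        """Called by the new owner during recovery."""
        self.fenced.add(ledger_id)

    def add_entry(self, ledger_id: int, entry_id: int, data: bytes) -> None:
        if ledger_id in self.fenced:
            raise PermissionError(f"ledger {ledger_id} is fenced")
        self.entries[(ledger_id, entry_id)] = data
```

The old owner discovers it has lost the ledger when its next write is rejected, and the new owner continues on a fresh ledger, so two writers never append to the same ledger.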
  • 44. Metadata Layer 1.Pointers to data a.Topic ledgers mapping b.Ledger topics mapping c.Topic schema mapping 2.Service Discovery a.List of available bookies (read / write / both?) b.List of available brokers c.Which broker owns which topic d.How much load on which topic etc 3.Distributed coordination a.Locks b.Leader election
  • 45. Metadata Layer 4. System Configuration a.Dynamic configurations for hot reload b.feature flags 5. Provisioning Configuration a.Metadata for tenants, namespaces etc b.Namespace policies
  • 46. I want to know more..
  • 47. References: • https://jack-vanlightly.com/blog/2018/10/2/understanding-how-apache-pulsar-works • Pulsar without Zookeeper: Introducing the Metadata Access Layer in Pulsar • TGI Pulsar 009: Introduction of Apache BookKeeper
  • 48. Q & A time Drop me a hello at: https://www.linkedin.com/in/shivjijha/ https://twitter.com/ShivjiJha Pulsar community https://apache-pulsar.slack.com/ users@pulsar.apache.org dev@pulsar.apache.org