Apache BookKeeper and Apache ZooKeeper for Apache Pulsar

Oct. 5, 2021

  1. Apache BookKeeper and Apache ZooKeeper for Apache Pulsar
     Enrico Olivelli, DataStax - Luna Streaming Team
     Member of Apache Pulsar, Apache BookKeeper and Apache ZooKeeper PMC, Apache Curator VP
  2. Agenda
     ● Introduction to Apache Pulsar architecture
     ● Overview of Apache ZooKeeper
     ● Overview of Apache BookKeeper
     ● ManagedLedger: Pulsar and BookKeeper
     ● Handling Failures while guaranteeing Consistency
     ● Live Demo with BKVM (BookKeeper Visual Manager)
  3. Apache Pulsar Architecture
     A cloud-native, distributed messaging and streaming platform
     Components of a Pulsar Cluster:
     - Clients
     - Brokers
     - Bookies
     - ZooKeeper cluster
     - Proxy (optional)
     - Functions Workers (optional)
     - Pulsar IO (optional)
     - Tiered Storage (optional)
     [Diagram: Producers, Consumers, Proxies, Brokers, Bookies, ZooKeeper nodes, Functions, Pulsar IO, Object Storage]
  4. Apache Pulsar - Core Concepts
     - Topic:
       - Sequence of Messages
       - Persistent/Non-Persistent
       - Partitioned/Non-Partitioned
     - Tenant and Namespace:
       - Logical and physical isolation of resources
       - Fine-grained configuration (topic/namespace/tenant/system levels)
     - Subscription:
       - A cursor over a topic (tracks status of acknowledgements)
       - Modes: Exclusive, Failover, Shared, Key Shared
       - Types: Durable, Non-Durable
     - Producer:
       - Normal, Exclusive
  5. Apache ZooKeeper
     - Born at Yahoo! and donated to the Apache Software Foundation
     - Offers primitives for distributed systems coordination
     - Implements a filesystem-like structure
       - znodes are like directories and files
       - Easy to understand
       - No need for shared disks
     - Strict ordering of operations
       - Leader node + Followers (ZAB protocol)
       - Enforced in the client
     - Sessions
       - Explicit notion of “lost connection”
       - Heartbeat-based expiration
       - Ephemeral nodes
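The link between sessions, heartbeats and ephemeral nodes can be illustrated with a toy in-memory model. This is only a sketch: `ToyCoordinator`, its methods, the session names and the `/brokers/...` paths are hypothetical and are not part of the ZooKeeper API, and time is passed in explicitly to keep the example deterministic.

```python
class ToyCoordinator:
    """Toy model of sessions and ephemeral nodes (not the ZooKeeper API)."""

    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.last_heartbeat = {}   # session id -> time of last heartbeat
        self.ephemerals = {}       # path -> owning session id

    def heartbeat(self, session_id, now):
        self.last_heartbeat[session_id] = now

    def create_ephemeral(self, path, session_id):
        if session_id not in self.last_heartbeat:
            raise KeyError("unknown or expired session")
        self.ephemerals[path] = session_id

    def expire_sessions(self, now):
        # A session expires when it has been silent longer than the timeout.
        expired = {s for s, t in self.last_heartbeat.items()
                   if now - t > self.session_timeout}
        for s in expired:
            del self.last_heartbeat[s]
        # Ephemeral nodes disappear together with their owning session.
        self.ephemerals = {p: s for p, s in self.ephemerals.items()
                           if s not in expired}
        return expired


zk = ToyCoordinator(session_timeout=10)
zk.heartbeat("broker-a", now=0)
zk.heartbeat("broker-b", now=0)
zk.create_ephemeral("/brokers/a", "broker-a")
zk.create_ephemeral("/brokers/b", "broker-b")

zk.heartbeat("broker-b", now=8)        # only broker-b keeps heartbeating
expired = zk.expire_sessions(now=12)   # broker-a silent for 12 > 10
```

After the expiration pass only the node owned by `broker-b` remains, which is the property that makes ephemeral nodes usable for service discovery and leader election.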
  6. Apache ZooKeeper in Apache Pulsar
     - Service Discovery
     - Leader Election
     - Metadata Management
     - Configuration Management
     - Used by Apache BookKeeper
     [Diagram: three Brokers, three Bookies, three ZooKeeper nodes]
  7. Apache ZooKeeper - Conditional Writes (CW)
     - Every znode has a version, a (small) content and possibly (few) children
       setData(content, expectedVersion)
     - Basic building block to ensure consistency
     - Only the owner can update the znode
     - Version conflict -> fail, assume you are no longer the owner
     - Successful write -> prevents others from performing the same write (version automatically incremented)
     - Only one Broker can make progress at a time while working on metadata
     This is not enough to ensure the overall consistency of the system!
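A minimal sketch of the conditional-write rule, using an in-memory stand-in for a single znode. The `Znode` class, `BadVersionError` and the `ledgers=[...]` payloads are illustrative assumptions, not the ZooKeeper client API; only the compare-version-then-increment behaviour comes from the slide.

```python
class BadVersionError(Exception):
    """Raised when the expected version does not match the stored one."""


class Znode:
    """Toy in-memory stand-in for one znode (not the ZooKeeper client)."""

    def __init__(self, content=b""):
        self.content = content
        self.version = 0

    def set_data(self, content, expected_version):
        # The write succeeds only if nobody else wrote since we last read.
        if expected_version != self.version:
            raise BadVersionError(
                f"expected {expected_version}, found {self.version}")
        self.content = content
        self.version += 1   # version is bumped automatically on success
        return self.version


znode = Znode(b"ledgers=[123]")
version_seen_by_a = znode.version   # broker A reads version 0
version_seen_by_b = znode.version   # broker B reads version 0

znode.set_data(b"ledgers=[123,124]", version_seen_by_a)   # A wins

try:
    znode.set_data(b"ledgers=[123,125]", version_seen_by_b)  # stale version
    b_is_owner = True
except BadVersionError:
    b_is_owner = False   # B must assume it is no longer the owner
```

Broker B's write is rejected because the version moved from 0 to 1 when A wrote, so B learns that it lost ownership without any extra coordination.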
  8. Apache BookKeeper
     - Born at Yahoo! and donated to the Apache Software Foundation
     - Subproject of ZooKeeper, then graduated as a top-level project (TLP)
     - Implements a high-performance distributed storage system
     - Thick Java Client
     - Bookie server: storage only
     - Horizontally scalable
     - Write/Read path isolation
     - Durability (journal/fsync)
     - Replication
     - Advanced placement policies
  9. The Broker - the Heart of Pulsar
     Each Broker is the Owner for a given set of topic bundles:
     - Handles reads/writes
     - Redirects requests for non-owned bundles to other brokers
     - Handles subscription, consumer and producer status
     - Keeps non-persistent topics data in memory
     - Manages Schemas
     - Handles cluster-wide requests
     The Broker uses Apache BookKeeper to store:
     - Messages
     - Subscriptions (acks)
     - Schema
     - Code packages (new in 2.8)
  10. The Broker - Data flow when a message is produced
      The Broker receives a request to publish a message:
      ● Verify topic ownership
      ● Verify authorization
      ● Locate the ManagedLedger instance
      ● Pass the encoded entry (single message or a batch) to ManagedLedger
      ● ManagedLedger passes the entry to the active Ledger WriteHandle
      ● The BK client sends the entry in parallel to the Bookies
      [Diagram: Producer, Broker, ManagedLedger, three Bookies]
  11. The Broker - Data flow when a message is produced
      The Broker receives a request to publish a message:
      ● Verify topic ownership
      ● Verify authorization
      ● Locate the ManagedLedger instance
      ● Pass the encoded entry (single message or a batch) to ManagedLedger
      ● ManagedLedger passes the entry to the active Ledger WriteHandle
      ● The BK client sends the entry in parallel to the Bookies
      ● Wait for acknowledgement from the Bookies
      ● Acknowledge the write back to the Pulsar client
      ● Now the Message ID is available to the client (LedgerID-EntryID...)
      [Diagram: Producer, Broker, ManagedLedger, three Bookies]
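The publish steps above can be sketched with in-memory stand-ins. Everything here is hypothetical scaffolding (`ToyBookie`, `ToyWriteHandle`, `publish`), authorization is left out, and waiting for every bookie to acknowledge is a simplification of the real quorum rules. The returned pair only mirrors the ledger ID and entry ID part of the Message ID.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyBookie:
    """In-memory stand-in for a bookie: stores entries and acknowledges."""

    def __init__(self):
        self.entries = {}   # (ledger_id, entry_id) -> payload

    def add_entry(self, ledger_id, entry_id, payload):
        self.entries[(ledger_id, entry_id)] = payload
        return True   # acknowledgement


class ToyWriteHandle:
    """Stand-in for the active ledger write handle."""

    def __init__(self, ledger_id, bookies):
        self.ledger_id = ledger_id
        self.bookies = bookies
        self.next_entry_id = 0

    def append(self, payload):
        entry_id = self.next_entry_id
        self.next_entry_id += 1
        # Send the same entry to every bookie in parallel, then wait for acks.
        with ThreadPoolExecutor(max_workers=len(self.bookies)) as pool:
            acks = list(pool.map(
                lambda bookie: bookie.add_entry(self.ledger_id, entry_id, payload),
                self.bookies))
        if not all(acks):
            raise IOError("write not acknowledged by all bookies")
        return (self.ledger_id, entry_id)


def publish(topic, payload, owned_topics, handles):
    # Verify ownership before touching storage (authorization omitted).
    if topic not in owned_topics:
        raise PermissionError(f"broker does not own {topic}")
    return handles[topic].append(payload)


bookies = [ToyBookie() for _ in range(3)]
topic = "persistent://public/default/test"
handles = {topic: ToyWriteHandle(ledger_id=123, bookies=bookies)}

first_id = publish(topic, b"hello", {topic}, handles)
second_id = publish(topic, b"world", {topic}, handles)
```

The identifier is only handed back after the bookies have acknowledged, which is why the client learns the Message ID at the very end of the flow.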
  12. The Bookie - When the message is persisted
      The Write path and the Read path are separated inside the Bookie.
      Write path:
      - The Bookie receives a copy of the entry
      - The entry is written to the Journal
      - The Journal acknowledges the write after a successful fsync
      - Entries are grouped in order to reduce the number of fsyncs
      - The Bookie acknowledges the operation to the Client
      The BookKeeper Client is responsible for:
      - Selecting the Bookies (zone/region awareness)
      - Waiting for confirmation
      - Retransmissions
      - Computing a checksum of the raw payload
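Grouping entries so that one fsync covers several writes can be sketched as follows. `ToyJournal` is a hypothetical model: the fsync is simulated with a counter, and flushing on a fixed batch size is an assumption made for the example (a real journal would also need some time-based trigger).

```python
class ToyJournal:
    """Toy group-commit journal: one simulated fsync per batch of entries."""

    def __init__(self, max_batch):
        self.max_batch = max_batch
        self.pending = []      # (entry, callback) pairs waiting for fsync
        self.durable = []      # entries covered by a completed fsync
        self.fsync_count = 0

    def add(self, entry, on_ack):
        self.pending.append((entry, on_ack))
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.durable.extend(entry for entry, _ in batch)
        self.fsync_count += 1   # a single fsync covers the whole batch
        # Acknowledge each entry only after the fsync has completed.
        for entry, on_ack in batch:
            on_ack(entry)


acked = []
journal = ToyJournal(max_batch=3)
for i in range(5):
    journal.add(f"entry-{i}", acked.append)
journal.flush()   # flush the partially filled last batch
```

Five entries end up durable after two simulated fsyncs instead of five, and no entry is acknowledged before the fsync that covers it.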
  13. The Broker - ManagedLedger abstraction
      BookKeeper relies on ZooKeeper CW features to guarantee consistency of metadata.
      BookKeeper Ledger: a write-once, append-only sequence of entries (byte[])
      The Pulsar ManagedLedger is an abstraction over the BookKeeper Ledger:
      - Implements an infinite append-only stream of entries
      - Concatenates BK ledgers (metadata only)
      - Implements Cursors (support for durable subscriptions)
      - Implements Tiered Storage
      [Diagram: topic persistent://public/default/test backed by Ledger 123, Ledger 124, Ledger 137, Ledger 156, Ledger 168]
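To illustrate how a list of ledgers can behave like one continuous stream, the sketch below maps a position in the concatenated stream to a (ledger ID, entry ID) pair. The ledger IDs are the ones shown on the slide; the entry counts and the `locate` helper are made up, and the global offset is only an illustration device, since entries are addressed by ledger ID and entry ID.

```python
from bisect import bisect_right
from itertools import accumulate

# (ledger_id, number_of_entries); the counts are hypothetical.
ledgers = [(123, 10), (124, 5), (137, 0), (156, 20), (168, 7)]

# Cumulative entry counts: position where each ledger ends in the stream.
ends = list(accumulate(count for _, count in ledgers))


def locate(offset):
    """Map a position in the concatenated stream to (ledger_id, entry_id)."""
    if not 0 <= offset < ends[-1]:
        raise IndexError(f"offset {offset} is outside the stream")
    index = bisect_right(ends, offset)          # first ledger ending after offset
    start = ends[index - 1] if index > 0 else 0
    return ledgers[index][0], offset - start


first = locate(0)
boundary = locate(10)       # first entry of the second ledger
after_empty = locate(15)    # the empty ledger 137 is skipped
last = locate(41)
```

Only the ordered list of ledgers is needed for this lookup, which matches the "metadata only" note: concatenation does not copy or move any entry data.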
  14. Handling Failures and ensuring Consistency
      Failures on Broker:
      - Network error/partition
      - Overwhelmed Broker (Garbage collection, out of memory/CPU)
      - Shutdown (or forced Bundle unload)
      - ….
      A new Broker becomes the Owner for the Topic (ManagedLedger):
      - Perform recovery on the current BK ledger
      - Create a new Ledger on BK
      - Append the new Ledger ID to the list of Ledgers
      - Serve write requests (verify that it is the owner for each operation!)
      More than one broker may start this recovery process!
      ZooKeeper CW covers metadata operations, but it does not help in the hot write path.
  15. BookKeeper Fencing and Recovery
      - The new Broker opens the ledger in Recovery mode
      - The BookKeeper Client reads every entry from the Bookies:
        - Discovers the max valid entry id
        - Sets the ledger fenced flag on the Bookies (on disk)
      - Writes the new status of the Ledger to ZooKeeper
        - This may fail during a CW operation!
        - Only one broker can perform a successful recovery!
      - The old broker:
        - Receives a “Ledger Fenced error” on the next write
        - Receives a “Bad Version error” while writing to ZooKeeper (if trying to append a new ledger ID)
        - May receive a Watch Notification from ZooKeeper
      At every write BookKeeper ensures the ownership of the Topic.
      BookKeeper fencing + ZooKeeper CW guarantee consistency of Pulsar.
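The fencing idea can be sketched with toy bookies that remember a fenced flag per ledger. All names here (`ToyBookie`, `fence_and_read_last`, `recover`, `LedgerFencedError`) are hypothetical, and the recovery is heavily simplified: it fences every bookie and takes the highest entry ID, ignoring quorums, tail-entry repair and the metadata update in ZooKeeper.

```python
class LedgerFencedError(Exception):
    """Raised when a write reaches a bookie that has fenced the ledger."""


class ToyBookie:
    def __init__(self):
        self.entries = {}     # ledger_id -> list of payloads
        self.fenced = set()   # ledger ids that no longer accept writes

    def add_entry(self, ledger_id, payload):
        if ledger_id in self.fenced:
            raise LedgerFencedError(ledger_id)
        self.entries.setdefault(ledger_id, []).append(payload)

    def fence_and_read_last(self, ledger_id):
        # Set the fenced flag first, then report the last stored entry id.
        self.fenced.add(ledger_id)
        return len(self.entries.get(ledger_id, [])) - 1


def recover(ledger_id, bookies):
    """Fence the ledger everywhere and return the highest entry id seen."""
    return max(b.fence_and_read_last(ledger_id) for b in bookies)


bookies = [ToyBookie() for _ in range(3)]

# The old broker writes two entries to ledger 123.
for payload in (b"m0", b"m1"):
    for bookie in bookies:
        bookie.add_entry(123, payload)

# The new broker takes over and recovers the ledger.
last_entry_id = recover(123, bookies)

# The old broker has not noticed yet and tries to write again.
try:
    bookies[0].add_entry(123, b"m2")
    old_write_rejected = False
except LedgerFencedError:
    old_write_rejected = True
```

The old broker is stopped by the storage layer itself on its next write, which is the piece that conditional writes on metadata alone cannot provide.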
  16. Live Demo - Inspect a Pulsar Standalone instance
      - Start Pulsar Standalone
      - Use Visual Studio Code to inspect ZooKeeper contents
      - Use BKVM to inspect BookKeeper contents
      - Write to public/default/test
      - Unload the topic
      - See that the ManagedLedger created a new Ledger
  17. Wrapping up
      ● ZooKeeper and BookKeeper came from Yahoo!, as did Pulsar!
      ● Pulsar ManagedLedger is the high-level abstraction over BookKeeper.
      ● ZooKeeper provides support for Metadata Management, Service Discovery, Configuration and Leader Election.
      ● Conditional Writes (CW) guarantee consistency for Metadata operations.
      ● The Fencing mechanism of BookKeeper ensures Consistency on the write path.
      ● In no case are two brokers able to write concurrently to a Topic; one of them will eventually fail.
  18. References
      LinkedIn: linkedin.com/in/enrico-olivelli-984b7874/
      Twitter: twitter.com/eolivelli
      Apache Pulsar Community: pulsar.apache.org/en/contact/ (Slack, ML…)
      References:
      - Apache Pulsar website: pulsar.apache.org - github.com/apache/pulsar
      - Apache BookKeeper website: bookkeeper.apache.org - github.com/apache/bookkeeper
      - Apache ZooKeeper website: zookeeper.apache.org - github.com/apache/zookeeper
      - BKVM website: bkvm.org - github.com/diennea/bookkeeper-visual-manager
  19. Thank you!
      We are hiring: https://www.datastax.com/company/careers
