Apache BookKeeper: A High Performance and Low Latency Storage Service

Apache BookKeeper is a high performance and low latency storage service optimized for storing immutable and append-only data (such as logs, streaming events, and objects). Sijie Guo and JV share their experience with Apache BookKeeper. This talk covers the motivation for and an overview of BookKeeper, dives into implementation details, and describes the use cases built upon it.

  1. Apache BookKeeper: A High Performance and Low Latency Storage Service. @sijieg (Sijie Guo, Twitter), @jvjujjuri (JV, Salesforce)
  2. Hello! I am Sijie Guo
      - PMC Chair of Apache BookKeeper
      - Co-creator of Apache DistributedLog
      - Twitter Messaging/Pub-Sub Team
      - Yahoo! R&D Beijing
  3. Challenges in Distributed Systems
  4. Expect Failures: up to 10% annual failure rates for disks/servers
  5. Symptoms
  6. Problem 1: Not Available
  7. Problem 1: Not Available
  8. Problem 2: Inconsistencies
  9. CAP
  10. More Issues
  11. Problem 3: Split Brain [diagram labels: Writer A, Write A’, Two Writers]
  12. Problem 4: Failure Detection [diagram: nodes A, B, C]
  13. Problem 5: Recovery [diagram: nodes A, B, C; labels: Recovery Protocol, Consistency]
  14. Solutions
  15. Overview: Enter Apache BookKeeper
  16. BookKeeper - Durable Storage
      A Durable Storage Optimized for Immutable Data
      Serve as a building block for reliable systems
      [diagram labels: Commodity Hardware, Durability, Replication, Consistency, Recovery, Client Library]
  17. Immutable Data Abstraction
  18. Ledger
      ◉ Segment
      ◉ Block / Object
      ◉ Append-Only File
      ◉ ...
  19. Guarantees
      If an entry has been acknowledged, it must be readable
      If an entry is read once, it must always be readable
  20. History
      ◉ Initial Use Case - Hadoop NameNode HA
      ◉ 2008: Open Sourced Contrib of ZooKeeper
      ◉ 2011: Sub-Project of ZooKeeper
      ◉ 2012: Yahoo! Push Notification
      ◉ 2012~Now: DistributedLog, Pulsar, Majordodo
      ◉ 2015~Now: Salesforce Distributed Store
  21. Inside of Apache BookKeeper: Details
  22. Architecture [diagram labels: APP, Client, Metadata Store, Bookie (x3), Ledger]
  23. Reliable Writes
      ◉ Store checksum along with entry
      ◉ Fsync entries before responding
      ◉ Ack when accepted by quorum:
          ○ All Previous Entries
          ○ This Entry
      [diagram: three bookies]
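
The reliable-write rules on slide 23 can be illustrated with a minimal sketch. The class `InMemoryBookie` and the function `write_entry` are hypothetical names invented for this illustration and are not the BookKeeper client API; CRC32 stands in for whatever checksum the real system uses, and the durable fsync step is only noted in a comment.

```python
import zlib


class InMemoryBookie:
    """Hypothetical stand-in for a bookie (not the real BookKeeper server)."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.entries = {}

    def add(self, entry_id, payload, checksum):
        # An unreachable bookie never acknowledges.
        if not self.healthy:
            return False
        # Reject entries whose payload does not match the stored checksum.
        if zlib.crc32(payload) != checksum:
            return False
        # A real bookie would fsync to its journal before responding;
        # this sketch only keeps the entry in memory.
        self.entries[entry_id] = (payload, checksum)
        return True


def write_entry(bookies, entry_id, payload, ack_quorum):
    """Send one entry to every bookie; succeed only if a quorum accepted it."""
    checksum = zlib.crc32(payload)
    acks = sum(1 for bookie in bookies if bookie.add(entry_id, payload, checksum))
    return acks >= ack_quorum
```

With three bookies of which one is down, a write that needs two acknowledgements succeeds, while one that needs all three does not. The "all previous entries" half of the ack rule is not modeled here.
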
  24. Consistency - LastAddPushed [diagram: entry sequence with a LastAddPushed marker; the writer adds entries]
  25. Consistency - LastAddConfirmed [diagram: entry sequence with LastAddConfirmed markers seen by the writer and by readers; the writer adds entries and acks adds; labels: Ownership Changed, Fencing]
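
As a rough sketch of how the two markers on slides 24 and 25 relate, assume LastAddPushed is the highest entry id the writer has sent and LastAddConfirmed (LAC) is the highest entry id for which it and every earlier entry have been acknowledged by a quorum. The class and method names below are hypothetical and are not taken from the BookKeeper code base.

```python
class LedgerWriterState:
    """Tracks LastAddPushed and LastAddConfirmed for one writer (illustrative only)."""

    def __init__(self):
        self.last_add_pushed = -1      # highest entry id sent so far
        self.last_add_confirmed = -1   # highest id with no unacked predecessor
        self._quorum_acked = set()     # acked ids still waiting on a predecessor

    def push(self):
        """Assign the next entry id and record it as pushed."""
        self.last_add_pushed += 1
        return self.last_add_pushed

    def on_quorum_ack(self, entry_id):
        """Record a quorum ack; advance LAC only across a gap-free prefix."""
        self._quorum_acked.add(entry_id)
        while self.last_add_confirmed + 1 in self._quorum_acked:
            self.last_add_confirmed += 1
            self._quorum_acked.remove(self.last_add_confirmed)
        return self.last_add_confirmed
```

Acks may arrive out of order: if entry 1 is acknowledged before entry 0, the LAC stays at -1 until entry 0 is acknowledged, and then moves directly to 1. Under this assumption, readers that only read up to the LAC never observe a gap.
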
  26. Fencing
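
The slide gives only the title, so the following is a sketch of the general idea of fencing under stated assumptions rather than BookKeeper's actual protocol: when ownership changes, the new owner marks the ledger as fenced on the bookies, and any later add from the old writer is rejected. `FencingBookie` and `LedgerFencedError` are hypothetical names.

```python
class LedgerFencedError(Exception):
    """Raised when a writer tries to add to a ledger that has been fenced."""


class FencingBookie:
    """Hypothetical bookie that refuses adds to fenced ledgers."""

    def __init__(self):
        self.fenced_ledgers = set()
        self.entries = {}

    def fence(self, ledger_id):
        # Called on behalf of the new owner before it recovers the ledger.
        self.fenced_ledgers.add(ledger_id)

    def add(self, ledger_id, entry_id, payload):
        if ledger_id in self.fenced_ledgers:
            raise LedgerFencedError(f"ledger {ledger_id} is fenced")
        self.entries[(ledger_id, entry_id)] = payload
```

Once enough bookies have fenced the ledger, the old writer can no longer gather a quorum of acks, which is what rules out the two-writer split-brain case from slide 11.
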
  27. Read Entry & Read LAC
      ◉ Read Entry K: Speculative Reads On Timeouts [diagram: Client, bookies B1, B2, B3]
      ◉ Read LAC: Quorum Read [diagram: Client, bookies B1, B2, B3]
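
To see why speculative reads on timeouts help tail latency, here is a small deterministic model, not client code. It assumes the client sends the read to the first bookie at time 0 and, each time the speculative timeout elapses without a response, sends the same read to the next bookie. The function name and the latency figures are made up for illustration.

```python
def speculative_read_latency(latencies_ms, speculative_timeout_ms):
    """Return the time at which the first response arrives.

    latencies_ms[k] is how long bookie k takes to answer once asked.
    Request k is sent at k * speculative_timeout_ms, but only if no
    response has arrived by then.
    """
    first_response = float("inf")
    for k, latency in enumerate(latencies_ms):
        send_time = k * speculative_timeout_ms
        if send_time >= first_response:
            break  # a response already arrived; no further requests needed
        first_response = min(first_response, send_time + latency)
    return first_response
```

In this model, a slow first bookie (500 ms) with a 10 ms speculative timeout and a healthy second bookie (5 ms) yields a response at 15 ms instead of 500 ms, at the cost of one extra request.
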
  28. Long Poll Read [diagram: Client, bookies B1, B2, B3; labels: Long Poll Read, Speculative Long Poll]
  29. Inside a Bookie
  30. Use Cases: Apache BookKeeper as a Building Block
  31. Projects built on BookKeeper
      ◉ Twitter: Apache DistributedLog
      ◉ Yahoo: Pulsar - Cloud Messaging Service
      ◉ Salesforce Distributed Store
      ◉ Huawei - HDFS NameNode HA
      ◉ HubSpot - WAL
      ◉ Majordodo - Distributed Resource Manager
  32. Apache DistributedLog (Twitter)
  33. Apache DistributedLog [diagram: a log of records ordered from Oldest to Newest, split into Log Segment X, Log Segment X+1, Log Segment X+2, stored in Apache BookKeeper]
  34. Apache DistributedLog [architecture diagram, labels as extracted] MetadataStore Log Segment Store (BK) Cold Storage (HDFS) Log Streams - Abstraction & Naming - Data Management - Efficient Write & Read - Intra-cluster & Geo Replication - Segments - Raw Streams Write Proxy Read Proxy - Ownership Tracking - Batching, Compression Record Cache - Rate Limiting, Quota - - Serving - Applications - Different Consumer models DBs - e.g., Twitter’s Manhattan Deferred RPC (queuing) Self-serve Pub/Sub Stream Computing Cross DC Replication
  35. DistributedLog at Twitter
      ◉ Manhattan Key/Value Store - WAL
      ◉ Durable Deferred RPC - Journal
      ◉ Real-Time Search Indexing - Change Propagation
      ◉ Self-serve Pub/Sub - Message Delivery, Ads Pipeline
      ◉ Stream Computing
          ○ Source & Sink
          ○ Stateful Processing in Heron (coming soon)
      ◉ Reliable Cross Datacenter Replication
  36. Scale: DistributedLog at Twitter
      ◉ 1.5 trillion records/day, 17.5 petabytes/day
      ◉ O(10) thousand streams, O(1) million live ledgers
      ◉ O(10^2) bookies, O(10^3) proxies
      ◉ Record sizes range from 100 bytes to 20 KB and beyond
      ◉ Data is kept from hours to days, even up to a year
      ◉ Replication factor is 3 or 5; 9 or 15 for the global use case
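
The daily totals on slide 36 can be turned into per-second rates with simple arithmetic. This is a back-of-the-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes) and an even load across the day, and the slide does not say whether the byte figure counts replicated or fanned-out data, so the derived average is only indicative.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

records_per_day = 1.5e12         # "1.5 trillion records/day" (from the slide)
bytes_per_day = 17.5e15          # "17.5 petabytes/day", assuming 1 PB = 10**15 bytes

records_per_second = records_per_day / SECONDS_PER_DAY
bytes_per_second = bytes_per_day / SECONDS_PER_DAY
avg_bytes_per_record = bytes_per_day / records_per_day

print(f"{records_per_second / 1e6:.1f} million records/s")
print(f"{bytes_per_second / 1e9:.1f} GB/s")
print(f"{avg_bytes_per_record / 1e3:.1f} KB per record on average")
```

Under these assumptions the totals work out to roughly 17 million records and about 200 GB per second, with an implied average of just under 12 KB per record.
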
  37. DistributedLog Resources
      ◉ Website - https://distributedlog.io
      ◉ Mail List - dev@distributedlog.incubator.apache.org
      ◉ Project Ideas - https://cwiki.apache.org/confluence/display/DL/Project+Ideas
      ◉ Paper - “DistributedLog: A high performance replicated log service” (ICDE 2017)
  38. Yahoo! Pulsar (Cloud Messaging Service)
  39. Yahoo! Pulsar
      ◉ Distributed Pub/Sub Messaging Platform
      ◉ Flexible Messaging Model - Topic and Queue
      ◉ Durable, Low Latency
      ◉ Strong Ordering and Consistency Guarantees
      ◉ Geo Replication
      ◉ Apache BookKeeper as Durable Message Store
  40. Yahoo! Pulsar
  41. Scale: Pulsar at Yahoo!
      ◉ 100 billion messages per day
      ◉ More than 1.4 million topics
      ◉ Avg publish latency across services of less than 5 ms
      ◉ 10+ data centers, cross-region replication
  42. Pulsar Performance
  43. Salesforce Distributed Store
  44. Salesforce Application Storage
      ◉ Store for Persistent WAL, Data and Objects
      ◉ Low, Constant Write Latencies
      ◉ Low, Constant Random Read Latencies
      ◉ Highly Available, Consistent
      ◉ Distributed and Linearly Scalable
      ◉ On Commodity Hardware
  45. Heterogeneous Stores
  46. Community: Roadmap, Releases, Future
  47. Community
      ◉ 7 PMC Members
      ◉ 10+ Committers
      ◉ 20+ Active Contributors
      ◉ 5+ Companies actively using/contributing
          ○ Twitter
          ○ Yahoo!
          ○ Salesforce
          ○ Huawei
          ○ EMC
  48. Release 4.5.0
      ◉ Netty 4 Upgrade - Performance Improvements
      ◉ Security (Authentication & Authorization) Support
      ◉ Explicit LAC
      ◉ Long Poll Read Support
      ◉ Auto Re-replication Improvements
      ◉ ...
  49. Future
      ◉ Scalable Segment Store
          ○ Object, Log, File, Stream, …
      ◉ Long Term Storage
          ○ Disk Scrubber
          ○ Better Lifecycle Management
          ○ …
      ◉ Beyond the limit
          ○ 128 bits support
          ○ Scalable metadata management
  50. Thanks! Any questions? You can find me at
      ◉ @sijieg
      ◉ guosijie@gmail.com
