Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

CassieQ: The Distributed Message Queue Built on Cassandra (Anton Kropp, Curalate) | C* Summit 2016

2,729 views

Published on

Building queues on distributed data stores is hard, and long been considered an antipattern. However, with careful consideration and tactics, it is possible to do. CassieQ is an implementation of a distributed queue on Cassandra which supports easy installation, massive data ingest, authentication, a simple to use HTTP based API, and no dependencies other than your already existing Cassandra environment.

About the Speakers
Anton Kropp Senior Software Engineer, Curalate

Anton Kropp is a senior engineer with over 8 years experience building distributed and fault tolerant systems. He has worked at companies big and small (Godaddy, PracticeFusion), and enjoys building frameworks and tooling to make life easier with a penchant for dockerized containers and simple API's. When he's not messing around on his computer he's drinking local Seattle beers, zipping around the city on his electric bike, and hanging out with his wife and dog.

Published in: Software
  • Be the first to comment

CassieQ: The Distributed Message Queue Built on Cassandra (Anton Kropp, Curalate) | C* Summit 2016

  1. 1. Anton Kropp CassieQ: The Distributed Queue Built On Cassandra
  2. 2. © DataStax, All Rights Reserved. Why use queues? • Distribution of work • Decoupling producers/consumers • Reliability 2
  3. 3. © DataStax, All Rights Reserved. Existing Queues • ActiveMQ • RabbitMQ • MSMQ • Kafka • SQS • Azure Queue • others 3
  4. 4. © DataStax, All Rights Reserved. Advantage of a queue on c* • Highly available • Highly distributed • Massive intake • Masterless • Re-use existing data store/operational knowledge 4
  5. 5. © DataStax, All Rights Reserved. 5 But aren’t queues antipatterns?
  6. 6. © DataStax, All Rights Reserved. Issues with queues in C* • Modeling off deletes • Tombstones • Evenly distributing messages? • What is the partition key? • How to synchronize consumers? 6
  7. 7. © DataStax, All Rights Reserved. Existing C* queues • Netflix Astyanax recipe • Cycled time based partitioning • Row based reader lock • Messages put into time shard ordered by insert time • Relies on deletes • Requires low gc_grace_seconds for fast compaction 7
  8. 8. © DataStax, All Rights Reserved. Existing C* queues • Comcast CMB • Uses Redis as actual queue (cheating) • Queues are hashed to affine to same redis server • Cassandra is cold storage backing store • Random partitioning between 0 and 100 8
  9. 9. © DataStax, All Rights Reserved. Missing features • Authentication • Authorization • Statistics • Simple deployment • Requirement on external infrastructure 9
  10. 10. © DataStax, All Rights Reserved. CassieQ • HTTP(s) based API • No locking • Fixed size bucket partitioning • Leverages pointers (kafkaesque) • Message invisibility • Azure Queue/SQS inspired • Docker deployment • Authentication/authorization • Ideally once delivery • Best attempt at FIFO (not guaranteed) 10
  11. 11. © DataStax, All Rights Reserved. 11 docker run –it -p 8080:8080 –p 8081:8081 paradoxical/cassieq dev
  12. 12. © DataStax, All Rights Reserved. CassieQ Queue API 12
  13. 13. © DataStax, All Rights Reserved. CassieQ Admin API 13
  14. 14. © DataStax, All Rights Reserved. CassieQ workflow • Client is authorized on an account • Granular client authorization up to queue level • Client consumes message from queue with message lease (invisibility) • Gets pop receipt • Client acks message with pop receipt • If pop receipt not valid, lease expired • Client can update messages • Update message contents • Renew lease 14
  15. 15. © DataStax, All Rights Reserved. 16 Lets dig inside CassieQ internals
  16. 16. © DataStax, All Rights Reserved. TLDR • Messages partitioned into fixed sized buckets • Pointers to buckets/messages used to track current state • Use of lightweight transactions for atomic actions to avoid locking • Bucketing + pointers eliminates modeling off deletes 17
  17. 17. © DataStax, All Rights Reserved. CassieQ Buckets • Messages stored in fixed sized buckets • Deterministic when full • Easy to reason about • Why not time buckets? • Time bugs suck • Non deterministic • Can miss data due to time overlaps • Messages given monotonic ID • CAS “id” table • Bucket # = monotonicId / bucketSize 18
  18. 18. © DataStax, All Rights Reserved. Pointers to Buckets/Messages • Reader pointer • Tracks which bucket a consumer is on • Repair pointer • Tracks first non-finalized bucket • Invisibility pointer • Tracks first unacked message 19
  19. 19. All 3 pointers point to monotonic id value, potentially in different buckets 1 2 3 4 5 InvisPointer ReaderPointer RepairPointer Pointers to Buckets
  20. 20. © DataStax, All Rights Reserved. Schema 21 CREATE TABLE queue ( account_name text, queuename text, bucket_size int, version int, ... PRIMARY KEY (account_name, queuename) ); CREATE TABLE message ( queueid text, bucket_num bigint, monoton bigint, message text, version int, acked boolean, next_visible_on timestamp, delivery_count int, tag text, created_date timestamp, updated_date timestamp, PRIMARY KEY ((queueid, bucket_num), monoton) ); *queueid=accountName:queueName:version
  21. 21. © DataStax, All Rights Reserved. 22 Reading messages
  22. 22. © DataStax, All Rights Reserved. Pointers to Buckets/Messages • Reader pointer • Tracks which bucket a consumer is on • Repair pointer • Tracks first non-finalized bucket • Invisibility pointer • Tracks first unacked message 23
  23. 23. © DataStax, All Rights Reserved. Reading from a bucket • Read any unacked message in bucket (either FIFO or random) • Consume message (update its internal version + set its invisibility timeout) • Return to consumer 24 1 2 3 4 Bucket 1 Undelivered messages Reader pointer start
  24. 24. 1 2 ? 4 5 Buckets… complications • Once a monoton is generated, it is taken • Even if a message fails to insert the monoton is taken • Buckets are now partially filled! • How to resolve?
  25. 25. 1 2 ? 4 5 6 Bucket 2Bucket 1 Reader Message 3 missing When to move off a bucket? 1. All known messages in the bucket have been delivered at least once 2. All new messages being written in future buckets
  26. 26. 1 2 ? 4 5 6 7 … Bucket 2Bucket 1 Reader @ bucket 2 Message 3 missing Tombstone When to move off a bucket? • Tombstoning (not cassandra tombstoning, naming is hard!) • Bucket is sealed, no more writes • Reader tombstones bucket after its reached
  27. 27. Tombstoning enables us to detect delayed writes 1 2 ? 4 5 6 7 … Bucket 2Bucket 1 Reader @ bucket 2 Message 3 missing Tombstone
  28. 28. © DataStax, All Rights Reserved. 29 Repairing delayed messages
  29. 29. © DataStax, All Rights Reserved. Pointers to Buckets/Messages • Reader pointer • Tracks which bucket a consumer is on • Repair pointer • Tracks first non-finalized bucket • Invisibility pointer • Tracks first unacked message 30
  30. 30. © DataStax, All Rights Reserved. Repairing delayed writes • Scenarios: • Message taking its time writing (still alive, but slow) • Message claimed monoton but is dead • Resolution: • Watch for tombstone in bucket • Wait for repair timeout (30 seconds) • If message shows up, republish • If not, finalize bucket and move to next bucket (message is dead) 31
  31. 31. Repairing delayed writes 1 2 ? 4 5 6 7 … Bucket 2Bucket 1 Reader @ bucket 2 Message 3 missing Tombstone Repair Pointer @ bucket 1
  32. 32. wait 30 seconds…
  33. 33. Repairing delayed writes 1 2 3 4 5 6 7 … Bucket 2Bucket 1 Reader @ bucket 2 + Message 3 Showed up! Tombstone Repair Pointer @ bucket 1 Republished to end
  34. 34. Repairing delayed writes 1 2 3 4 5 6 7 … 3 Bucket 2..Bucket 1 Tombstone Repair Pointer @ bucket 2 Reader @ bucket 2 +
  35. 35. © DataStax, All Rights Reserved. 36 Invisibility and the unhappy path ☹
  36. 36. © DataStax, All Rights Reserved. 37 What is invisibility?
  37. 37. © DataStax, All Rights Reserved. 38 A mechanism for message re-delivery (in a stateless system)
  38. 38. © DataStax, All Rights Reserved. Pointers to Buckets/Messages • Reader pointer • Tracks which bucket a consumer is on • Repair pointer • Tracks first non-finalized bucket • Invisibility pointer • Tracks first unacked message 39
  39. 39. The happy path • Client consumes message • Message is marked as “invisible” with a “re-visibility” timestamp • Client gets pop receipt encapsulating metadata (including version) • Client acks within timeframe • Message marked as consumed if version is the same
  40. 40. The unhappy path :( • Client doesn’t ack within timeframe • Message needs to be redelivered • Subsequent reads checks the invis pointer for visibility • If max delivers exceeded, push to optional DLQ • Else redeliver!
  41. 41. © DataStax, All Rights Reserved. The unhappy path :( 42 1 2 3 4 5 6 7 … Bucket 1 Invisibility pointer Reader Bucket pointer
  42. 42. © DataStax, All Rights Reserved. The unhappy path :( 43 1 2 3 4 5 6 7 … Bucket 1 Invisibility pointer Reader Bucket pointer
  43. 43. © DataStax, All Rights Reserved. The unhappy path :( 44 1 2 3 4 5* 6 7 … Bucket 1 Invisibility pointer Reader Bucket pointer ackack ack out expired
  44. 44. Long term invisibility is bad • InvisPointer WILL NOT move past a unacked message • Invisible messages can block other invisible messages • Possible to starve future messages
  45. 45. © DataStax, All Rights Reserved. The unhappy path :( 46 1 2 3 4 5 6 7 … Bucket 1 Invisibility pointer Reader Bucket pointer ackack ack DLQ ou t
  46. 46. © DataStax, All Rights Reserved. Conclusion • Building a queue on c* is hard • Limited by performance of lightweight transactions and underlying c* choices • compaction strategies, cluster usage, etc • Need to make trade off design choices • CassieQ is used in production but in not stressed under highly contentious scenarios 47
  47. 47. Questions? or feedback/thoughts/visceral reactions Contribute to the antipattern @ paradoxical.io https://github.com/paradoxical-io/cassieq

×