
High performance queues with Cassandra


Cassandra is widely known as a NoSQL data store. Processing that data, however, usually relies on message queues from a separate messaging provider, which can lead to data inconsistency and adds another infrastructure layer to maintain. Because one of our services already stores all of its data in Cassandra, we built a message queue solution on top of it and gained scalability, high availability, and flexibility along the way. This talk presents that solution.


High performance queues with Cassandra

  1. High performance queues with Cassandra. Mikalai Alimenkou, http://xpinjection.com, @xpinjection
  2. Kiev, Ukraine #евромайдан
  3. How to process data, master? Asynchronously!
  4. Queues usage scenario
  5. More realistic scenario
  6. More realistic scenario
  7. Are you crazy? Cassandra for queues?
  8. So many cool MQ providers
  9. Initial expected load
  10. Some specific requirements (diagram labels: 1, Message Queue, External Service Provider)
  11. Some specific requirements (diagram labels: Message Queue, 1, External Service Provider)
  12. Some specific requirements (diagram labels: 2, Message Queue, External Service Provider)
  13. Some specific requirements (diagram labels: Message Queue, 2, 1, Redeliver after 1 hour, External Service Provider)
  14. Some specific requirements (diagram labels: 2, Message Queue, External Service Provider, Redeliver after 1 hour). Redelivery business logic and hourly usage limits of the external service
  15. Idea came from railways
  16. “Body flow” in regular life
  17. Message batches “station”
      QUEUE: NOW, ALMOST READY, +1 HOUR WAITING, +6 HOURS WAITING, +12 HOURS WAITING
  18. System components: Message
      MESSAGE = REAL REQUEST DATA WITH UNIQUE ID
      Row layout: MESSAGE ID, then 1st FIELD {field data}, 2nd FIELD {field data}, 3rd FIELD {field data}
      ID format: Timestamp (44 bits), Counter (12 bits), Cluster node ID (8 bits)
      ID FORMAT ALLOWS 4096 MESSAGES PER MILLISECOND FROM ONE NODE
  19. System components: Batch
      • Open for at least 1 second
      • Closes if open for > 10 seconds
      • Closes if it has > 100 messages
      Row layout (ascending column ordering): BATCH ID, then 1st MESSAGE ID {opt message data}, 2nd MESSAGE ID {opt message data}, 3rd MESSAGE ID {opt message data}
      ID format: Timestamp (rounded to seconds), Cluster node ID + Batch Type (last 3 digits)
      ID FORMAT REQUIRES BATCH TO BE OPEN FOR > 1 SECOND
  20. System components: Queue
      • Similar to batch
      • Unlimited
      • May have batches with past time
      Row layout (ascending column ordering): QUEUE NAME, then 1st BATCH ID {processed at}, 2nd BATCH ID {processed at}, 3rd BATCH ID {processed at}
  21. System components: Broker
      Diagram labels: BROKER, QUEUE, PROCESSOR, ZOOKEEPER; batches polling, check batch time, lock batch for processing, process batch
      • Natural pre-fetch thanks to batches
      • Easy to control message processing
      • Simple concurrency model
      • Easily scalable across nodes
      • No high load on Cassandra
  22. System components: Processor (diagram label: PROCESSOR)
  23. System components: Processor (diagram labels: PROCESSOR, OK)
  24. System components: Processor (diagram label: PROCESSOR)
  25. System components: Processor
      Diagram labels: PROCESSOR, ANOTHER BATCH; redeliver on failure
      • Tries to process messages as quickly as possible
      • On error, just redelivers the message
      • Messages are processed concurrently
      • Any redelivery business logic is easy to implement
  26. Warnings and benefits
      • Message and batch must be checked before processing
      • Hard to explain “queue” size
      • Separate columns for status tracking of messages
      • Perform correct compaction from time to time
      • Expected load is handled by a single node
      • Everything works on commodity hardware
      • Single storage for all data
      • System is easily scalable and reliable (no message was lost)
  27. Show me the code!
  28. @xpinjection, http://xpinjection.com, mikalai.alimenkou@xpinjection.com
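
The slides give the message ID layout (44-bit timestamp, 12-bit counter, 8-bit cluster node ID) and the batch closing rules, but the transcript does not include the talk's actual code. The following is only a minimal sketch of how those two pieces could look. The field order inside the 64-bit ID, the behaviour when the counter overflows within one millisecond, the precedence of the "open for at least 1 second" rule, and all names (`pack_message_id`, `MessageIdGenerator`, `should_close_batch`) are assumptions made for illustration, not the author's implementation.

```python
import threading
import time

# Bit widths taken from the "System components: Message" slide.
TIMESTAMP_BITS = 44
COUNTER_BITS = 12
NODE_BITS = 8
MAX_COUNTER = (1 << COUNTER_BITS) - 1  # 4095, i.e. 4096 IDs per millisecond per node


def pack_message_id(timestamp_ms, counter, node_id):
    """Pack the fields into one 64-bit integer.

    Assumed order (high to low bits): timestamp, counter, node ID,
    so that IDs from one node sort by creation time.
    """
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 44 bits")
    if not 0 <= counter <= MAX_COUNTER:
        raise ValueError("counter does not fit in 12 bits")
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node ID does not fit in 8 bits")
    return (timestamp_ms << (COUNTER_BITS + NODE_BITS)) | (counter << NODE_BITS) | node_id


def unpack_message_id(message_id):
    """Return (timestamp_ms, counter, node_id) for a packed ID."""
    node_id = message_id & ((1 << NODE_BITS) - 1)
    counter = (message_id >> NODE_BITS) & MAX_COUNTER
    timestamp_ms = message_id >> (COUNTER_BITS + NODE_BITS)
    return timestamp_ms, counter, node_id


class MessageIdGenerator:
    """Thread-safe generator of increasing IDs for a single node (hypothetical helper)."""

    def __init__(self, node_id, clock=lambda: int(time.time() * 1000)):
        self._node_id = node_id
        self._clock = clock  # returns the current time in milliseconds
        self._lock = threading.Lock()
        self._last_ms = -1
        self._counter = 0

    def next_id(self):
        with self._lock:
            # Never go backwards, even if the wall clock does.
            now = max(self._clock(), self._last_ms)
            if now == self._last_ms:
                self._counter += 1
                if self._counter > MAX_COUNTER:
                    # Assumption: on overflow, wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
                    self._counter = 0
            else:
                self._counter = 0
            self._last_ms = now
            return pack_message_id(now, self._counter, self._node_id)


def should_close_batch(age_seconds, message_count):
    """Batch closing rules from the "System components: Batch" slide.

    Assumption: the 1-second minimum wins over the other two rules, because
    the batch ID timestamp is rounded to seconds.
    """
    if age_seconds < 1:
        return False
    return age_seconds > 10 or message_count > 100
```

As a usage example, `unpack_message_id(pack_message_id(1000, 1, 7))` returns `(1000, 1, 7)`. With the timestamp in the high bits, IDs produced by one node grow with time, which fits the ascending column ordering shown on the batch slide.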
