MongoDB as Message Queue

Talk by Luke Gotszling of AOL/About.me at the April 2012 SV MUG

  1. MongoDB as a Message Queue
     Luke Gotszling, Aol / About.me
     Silicon Valley MongoDB User Group, Big Data Week
     Palo Alto, CA, April 25, 2012
  2. Prior AMQP Usage
     • 3-node RabbitMQ cluster on v1.8; opted to forgo disk persistence for better performance
     • Hard to diagnose cause of failure at scale
  3. At About.me
     • All asynchronous and periodic tasks
     • Short-lived messages
       • No journalling
     • Sharded cluster on v2.0.4 (shard key = queue name)
  4. Benefits
     • Async operations
     • Per message (document) atomicity
     • Batch processes
     • Periodic processes
     • Durability / ability to shard
     • Operational familiarity
  5. AMQP?
     Direct   Topic   Fanout   ?
     AMQP     Push Yes Yes
     Mongo    Regular Poll Sort of* Queue expression
     * Options include passing a message along with an incrementing key or multiple declarations. Added to Kombu in v2.1; reduces performance for non-fanout operations due to additional queries.
  6. To cap or not to cap
     • Capped collections [1]
       • Better performance but limited to single node [2]
       • FIFO
     • Uncapped collections (rest of this presentation)
       • Can shard, lower performance per node
       • FIFO-ish [3], custom ordering available
     [1] http://blog.boxedice.com/2011/04/13/queueing-mongodb-using-mongodb/
         http://blog.boxedice.com/2011/09/28/replacing-rabbitmq-with-mongodb/
     [2] SERVER-211, SERVER-2654
     [3] Only down to 1 second granularity
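Footnote [3] follows from how ObjectIds are laid out: the first 4 bytes hold the creation time in whole seconds, and the remaining bytes come from the generating process and a counter rather than from a global sequence. Sorting on _id therefore orders messages by second, but not reliably within a second across producers. A minimal Python sketch of that effect, using two made-up ObjectId hex strings (hypothetical values, not taken from the talk):

```python
def objectid_seconds(oid_hex):
    """Return the Unix timestamp (seconds) stored in an ObjectId's first 4 bytes."""
    return int(oid_hex[:8], 16)


def same_second(oid_a, oid_b):
    """True when two ObjectIds were generated within the same clock second."""
    return objectid_seconds(oid_a) == objectid_seconds(oid_b)


# Hypothetical ids from two producers that wrote during the same second.
# Layout: 8 hex chars of timestamp, then 16 hex chars of process id and counter.
first_written = "4f978800" + "bbbbbbbbbb" + "000001"
second_written = "4f978800" + "aaaaaaaaaa" + "000001"

# Ordering by _id compares all 12 bytes, so within one second the
# producer-specific bytes decide the order, not the actual write order.
ordered = sorted([first_written, second_written])
```

Here `ordered` puts the message written second ahead of the one written first, which is the "FIFO-ish" behaviour the slide describes.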
  7. Code (mongo)
     • Create: db.messages.insert({ queue: "email", payload: serialized_data })
     • Consume: db.messages.findAndModify({ query: { "queue": "email" }, sort: { "_id": +1 }, remove: true })
     • Index: db.messages.ensureIndex({ queue: 1 })
              db.messages.ensureIndex({ queue: 1, _id: 1 })
  8. Code (Python)
     • Create: self.client.insert({"payload": serialize(message), "queue": queue})
     • Consume: self.client.database.command("findandmodify", "messages", query={"queue": queue}, sort={"_id": pymongo.ASCENDING}, remove=True)
     • Index: col.ensure_index([("queue", 1)])
              col.ensure_index([("queue", 1), ("_id", 1)])
     http://packages.python.org/kombu/
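The slide code targets pymongo 2.1, where insert, ensure_index and the raw findandmodify command were the usual calls. With a current pymongo the same three operations can be written with insert_one, find_one_and_delete and create_index. The sketch below is one way to express them, not the speaker's code; the function names are hypothetical, and `collection` is assumed to be a pymongo Collection (or anything with the same three methods).

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def ensure_queue_index(collection):
    """Create the compound index that backs the consume query."""
    return collection.create_index([("queue", ASCENDING), ("_id", ASCENDING)])


def publish(collection, queue, payload):
    """Insert one message document and return its _id."""
    result = collection.insert_one({"queue": queue, "payload": payload})
    return result.inserted_id


def consume(collection, queue):
    """Atomically remove and return the oldest message, or None if the queue is empty."""
    return collection.find_one_and_delete(
        {"queue": queue},
        sort=[("_id", ASCENDING)],
    )
```

Against a real deployment, `collection` would be something like `pymongo.MongoClient()["app"]["messages"]`, where the database name `app` is an assumption. Because the find and the delete happen in one server-side operation, two consumers cannot both receive the same message.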
  9. Celery Task Creation Benchmarks (Single-Node)
     [Chart: tasks created / s (y-axis, 0 to 5600) against concurrency in processes (x-axis, 1 to 5). Series: RabbitMQ v2.7.1; MongoDB (2.0.4) --nojournal; MongoDB (2.0.4) --journal]
     celery 2.4.5 / kombu 2.0 / pymongo 2.1 / amqplib 1.0.2 / eventlet 0.9.16
  10. Celery Task Consumption Benchmarks (Single-Node)
     [Chart: tasks consumed / s (y-axis, 0 to 2000) against eventlet concurrency (x-axis, 1 to 25). Series: RabbitMQ v2.7.1; MongoDB (2.0.4) --nojournal; MongoDB (2.0.4) --journal]
     celery 2.4.5 / kombu 2.0 / pymongo 2.1 / amqplib 1.0.2 / eventlet 0.9.16
  11. Pros and Cons
     Pros
     • Familiar technology
     • Sharding
     • Durability
     • Lower operational and concurrency overhead
     • Advanced querying (map/reduce etc.)
     Cons
     • Not AMQP
     • Need to poll
     • Performance depends on polling frequency
     • Message consumption is a locking operation
     • Fewer libraries available [1]
     [1] Python has kombu, < v2.1 no fanout support but better async task performance
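The two polling-related cons are commonly softened with a loop that polls eagerly while messages are arriving and backs off while the queue is empty. The sketch below illustrates that pattern under stated assumptions: `fetch_one` and `handle` are hypothetical callables (for example, `fetch_one` could wrap the consume query from the Code slides), and the delay values are illustrative, not figures from the talk.

```python
import time


def poll_messages(fetch_one, handle, min_delay=0.05, max_delay=1.0,
                  max_idle_polls=None, sleep=time.sleep):
    """Poll a queue, doubling the wait after each empty poll.

    fetch_one: callable returning the next message, or None when the queue is empty.
    handle: callable invoked with each message.
    max_idle_polls: stop after this many consecutive empty polls (None = run forever).
    Returns the number of messages handled.
    """
    delay = min_delay
    idle = 0
    handled = 0
    while max_idle_polls is None or idle < max_idle_polls:
        message = fetch_one()
        if message is not None:
            handle(message)
            handled += 1
            idle = 0
            delay = min_delay  # queue is busy: go back to eager polling
            continue
        idle += 1
        sleep(delay)  # queue is empty: wait before the next poll
        delay = min(delay * 2, max_delay)
    return handled
```

A smaller `min_delay` lowers latency for new messages at the cost of more queries against the database; `max_delay` bounds how stale an idle consumer can get.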
  12. Don’t Forget To Shard Your Collections!
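For reference, sharding the message collection on the queue name (the shard key mentioned on the At About.me slide) can be done through the enableSharding and shardCollection admin commands. This is a sketch rather than the speaker's setup: it assumes `client` is a pymongo MongoClient connected to a mongos router, the database name `app` is hypothetical, and `messages` is the collection name used in the Code slides.

```python
def shard_queue_collection(client, database="app", collection="messages"):
    """Enable sharding for the database and shard the collection on the queue name."""
    namespace = "%s.%s" % (database, collection)
    # Both commands are issued against the admin database of a mongos router.
    client.admin.command("enableSharding", database)
    client.admin.command("shardCollection", namespace, key={"queue": 1})
    return namespace
```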
  13. Questions?
     luke@about.me
     about.me/luke
     @lmgtwit
