Inter-Process/Task Communication With Message Queues


Introduction to message queues and an overview of some of the options available to Python programmers. Presented at PyOhio 2009.


  1. Inter-Process/Task Communication With Message Queues. William McVey, PyOhio, July 26, 2009
  2. Intro
     - How I found a solution that works well for me
     - There is a LOT of material out there that isn't covered
     - Not necessarily the ideal solution, but I learned a lot along the way
  3. Description of the Problem
     - HPC Controller: tries to discover new ways web browsers (and other client software) get exploited "in the wild" and ensures that my employer's mitigations for these threats are effective.
     - A Django-based data management application
     - Invokes the long-running Capture-HPC Java application
     - Collects and processes large amounts of data
  4. Architecture
  5. Key Difficulties
     - Long-running processes under short-lived web requests.
     - My initial (naive) approach:
       - Spawn detached processes to handle jobs
       - Process coordination via database
  6. Lesson learned
     Do not screw with Apache's process model.
  7. Rediscovering Queues
     - Basic queue overview
     - Standard lib:
       - Queue - mostly for thread pool management
       - collections.deque - provides efficient access to both endpoints of a list structure
       - heapq - ordered queues (e.g. priority queue)
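
The three standard library options on this slide can be sketched side by side. This is a minimal illustration rather than anything shown in the talk: the function names are made up for the example, and it uses the Python 3 module name `queue` (the module was called `Queue` in the Python 2 releases current in 2009).

```python
import heapq
import queue  # named `Queue` in Python 2
import threading
from collections import deque


def drain_with_worker(items):
    """Push items through a thread-safe queue.Queue and collect them in a worker thread."""
    q = queue.Queue()
    seen = []

    def worker():
        while True:
            item = q.get()
            if item is None:  # sentinel: no more work
                break
            seen.append(item)

    t = threading.Thread(target=worker)
    t.start()
    for item in items:
        q.put(item)
    q.put(None)
    t.join()
    return seen


def deque_both_ends():
    """collections.deque: cheap appends and pops at either end."""
    d = deque([2, 3])
    d.appendleft(1)
    d.append(4)
    return d.popleft(), d.pop(), list(d)


def priority_order(jobs):
    """heapq: pop (priority, name) pairs, lowest priority number first."""
    heap = []
    for priority, name in jobs:
        heapq.heappush(heap, (priority, name))
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

All three live inside a single process, which is why they do not help with the web-request problem described earlier.
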
  8. Generic message broker
     - Message brokers can provide:
       - Simple queue-like dataflow
       - Simplified interprocess communication with message routing
       - More effective scaling
       - Better resilience to failure
  9. beanstalkd/beanstalkc
     - beanstalkd: A very simple text-based protocol with a simple yet powerful set of queue management primitives.
     - beanstalkc: A simple yet powerful client API that is well documented.
     - [demo here]
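
The demo itself is not part of the transcript. As a stand-in, here is a minimal sketch of a producer and a consumer written against the beanstalkc connection interface (`use`, `put`, `watch`, `reserve`, and the job methods `delete` and `bury`). The function names and the tube name `hpc_jobs` are hypothetical, chosen only for illustration.

```python
def enqueue_job(conn, url, tube="hpc_jobs"):
    """Producer side, e.g. called from a short-lived web request."""
    conn.use(tube)        # select the tube that put() writes to
    return conn.put(url)  # beanstalkd assigns and returns a job id


def work_once(conn, handler, tube="hpc_jobs", timeout=0):
    """Consumer side: reserve one job, run the handler, then delete or bury it."""
    conn.watch(tube)
    job = conn.reserve(timeout=timeout)  # None if nothing arrived in time
    if job is None:
        return False
    try:
        handler(job.body)
    except Exception:
        job.bury()  # park the failed job for later inspection
        raise
    job.delete()    # success: remove the job from the queue
    return True
```

With a beanstalkd server running, `conn` would be something like `beanstalkc.Connection(host="localhost", port=11300)`; the host and port here are assumptions about a local default install. The web request only calls `enqueue_job`, while a separate long-lived worker process loops over `work_once`.
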
  10. The need for something more
      - Beanstalkd continues to be effective for hpc_controller. A new project came along and I ran into some issues:
        - Lack of authentication
        - Lack of message integrity/confidentiality
        - Lack of persistent messages
  11. memcacheq
      - Memcacheq uses the memcachedb protocol to implement queues: a "cache" lookup of a queue name pops a value from the queue.
      - Pro:
        - Fast, lightweight, and scales well.
        - Persistent messages across reboots
      - Con:
        - Doesn't support either blocking or callback interfaces
        - Have to poll to see if you have messages
        - Didn't address authentication requirement
      - [demo here]
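
The polling drawback can be made concrete with a small sketch. It assumes only what the slide states: a `get` on the queue name pops one message, and an empty queue returns nothing. The function name, the `max_idle_polls` parameter, and the choice of client are assumptions for illustration.

```python
import time


def poll_queue(client, queue_name, handle, interval=1.0, max_idle_polls=None):
    """Poll a queue: each get() pops one message, or returns None when the queue is empty."""
    idle = 0
    handled = 0
    while max_idle_polls is None or idle < max_idle_polls:
        message = client.get(queue_name)
        if message is None:
            idle += 1
            time.sleep(interval)  # nothing waiting: sleep, then ask again
            continue
        idle = 0
        handle(message)
        handled += 1
    return handled
```

Any memcache-style client with a `get` method should fit, for example a python-memcached `memcache.Client` pointed at the memcacheq server, with `client.set(queue_name, payload)` on the producer side. The `interval` setting is the trade-off that polling forces: a short interval wastes round trips, a long one adds latency.
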
  12. AMQP
      - Advanced Message Queuing Protocol (AMQP): an open protocol layer for message queues.
      - Pro:
        - A more powerful message routing capability
        - TLS (aka SSL) as part of the protocol spec
        - A variety of broker implementations
      - Con:
        - More complex
  13. AMQP
  14. AMQP Message Routing
      Image from: Messaging Tutorial - AMQP Programming Tutorial for C++, Java, Python, and C#. Copyright © 2008 Red Hat, Inc. Under the Open Publication License.
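
Since the routing diagram did not survive in the transcript, here is a toy in-memory model of the idea: publishers send to an exchange, queues are bound to the exchange with a routing key, and the exchange copies each message into the matching queues. It only mimics direct (exact match) routing, is not a real AMQP client, and the class and method names are invented for illustration.

```python
from collections import defaultdict, deque


class DirectExchange:
    """Toy stand-in for an AMQP direct exchange: exact match on the routing key."""

    def __init__(self, name):
        self.name = name
        self.bindings = defaultdict(list)  # routing key -> list of bound queues

    def bind(self, queue, routing_key):
        """Bind a queue so that messages published with routing_key reach it."""
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, body):
        """Copy the message into every queue bound with this key; return the delivery count."""
        targets = self.bindings.get(routing_key, [])
        for queue in targets:
            queue.append(body)
        return len(targets)


# Names borrowed from the carrot sample in the deck.
sorting_room = DirectExchange("sorting_room")
po_box = deque()
sorting_room.bind(po_box, "jason")
sorting_room.publish("jason", {"My message": ["foo", "bar", "baz"]})
```

The point of the indirection is that the publisher never names a queue; it only names an exchange and a routing key, and the bindings decide who receives the message.
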
  15. ØMQ
      - High performance messaging broker which can speak AMQP, or you can use its own set of Python bindings to communicate via the library code.
      - Pro:
        - More flexible set of possible topologies (including brokerless/peer to peer, directory referral, and more).
      - Con:
        - Misguided 'fail fast' implementation within the library
  16. RabbitMQ
      - RabbitMQ is conformant to the AMQP spec and provided the features I needed:
        - TLS protected communication
        - Authentication / Authorization
        - High reliability
        - Persistent messages
      - Broker is implemented in Erlang, but implementation doesn't matter since client side has py-amqplib.
  17. amqplib / carrot
      py-amqplib is a client library around the AMQP protocol. Fairly low level for my needs though, so a little digging found carrot.
  18. carrot sample
      >>> from carrot.messaging import Publisher, Consumer
      >>> class PostOfficePublisher(Publisher):
      ...     exchange = "sorting_room"
      ...     routing_key = "jason"
      >>> class PostOfficeConsumer(Consumer):
      ...     queue = "po_box"
      ...     exchange = "sorting_room"
      ...     routing_key = "jason"
      ...
      ...     def receive(self, message_data, message):
      ...         """Called when we receive a message."""
      ...         print("Received: %s" % message_data)
  19. carrot sample
      >>> from ConfigParser import ConfigParser
      >>> config = ConfigParser()
      >>> config.read("application.ini")
      >>> from carrot.connection import AMQPConnection
      >>> amqpconn = AMQPConnection(
      ...     hostname=config.get("broker", "host"),
      ...     port=config.get("broker", "port"),
      ...     userid=config.get("broker", "userid"),
      ...     password=config.get("broker", "password"),
      ...     vhost=config.get("broker", "vhost"))
      >>> PostOfficePublisher(connection=amqpconn).send(
      ...     {"My message": ["foo", "bar", "baz"]})
      >>> PostOfficeConsumer(connection=amqpconn).next()
      Received: {"My message": ["foo", "bar", "baz"]}
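
The slide reads its broker settings from `application.ini` but does not show the file. The sketch below guesses at its shape from the `config.get("broker", ...)` calls: a `[broker]` section with `host`, `port`, `userid`, `password`, and `vhost` keys. Every value is a placeholder, `broker_settings` is a hypothetical helper, and the module is imported under its Python 3 name `configparser`.

```python
import configparser  # spelled `ConfigParser` in Python 2, as on the slide

# Placeholder values only; the real file from the talk is not in the transcript.
SAMPLE_INI = """\
[broker]
host = localhost
port = 5672
userid = guest
password = guest
vhost = /
"""


def broker_settings(ini_text):
    """Parse the [broker] section into keyword arguments for a connection class."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    return {
        "hostname": config.get("broker", "host"),
        "port": config.getint("broker", "port"),  # getint() avoids passing the port as a string
        "userid": config.get("broker", "userid"),
        "password": config.get("broker", "password"),
        "vhost": config.get("broker", "vhost"),
    }
```

Keeping credentials in a config file rather than in the source is the design choice the slide is quietly making; the resulting dict could be splatted into a connection constructor with `**broker_settings(...)`.
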
  20. multiprocessing
      - Part of the Python 2.6 standard library. Main intent is to provide a process alternative to the threading library. Provides some process coordination facilities, including a Queue object and a network aware interprocess Manager object.
      - Pro:
        - Part of standard library (2.6 and beyond)
      - Con:
        - Pretty low level
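
A minimal sketch of the Queue side of `multiprocessing`, assuming a simple fan-out of numbers to worker processes. The function names and the `None` sentinel convention are choices made for this example, not something taken from the talk.

```python
import multiprocessing

STOP = None  # sentinel telling a worker to exit


def square_worker(tasks, results):
    """Pull numbers off `tasks` until the sentinel arrives; push squares onto `results`."""
    while True:
        n = tasks.get()
        if n is STOP:
            break
        results.put(n * n)


def run_pool(numbers, n_workers=2):
    """Fan a list of numbers out to worker processes over multiprocessing.Queue objects."""
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=square_worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(STOP)  # one sentinel per worker
    # Drain results before join(): a process that has put items on a queue
    # may not exit until those items have been consumed.
    squares = sorted(results.get() for _ in numbers)
    for w in workers:
        w.join()
    return squares
```

`run_pool` should be called from under an `if __name__ == "__main__":` guard so that platforms which start workers by re-importing the main module do not spawn processes recursively. The "pretty low level" complaint shows up here: sentinels, draining, and joining are all left to the caller.
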
  21. In Summary
      - I like beanstalkc.
      - I like AMQP (specifically RabbitMQ) along with the carrot API
      - Memcacheq would work well if all you need to do is cache jobs until you can process them in batch
      - Multiprocessing is worth a look
      - I've only scratched the surface (Kamaelia, sprinkle/STOMP, etc.)