%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs


Slides of the RailsConf 2009 session

Discover how parallel execution can be used to batch process large amounts of data, learn how to use queues to distribute workload and coordinate processes, and increase throughput on systems with high latency. Have fun with EventMachine, AMQP and RabbitMQ, and get rid of that every-5-minutes cron job.


  1. %w(map reduce).first A Tale About Rabbits, Latency and Slim Crontabs Paolo Negri thanks to: www.autoscout24.de
  2. Summary:
  3. Map http://www.matthiasdittrich.com/projekte/dliste/visualisations/index.html
  4. rabbitMQ http://www.flickr.com/photos/myxi/448253580
  5. crontab diet http://www.flickr.com/photos/tim_norris/2600843131/
  6. Map Reduce “Programming model for processing and generating large data sets” (Google)
  7. Map Reduce "Map" step: the master node takes the input, chops it up into smaller sub-problems, and distributes those to worker nodes. (Wikipedia)
  8. The problem Invoicing our clients
  9. Is it as simple as... clients.map do |client| client.invoice end
  10. No! Because the process is: • distributed • concurrent
  11. Problems: • How many nodes? • How many workers? • Distribution mechanism to feed the workers?
  12. What about queuing? • the master node takes the input, chops it up into smaller sub-problems, and publishes them in a queue • workers independently consume the content of the queue
  13. Here comes RabbitMQ • RabbitMQ is an implementation of AMQP, the emerging standard for high performance enterprise messaging • It's open source • Can be used to manage queues • Written in Erlang
  14. Erlang? • Erlang is a general-purpose concurrent programming language designed by Ericsson • distributed • fault tolerant • soft real time • high availability
  15. Install it • sudo apt-get install rabbitmq-server • sudo gem install tmm1-amqp
  16. Do it - master node
  17. Use it - worker node
  18. What and where (diagram) Master (ruby) -> TCP/IP -> RabbitMQ (Erlang) -> TCP/IP -> Worker (ruby), Worker (ruby)
  19. Get for free • Decoupling master/worker • Workers take care of feeding themselves • Flexible number of workers
  20. Get for free • RabbitMQ can be clustered • Support of message acknowledgement • Queues can be persisted on disk (at a price) • low latency
  21. Queue • Is an actual entity • has a name • can be inspected and managed
  22. EventMachine
  23. EventMachine is an implementation of the Reactor pattern • Non-blocking IO and lightweight concurrency • eliminates the complexities of high-performance threaded network programming
  24. EventMachine
  25. EventMachine amqp gem is built on EventMachine => you’re in a context where you can leverage concurrent programming
  26. EM - Deferrables
  27. EM - Deferrables “The Deferrable pattern allows you to specify any number of Ruby code blocks that will be executed at some future time when the status of the Deferrable object changes”
  28. EM - Deferrables
  29. EM - Deferrables
  30. Deferrables (diagram) Timeline of the ClientStat and Arrears tasks without deferrables and with deferrables; horizontal axis: Time
  31. Achieved so far • Easy distribution of tasks • Architecture that supports an arbitrary number of workers (and masters) • Concurrency within a single worker
  32. More rabbits Analogy with email system
  33. Multicasting - producer
  34. Multicasting - consumer
  35. Multicasting (diagram) Publisher -> Exchange, holding msg A; Exchange bound to Queue1 -> Cons1, Queue2 -> Cons2, Queue3 -> Cons3
  36. Multicasting (diagram) Publisher -> Exchange; a copy of msg A in each of Queue1 -> Cons1, Queue2 -> Cons2, Queue3 -> Cons3
  37. Not only queues then Use message distribution to build the nervous system of your app • communication across hosts, heterogeneous systems • low latency • clustering
  38. Where to start? crontab -l 5 * * * * bin/do_the_quick_thing.rb 0 2 * * * bin/do_the_scary_thing.rb
  39. Cron • Simple • Reliable • No maintenance • Status is not explicit • Locking? • Fire and forget
  40. Queue • Distributed easily • Reliable • Can be inspected • Add/remove workers • Makes you think! • Adds more complexity
  41. On github - Projects • eventmachine/eventmachine • tmm1/amqp • macournoyer/thin • famoseagle/carrot • celldee/bunny • ezmobius/nanite
  42. Q&A ?
  43. Thanks! Paolo Negri / hungryblank
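
The code for the master and worker slides (16 and 17) did not survive in this transcript. As a rough illustration of the idea from slide 12, where the master publishes sub-problems to a queue and workers consume them independently, here is a minimal in-process sketch. It is an assumption-laden stand-in: Ruby's built-in `Queue` and threads take the place of RabbitMQ and separate worker processes, and `Client`, `invoice` and `run_invoicing` are hypothetical names, not code from the talk.

```ruby
# In-process stand-in for the master/worker split.
# Assumption: Ruby's Queue replaces RabbitMQ, threads replace worker processes.

# Hypothetical client; the real invoicing step would hit a database or a remote service.
Client = Struct.new(:name) do
  def invoice
    "invoice for #{name}"
  end
end

def run_invoicing(clients, worker_count)
  jobs = Queue.new     # plays the role of the RabbitMQ queue
  results = Queue.new  # collects finished invoices

  # Master: chop the input into one job per client and publish each job.
  clients.each { |client| jobs << client }
  # One stop marker per worker so that every worker terminates.
  worker_count.times { jobs << :stop }

  workers = Array.new(worker_count) do
    Thread.new do
      # Workers feed themselves: each pops the next job as soon as it is free.
      while (job = jobs.pop) != :stop
        results << job.invoice
      end
    end
  end
  workers.each(&:join)

  # Completion order depends on scheduling, so sort for a stable result.
  Array.new(results.size) { results.pop }.sort
end

clients = %w[acme globex initech].map { |name| Client.new(name) }
puts run_invoicing(clients, 2)
```

Changing `worker_count` does not touch the master side at all, which is the decoupling listed on slide 19.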
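
The Deferrable slides (26 to 30) also lost their code. The quote on slide 27 can be illustrated with a small pure-Ruby sketch: blocks are registered first and run later, when the status of the object changes. This is a simplified stand-in written for this transcript, not EventMachine's implementation; the class name `TinyDeferrable` and the helper `deferrable_demo` are hypothetical, and only the method names `callback`, `errback`, `succeed` and `fail` are borrowed from the pattern.

```ruby
# Simplified stand-in for the Deferrable idea; NOT EventMachine's own class.
class TinyDeferrable
  def initialize
    @status = :unknown
    @args = []
    @callbacks = []
    @errbacks = []
  end

  # Register a block to run once the status becomes :succeeded.
  def callback(&block)
    register(block, :succeeded, @callbacks)
  end

  # Register a block to run once the status becomes :failed.
  def errback(&block)
    register(block, :failed, @errbacks)
  end

  def succeed(*args)
    settle(:succeeded, args, @callbacks)
  end

  def fail(*args)
    settle(:failed, args, @errbacks)
  end

  private

  def register(block, wanted_status, pending)
    if @status == wanted_status
      block.call(*@args)   # already settled: run immediately
    elsif @status == :unknown
      pending << block     # not settled yet: keep for later
    end
    self
  end

  def settle(status, args, blocks)
    return self unless @status == :unknown  # the status changes only once
    @status = status
    @args = args
    blocks.each { |block| block.call(*args) }
    @callbacks.clear
    @errbacks.clear
    self
  end
end

# Hypothetical usage, named after the ClientStat task on slide 30.
def deferrable_demo
  log = []
  client_stat = TinyDeferrable.new
  client_stat.callback { |total| log << "stat: #{total}" }
  client_stat.errback { |reason| log << "error: #{reason}" }
  log << "registered"
  client_stat.succeed(42)
  client_stat.callback { |total| log << "late: #{total}" }
  log
end

puts deferrable_demo
```

In a reactor loop the call to `succeed` would come from an IO event rather than from the next line, which is what lets a single worker overlap several slow operations.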
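
The multicasting slides (33 to 36) show one published message ending up in every queue bound to an exchange. A toy model of that behaviour, assuming a fanout-style exchange, could look like the following. `ToyFanoutExchange`, `bind` and `fanout_demo` are illustrative names invented here, and plain arrays stand in for broker queues; none of this is the amqp gem's API.

```ruby
# Toy model of a fanout-style exchange; arrays stand in for broker queues.
class ToyFanoutExchange
  def initialize
    @queues = {}
  end

  # Binding creates the named queue if needed and returns it.
  def bind(queue_name)
    @queues[queue_name] ||= []
  end

  # Every bound queue receives its own copy of the message.
  def publish(message)
    @queues.each_value { |queue| queue << message.dup }
    self
  end
end

def fanout_demo
  exchange = ToyFanoutExchange.new
  queues = %w[queue1 queue2 queue3].map { |name| exchange.bind(name) }
  exchange.publish("msg A")
  queues
end

fanout_demo.each_with_index do |queue, index|
  puts "Queue#{index + 1}: #{queue.inspect}"
end
```

The publisher only knows the exchange, so consumers can be added by binding another queue without changing the publishing side.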