%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs

Slides of the RailsConf 2009 session

Discover how parallel execution can be used to batch process large amounts of data, learn how to use queues to distribute workload and coordinate processes, and increase throughput on systems with high latency. Have fun with EventMachine, AMQP, and RabbitMQ, and get rid of that every-5-minutes cron job.

Transcript

  • 1. %w(map reduce).first A Tale About Rabbits, Latency and Slim Crontabs Paolo Negri thanks to: www.autoscout24.de
  • 2. Summary:
  • 3. Map http://www.matthiasdittrich.com/projekte/dliste/visualisations/index.html
  • 4. rabbitMQ http://www.flickr.com/photos/myxi/448253580
  • 5. crontab diet http://www.flickr.com/photos/tim_norris/2600843131/
  • 6. Map Reduce “Programming model for processing and generating large data sets” (Google)
  • 7. Map Reduce “Map” step: the master node takes the input, chops it up into smaller sub-problems, and distributes those to worker nodes. (Wikipedia)
  • 8. The problem Invoicing our clients
  • 9. Is it as simple as... clients.map do |client| client.invoice end
  • 10. No! Because the process is: • distributed • concurrent
  • 11. Problems: • How many nodes? • How many workers? • Distribution mechanism to feed the workers?
  • 12. What about queuing? • the master node takes the input, chops it up into smaller sub-problems, and publishes them in a queue • workers independently consume the content of the queue
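The queuing idea on slide 12 can be sketched in a single Ruby process. This is only an illustration under stated assumptions: Ruby's built-in thread-safe `Queue` stands in for the message broker, threads stand in for worker nodes, and `Client`, `invoice`, and `run_invoicing` are hypothetical names that do not come from the talk.

```ruby
# In-process stand-in for the master/queue/worker flow of slide 12.
# Queue replaces the message broker; all names here are hypothetical.

Client = Struct.new(:name) do
  # Placeholder for the real (slow) invoicing work.
  def invoice
    "invoice for #{name}"
  end
end

def run_invoicing(clients, worker_count: 3)
  jobs    = Queue.new # sub-problems published by the master
  results = Queue.new # finished invoices reported by the workers

  # Master: chop the input into one job per client and publish each job.
  clients.each { |client| jobs << client }
  # Closing the queue makes pop return nil once the queue is drained.
  jobs.close

  # Workers: each one independently consumes jobs until none are left.
  workers = Array.new(worker_count) do
    Thread.new do
      while (client = jobs.pop)
        results << client.invoice
      end
    end
  end
  workers.each(&:join)

  # Collect whatever the workers produced (order is not guaranteed).
  Array.new(results.size) { results.pop }
end

clients = %w[acme globex initech].map { |name| Client.new(name) }
puts run_invoicing(clients).sort
```

The master never decides which worker gets which client. Changing `worker_count` changes the degree of parallelism without touching the publishing side, which is the decoupling the queue is meant to provide.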
  • 13. Here comes • RabbitMQ is an implementation of AMQP, the emerging standard for high performance enterprise messaging • It’s open source • Can be used to manage queues • Written in Erlang
  • 14. Erlang? • Erlang is a general-purpose concurrent programming language designed by Ericsson • distributed • fault tolerant • soft real time • high availability
  • 15. Install it • sudo apt-get install rabbitmq-server • sudo gem install tmm1-amqp
  • 16. Do it - master node
  • 17. Use it - worker node
  • 18. What and where [diagram: Master (Ruby) and two Workers (Ruby), each connected over TCP/IP to RabbitMQ (Erlang)]
  • 19. Get for free • Decoupling master/worker • Workers take care of feeding themselves • Flexible number of workers
  • 20. Get for free • RabbitMQ can be clustered • Support of message acknowledgement • Queues can be persisted on disk (at a price) • low latency
  • 21. Queue • Is an actual entity • has a name • can be inspected and managed
  • 22. EventMachine
  • 23. EventMachine Is an implementation of Reactor Pattern • Non blocking IO and lightweight concurrency • eliminate the complexities of high- performance threaded network programming
  • 24. EventMachine
  • 25. EventMachine amqp gem is built on EventMachine => you’re in a context where you can leverage concurrent programming
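To make the Reactor pattern from slide 23 more concrete, here is a toy reactor built only on Ruby's standard `IO.select`. It is a sketch and not how EventMachine is implemented: `TinyReactor` and `watch` are hypothetical names, and pipes stand in for network sockets.

```ruby
# Toy reactor: one thread, one loop, callbacks fired when an IO is readable.
# Not EventMachine's implementation; TinyReactor is a hypothetical name.

class TinyReactor
  def initialize
    @handlers = {} # IO object => block to call with the data read from it
  end

  # Register interest in an IO; the block runs whenever data arrives.
  def watch(io, &handler)
    @handlers[io] = handler
  end

  # Loop until every watched IO has reached end of file.
  def run
    until @handlers.empty?
      # Block until at least one watched IO has something to read.
      ready, = IO.select(@handlers.keys)
      ready.each do |io|
        data = io.read_nonblock(4096, exception: false)
        case data
        when :wait_readable
          next # spurious wakeup, try again on the next loop turn
        when nil
          # End of file: stop watching this IO.
          @handlers.delete(io)
          io.close
        else
          @handlers[io].call(data)
        end
      end
    end
  end
end

received = []
reactor  = TinyReactor.new

2.times do |i|
  reader, writer = IO.pipe
  reactor.watch(reader) { |data| received << data }
  writer.write("message #{i}")
  writer.close
end

reactor.run
puts received.sort
```

A single thread serves both pipes because it only reads from an IO once `IO.select` reports data is waiting. That is the non-blocking, callback-driven style the slide describes, without any thread synchronization.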
  • 26. EM - Deferrables
  • 27. EM - Deferrables “The Deferrable pattern allows you to specify any number of Ruby code blocks that will be executed at some future time when the status of the Deferrable object changes.”
  • 28. EM - Deferrables
  • 29. EM - Deferrables
  • 30. Deferrables [timeline diagram: ClientStat and Arrears plotted against Time, once without deferrables and once with deferrables]
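The Deferrable pattern quoted on slide 27 can be illustrated with a small pure-Ruby class. This is a simplified stand-in and not EventMachine's code: `TinyDeferrable` is a hypothetical name, it borrows the `callback`, `errback`, `succeed`, and `fail` method names from `EventMachine::Deferrable`, and unlike the real module it assumes the status is settled only once.

```ruby
# Minimal stand-in for the Deferrable pattern described on slide 27.
# Not EventMachine's code; TinyDeferrable is a hypothetical name.

class TinyDeferrable
  def initialize
    @status    = :pending
    @args      = []
    @callbacks = []
    @errbacks  = []
  end

  # Run the block on success: now if already succeeded, otherwise later.
  def callback(&block)
    register(block, :succeeded, @callbacks)
  end

  # Run the block on failure, with the same now-or-later rule.
  def errback(&block)
    register(block, :failed, @errbacks)
  end

  # Change the status to succeeded and fire the stored callbacks.
  def succeed(*args)
    settle(:succeeded, args, @callbacks)
  end

  # Change the status to failed and fire the stored errbacks.
  def fail(*args)
    settle(:failed, args, @errbacks)
  end

  private

  def register(block, wanted_status, list)
    if @status == wanted_status
      block.call(*@args)
    elsif @status == :pending
      list << block
    end
    self
  end

  # Simplification: only the first status change counts.
  def settle(status, args, blocks)
    return self unless @status == :pending

    @status = status
    @args   = args
    blocks.each { |block| block.call(*args) }
    @callbacks.clear
    @errbacks.clear
    self
  end
end

log  = []
stat = TinyDeferrable.new
stat.callback { |total| log << "client stat: #{total}" }
stat.errback  { |reason| log << "failed: #{reason}" }
log << "request sent"
stat.succeed(42)
stat.callback { |total| log << "late callback: #{total}" }
puts log
```

The caller attaches its blocks and moves on. The blocks run only when the result arrives, so several slow requests can be in flight at once instead of being awaited one after another.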
  • 31. Achieved so far • Easy distribution of tasks • Architecture that supports arbitrary number of workers (and masters) • Concurrency within the single worker
  • 32. More rabbits Analogy with email system
  • 33. Multicasting - producer
  • 34. Multicasting - consumer
  • 35. Multicasting [diagram: Publisher hands msg A to the Exchange; Queue1, Queue2, and Queue3 feed Cons1, Cons2, and Cons3]
  • 36. Multicasting [diagram: a copy of msg A sits in each of Queue1, Queue2, and Queue3, ready for Cons1, Cons2, and Cons3]
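The multicasting diagrams on slides 35 and 36 show one published message ending up in every queue. The sketch below models that behaviour in memory. It assumes the exchange behaves like an AMQP fanout exchange (every bound queue gets a copy), which the transcript does not state explicitly. `ToyFanoutExchange` is a hypothetical name, and none of this is the amqp gem's API.

```ruby
# In-memory model of the multicasting shown on slides 35 and 36.
# Assumes fanout behaviour; not the amqp gem API. Names are hypothetical.

class ToyFanoutExchange
  def initialize
    @queues = []
  end

  # Bind a queue so it receives a copy of every published message.
  def bind(queue)
    @queues << queue
    self
  end

  # Deliver an independent copy of the message to each bound queue.
  def publish(message)
    @queues.each { |queue| queue << message.dup }
    self
  end
end

exchange = ToyFanoutExchange.new
queues   = Array.new(3) { Queue.new }
queues.each { |queue| exchange.bind(queue) }

# The publisher only knows the exchange, never the individual queues.
exchange.publish("msg A")

queues.each_with_index do |queue, index|
  puts "Cons#{index + 1} got #{queue.pop}"
end
```

Because the publisher addresses the exchange rather than a queue, a new consumer can be added by binding one more queue, with no change on the publishing side.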
  • 37. Not only queues then Use messages distribution to build the nervous system of your app • communication across hosts, heterogeneous systems • low latency • clustering
  • 38. Where to start? crontab -l 5 * * * * bin/do_the_quick_thing.rb 0 2 * * * bin/do_the_scary_thing.rb
  • 39. Cron • Simple • Reliable • No maintenance • Status is not explicit • Locking? • Fire and forget
  • 40. Queue • Distributed easily • Reliable • Can be inspected • Add/decrease workers • Makes you think! • Adds more complexity
  • 41. On github - Projects • eventmachine/eventmachine • tmm1/amqp • macournoyer/thin • famoseagle/carrot • celldee/bunny • ezmobius/nanite
  • 42. Q&A ?
  • 43. Thanks! Paolo Negri / hungryblank