Concurrent Programming with Ruby and Tuple Spaces


Ruby threads are limited by the Global Interpreter Lock, so the best way to do parallel computing with Ruby is to use multiple processes. But how do you get these processes to communicate?

This session will provide some strategies for handling multi-process communication in Ruby, with a focus on the use of TupleSpaces. A TupleSpace provides a repository of tuples that can be accessed concurrently to implement a Blackboard system. Ruby ships with a built-in implementation of a TupleSpace with the Rinda library.

During the session, Luc will demonstrate how to use Rinda and will highlight other libraries/projects that facilitate interprocess communication and parallel computing in Ruby.

Published in: Technology, Education


  1. 1. Concurrent Programming with Ruby and Tuple Spaces Luc Castera Founder / messagepub.com
  2. 2. The Free Lunch is Over: A Fundamental Turn Toward Concurrency in Software. “The major processor manufacturers and architectures have run out of room with most of their traditional approaches to boosting CPU performance. Instead of driving clock speeds and straight-line instruction throughput ever higher, they are instead turning en masse to hyperthreading and multicore architectures. […] And that puts us at a fundamental turning point in software development, at least for the next few years...” – Herb Sutter, March 2005. Source: http://www.gotw.ca/publications/concurrency-ddj.htm
  3. 3. Outline
      1. The Problem with Ruby Threads
      2. Multiple Ruby Processes
      3. Inter-process Communication with TupleSpaces
  4. 4. PART 1 The Problem With Threads A closer look at the Ruby threading model
  5. 5. 3 Types of Threading Models (kernel threads : user-space threads): 1 : N, 1 : 1, M : N
  7. 7. 1 : N -> Green Threads: one kernel thread for N user threads, aka “lightweight threads”
  8. 9. 10 ms
  16. 17. RUBY 1.8
  17. 18. Pros and Cons
      Pros:
      - Thread creation, execution, and cleanup are cheap
      - Lots of threads can be created
      Cons:
      - Not really parallel, because the kernel scheduler doesn't know about the threads and can't schedule them across CPUs or take advantage of SMP
      - A blocking I/O operation can block all green threads
        - Example: C extension
        - Example: mysql gem (solution: NeverBlock mysqlplus)
  21. 22. blocking
  22. 23. 1 : 1 -> Native Threads: one kernel thread for each user thread
  23. 25. Pros and Cons
      Pros:
      - Threads can execute on different CPUs (truly parallel)
      - Threads do not block each other
      Cons:
      - Setup overhead
      - Low limit on the number of threads
      - Linux kernel bug with lots of threads
  27. 29. RUBY 1.9
  28. 30. I lied.
  29. 31. Global Interpreter Lock: “A Global Interpreter Lock (GIL) is a mutual exclusion lock held by a programming language interpreter thread to avoid sharing code that is not thread-safe with other threads. There is always one GIL for one interpreter process. Usage of a Global Interpreter Lock in a language effectively limits concurrency of a single interpreter process with multiple threads – there is no or very little increase in speed when running the process on a multiprocessor machine.” Source: Wikipedia
  30. 32. A person (male or female) who intentionally or unintentionally stops the progress of two others getting their game on.
  31. 34. “Concurrency is a myth in Ruby” – Ilya Grigorik
  32. 35. Unless you are using JRuby.
  33. 37. A note on Fibers
      - Ruby 1.9 introduces fibers.
      - Fibers are green threads, but scheduling must be done by the programmer and not the VM.
      - Faster and cheaper than native threads.
      - Implemented for Ruby 1.8 by Aman Gupta.
      - Learn more:
        - http://tinyurl.com/rubyfibers
        - http://all-thing.net/fibers
        - http://all-thing.net/fibers-via-continuations
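To make "scheduling must be done by the programmer" concrete, here is a minimal sketch using the core `Fiber` class (Ruby 1.9 and later). The method name `fiber_demo` and the log messages are made up for illustration; only `Fiber.new`, `Fiber.yield`, `resume`, and `alive?` come from Ruby itself.

```ruby
# A fiber only runs when the caller resumes it, and it hands control back
# explicitly with Fiber.yield. Nothing preempts it in between.
def fiber_demo
  log = []

  worker = Fiber.new do
    log << "worker: step 1"
    Fiber.yield :paused        # give control back to the caller
    log << "worker: step 2"
    :done                      # the block's value is returned by the last resume
  end

  log << "main: before"
  first = worker.resume        # runs the fiber up to Fiber.yield
  log << "main: between"
  second = worker.resume       # continues after Fiber.yield until the block ends

  { log: log, values: [first, second], alive: worker.alive? }
end

puts fiber_demo[:log]
```

The interleaving in the log is decided entirely by where the code calls `resume` and `Fiber.yield`, which is the difference from threads that the slide points at.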
  40. 44. M : N -> Hybrid Model: M kernel threads for N user threads, “best of both worlds”
  41. 46. Pros and Cons
      Pros:
      - Take advantage of multiple CPUs
      - Not all threads are blocked by blocking system calls
      - Cheap creation, execution, and cleanup
      Cons:
      - Need the schedulers in userland and in the kernel to work with each other
      - Green threads doing blocking I/O operations will block all other green threads sharing the same kernel thread
      - Difficult to write, maintain, and debug code
  46. 52. The Other Problem with Threads: “Writing multi-threaded code is really, really hard. And it is hard because of Shared Memory.” – Jim Weirich, http://rubyconf2008.confreaks.com/what-all-rubyist-should-know-about-threads.html
  47. 53. Multi-Threaded Code is Hard + Concurrency is a myth = FAIL!
  48. 54. Stop thinking in threads. Design your application to use multiple processes.
  49. 55. PART 2 Multiple Ruby Processes
  50. 57. Pros and Cons
      Pros:
      - No longer sharing memory
      - Take advantage of multiple CPUs (performance)
      - Not all threads are blocked by blocking system calls
      - Scalability
      - Fault-tolerance
      Cons:
      - Process creation, execution, and cleanup are expensive
      - Uses a lot of memory (loading the Ruby VM for every process)
      - Need a way for processes to communicate!
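As a rough illustration of the trade-off above, the sketch below forks one child process per job and sends each result back over a pipe. The helper names (`spawn_job`, `collect`, `parallel_sums`) are hypothetical; `fork`, `IO.pipe`, `Marshal`, and `Process.wait` are standard Ruby. It assumes a platform where `fork` is available (for example Linux or macOS with MRI).

```ruby
# Start a child process that runs the given block and writes the
# marshalled result into a pipe. Returns the child's pid and the read end.
def spawn_job
  reader, writer = IO.pipe
  pid = fork do
    reader.close                   # the child only writes
    Marshal.dump(yield, writer)    # serialize the block's result into the pipe
    writer.close
  end
  writer.close                     # the parent only reads
  [pid, reader]
end

# Read one result back and reap the child.
def collect(pid, reader)
  result = Marshal.load(reader)
  reader.close
  Process.wait(pid)
  result
end

# All children are started first, so they can run on separate CPUs,
# and the results are gathered afterwards.
def parallel_sums(ranges)
  jobs = ranges.map { |range| spawn_job { range.sum } }
  jobs.map { |pid, reader| collect(pid, reader) }
end

p parallel_sums([1..1_000, 1_001..2_000])   # => [500500, 1500500]
```

Nothing is shared between the processes: every value crosses the process boundary as serialized bytes, which is exactly the communication problem the rest of the talk addresses.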
  57. 64. Latency, Starting/Stopping, Fault-Tolerance, Monitoring
  58. 65. but we will focus on...
  59. 66. How do the processes communicate?
  60. 67. Options
      - DRb
      - Sockets
      - Queues
        - RabbitMQ
        - ActiveMQ
      - Key-Value Databases
        - Redis
        - Tokyo Cabinet
        - Memcached
      - Relational Databases
      - XMPP
      - TupleSpaces
  68. 75. Examples
  69. 76. Rails + Mongrel/Thin
      - Cluster of application servers (Mongrel, Thin...)
      - Communication between processes is done via the database.
  71. 78. Nanite
      - A self-assembling fabric of Ruby daemons
      - http://github.com/ezmobius/nanite
      - Uses RabbitMQ/AMQP for IPC
  74. 81. Revactor
      - Uses the actor model
      - Actors are kinda like threads, with messaging baked in.
      - Each actor has a mailbox.
      - It's like coding Erlang in Ruby.
      - Messages are passed between actors using TCP sockets.
      - Good documentation
      - http://revactor.org/
      - “Erlang provides a sledgehammer for the problems of concurrent programming. But, sometimes you don't need a sledgehammer... just a flyswatter will do.” – Tony Arcieri
      - Discontinued in favor of Reia
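The "each actor has a mailbox" idea can be sketched with nothing but a thread and a `Queue`. This is not Revactor's API; `CounterActor` and its message names are invented for the example, and the messages travel through an in-process queue rather than TCP sockets.

```ruby
# A toy actor: one thread that owns its state, plus a Queue used as a mailbox.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @thread = Thread.new { run }
  end

  # Asynchronous send: drop a message in the mailbox and return immediately.
  def send_message(*message)
    @mailbox << message
    self
  end

  # Ask for the current total and block until the actor replies.
  def total
    reply = Queue.new
    send_message(:get, reply)
    reply.pop
  end

  def stop
    send_message(:stop)
    @thread.join
  end

  private

  # Messages are handled one at a time, so the counter needs no lock.
  def run
    count = 0
    loop do
      kind, arg = @mailbox.pop
      case kind
      when :add  then count += arg
      when :get  then arg << count
      when :stop then break
      end
    end
  end
end

counter = CounterActor.new
counter.send_message(:add, 2).send_message(:add, 40)
puts counter.total   # the mailbox is FIFO, so both :add messages are handled first
counter.stop
```

The state lives inside one thread and is only reachable through messages, which is what removes the shared-memory problem described earlier in the talk.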
  83. 90. Journeta
      - Journeta is a dirt simple library for peer discovery and message passing between Ruby applications on a LAN
      - Uses UDP sockets for IPC
      - “Uses the fucked up Ruby socket API” (from their RDoc)
      - Demo(?)
  86. 93. PART 3 TupleSpaces: Interprocess Communication with TupleSpaces
  87. 94. A tuple space provides a repository of tuples that can be accessed concurrently.
  88. 95. The Blackboard Metaphor. Tuples on the board:
      [:add, 1, 2]  [:result, 79]  [:add, 60, 5]  [:token]
      [:search, "linda"]  [:where_is, :waldo]  [:subtract, 10, 2]  [:save, 7864]
  89. 96. The Blackboard Metaphor (same board). Example template: [:add, nil, nil]
  90. 97. The Blackboard Metaphor (same board). Example template: [nil]
  91. 98. The Blackboard Metaphor (same board). Example template: [:where_is, :waldo]
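The lookups on the blackboard slides can be tried with a local, in-process `Rinda::TupleSpace`. This is a sketch, assuming the trailing tuple on each slide is a lookup template in which `nil` acts as a wildcard (which is how Rinda templates behave). The helper name `blackboard` is made up. On recent Ruby versions `rinda` and `drb` ship as bundled gems, so they may need to be listed in a Gemfile or installed separately.

```ruby
require 'rinda/tuplespace'

# Build a local tuple space holding the tuples shown on the slide.
def blackboard
  space = Rinda::TupleSpace.new
  [
    [:add, 1, 2], [:result, 79], [:add, 60, 5], [:token],
    [:search, "linda"], [:where_is, :waldo], [:subtract, 10, 2], [:save, 7864]
  ].each { |tuple| space.write(tuple) }
  space
end

space = blackboard
p space.read_all([:add, nil, nil])      # both :add tuples; nil matches any value
p space.read_all([nil])                 # only the one-element tuple
p space.read_all([:where_is, :waldo])   # exact match
p space.read_all([nil, Integer])        # elements are also compared with ===, so a class works
```

A template only matches tuples of the same length, which is why `[nil]` finds `[:token]` and nothing else.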
  92. 99. About Tuple Spaces
      - The first implementation was Linda.
      - Linda was developed by David Gelernter and Nicholas Carriero at Yale University.
      - Implementations exist for most languages.
      - The Ruby implementation is Rinda.
      - Rinda is a built-in library, so there is no need to install anything.
  97. 104. 5 Basic Operations
      - read: reads a tuple but does not remove it. Blocking by default, but takes an additional timeout argument.
      - read_all: returns all tuples matching the given tuple. Does not remove the found tuples.
      - write: adds a tuple. Takes an optional timeout parameter.
      - take: atomic read + delete. Blocking by default, but takes an additional timeout argument.
      - notify: registers for notifications of events:
        - write
        - take
        - delete
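The five operations can be exercised against a local `Rinda::TupleSpace` as follows. This is a sketch rather than the talk's demo code: the method name `five_operations_demo` and the `[:result, n]` tuples are made up, while `write`, `read`, `read_all`, `take`, `notify`, and `Rinda::RequestExpiredError` are part of Rinda.

```ruby
require 'rinda/tuplespace'

def five_operations_demo
  space = Rinda::TupleSpace.new
  seen = {}

  # notify: subscribe to 'write' events for tuples matching the template.
  observer = space.notify('write', [:result, nil])

  # write: add tuples to the space.
  space.write([:result, 3])
  space.write([:result, 65])

  # read: return one matching tuple and leave it in the space.
  seen[:read] = space.read([:result, 3])

  # read_all: return every matching tuple without removing any.
  seen[:read_all] = space.read_all([:result, nil])

  # take: atomically remove and return one matching tuple.
  seen[:take] = space.take([:result, 3])
  seen[:left] = space.read_all([:result, nil])

  # With a timeout of 0 the otherwise blocking take gives up immediately.
  begin
    space.take([:result, 3], 0)
  rescue Rinda::RequestExpiredError
    seen[:expired] = true
  end

  # The observer queued one [event, tuple] pair per matching write.
  seen[:events] = [observer.pop, observer.pop]
  observer.cancel
  seen
end

p five_operations_demo
```

Without a timeout, `read` and `take` block until a matching tuple is written, which is what lets agents use the space as a work queue.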
  129. 136. Key Features
      - Spaces are shared
        - The space handles the details of concurrent access
      - Spaces are persistent
        - If an agent process dies, the data is still in the space
        - However, if the space process dies, the data is lost (?)
      - Spaces are associative
        - Associative lookups rather than memory location or identifier
      - Spaces are transactionally secure
        - Atomic operations
      - Spaces allow us to exchange executable content
  131. 138. A Rinda tuple can be an array or a hash. (But let's stick with the array, I like that better!)
  133. 140. Start a Tuple Space on port 1234
  134. 141. Clients/Agents
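The code from these two slides is not part of the transcript, so the following is a reconstruction under assumptions, not the speaker's demo. It exposes a `Rinda::TupleSpace` over DRb and lets two agents talk through it. The helper names (`start_tuple_space`, `connect`, `serve_one_addition`) are hypothetical; the slides use port 1234, while the runnable part below passes port 0 so the OS picks a free port. For brevity both agents live in one process here, so DRb resolves the calls locally.

```ruby
require 'drb/drb'
require 'rinda/tuplespace'

# Server side: expose a TupleSpace over DRb. The slides use port 1234.
def start_tuple_space(uri = 'druby://localhost:1234')
  DRb.start_service(uri, Rinda::TupleSpace.new)
  DRb.uri   # the URI that agents should connect to
end

# Agent side: connect by URI; the returned proxy forwards calls to the space.
def connect(uri)
  DRbObject.new_with_uri(uri)
end

# A worker agent: takes one [:add, a, b] request and writes back the sum.
def serve_one_addition(space)
  _, a, b = space.take([:add, Numeric, Numeric])
  space.write([:result, a + b])
end

# Port 0 asks the OS for a free port, so the demo cannot collide with
# anything already listening on 1234.
uri = start_tuple_space('druby://localhost:0')
worker = Thread.new { serve_one_addition(connect(uri)) }

client = connect(uri)
client.write([:add, 1, 2])
p client.take([:result, nil])   # blocks until the worker has answered
worker.join
```

With real separate processes, the server would run `start_tuple_space` followed by `DRb.thread.join`, and each agent would typically call `DRb.start_service` itself before `connect`, since DRb may need to call back into the agent.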
  135. 143. DEMO Rinda
  136. 144. RingServer
  137. 150. This is also a TupleSpace
  138. 151. SPOF
  139. 152. Rinda is not persistent... If it crashes while you have tuples in the space, you lose them all.
  140. 153. Only Ruby
  141. 154. Introducing Blackboard
      - TupleSpace implementation on top of Redis
        - -> Persistent
      - Redis is a really fast key-value database.
        - Like memcached, but the data is not volatile.
      - Same API -> Plug & Play
      - For now, only supports take, read, and write
      - http://github.com/dambalah/blackboard
  144. 157. Server: just start the redis-server: $ redis-server
  145. 158. Client/Agents
  146. 159. DEMO Blackboard
  147. 160. Blackboard Benchmarks
  148. 161. Blackboard: Future
      - Move from Redis to a custom Erlang-based blackboard implementation.
      - I would like that Erlang implementation to be easy to use from other programming languages too.
      - So it's really two projects:
        - Blackboard in Erlang
        - Ruby library to talk to Blackboard in Erlang
  152. 165. Thank you! Luc Castera Founder / messagepub.com
  153. 166. Questions? Feedback? [email_address] www.speakerrate.com Luc Castera Founder / messagepub.com
  154. 167. Resources / References, Part 1: Threading Models
      - http://timetobleed.com/threading-models-so-many-different-ways-to-get-stuff-done/
      - http://envycasts.com/products/scaling-ruby
      - http://www.infoq.com/news/2007/05/ruby-threading-futures
      - http://thebogles.com/blog/2006/11/ruby-threading/
      - http://spec.ruby-doc.org/wiki/Ruby_Threading
      - http://www.bitwiese.de/2007/09/on-processes-and-threads.html
      - http://www.igvita.com/2008/11/13/concurrency-is-a-myth-in-ruby/
      - http://bartoszmilewski.wordpress.com/2008/08/24/threads-dont-scale-processes-do/
      - http://en.wikipedia.org/wiki/Global_Interpreter_Lock
      - http://www.gotw.ca/publications/concurrency-ddj.htm
      - http://tinyurl.com/rubyfibers
  165. 178. Resources / References, Part 2: Multiple Processes
      - http://github.com/ezmobius/nanite
      - http://erlang.org/
      - http://www.rabbitmq.com/
      - http://code.google.com/p/redis/
      - http://revactor.org/
      - http://journeta.rubyforge.org/
      - http://home.mindspring.com/~eric_rollins/ParallelRuby.html
  172. 185. Resources / References, Part 3: TupleSpaces
      - http://c2.com/cgi/wiki?TupleSpace
      - http://en.wikipedia.org/wiki/Tuplespace
      - http://www.julianbrowne.com/article/viewer/space-based-architecture-example
      - http://www.rubyagent.com/
      - http://segment7.net/projects/ruby/drb/
      - http://segment7.net/projects/ruby/drb/rinda/ringserver.html
      - JavaSpaces Principles, Patterns, and Practice – Freeman, Hupfer, et al.
      - http://www.ruby-doc.org/stdlib/libdoc/rinda/rdoc/index.html
  180. 193. Things I wish I had time to spend on
      - MPI and Ruby-MPI: http://github.com/abedra/mpi-ruby/tree/master
      - Ruby forkoff: http://tinyurl.com/forkoff
