Ruby Concurrency
...and the mysterious case of the Reactor Pattern




Christopher Spring
or...
WTF the EventMachine
and why should I care?
Resource utilization

• Lots of IO (90/10)
 • Disk
 • Network
 • system()
• Lots of cores

http://www.mikeperham.com/2010/01/27/scalable-ruby-processing-with-eventmachine/
Ruby concurrency basics

• Threads
• Fibers
• Processes
Threading Models

(Diagram: Kernel Threads on one side mapped to User Threads on the other, for each model below.)

• 1:N
 • Green Threads
 • Ruby 1.8
 • Pros/Cons
• 1:1
 • Native Threads
 • Ruby 1.9 / JRuby
 • Pros/Cons
• M:N
 • ?
Threads

• Shared state and memory space
• Relatively light weight
• Preemptive scheduling
Ruby has baggage: GIL

http://www.igvita.com/2008/11/13/concurrency-is-a-myth-in-ruby/
Threads... suck

• Race conditions
• Deadlocks
• Hard to debug
• GIL
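
A minimal sketch of the race-condition bullet, assuming a hypothetical shared counter (the method name count_with_lock is illustrative and not from the talk). Several threads increment one variable, and a Mutex serializes each read-modify-write so that no update is lost.

```ruby
# Hypothetical example: several threads bump one shared counter.
def count_with_lock(thread_count, increments)
  counter = 0
  lock = Mutex.new

  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        # counter += 1 is a read-modify-write; hold the lock around it
        lock.synchronize { counter += 1 }
      end
    end
  end

  threads.each(&:join) # wait for every thread to finish
  counter
end

puts count_with_lock(4, 1_000) # prints 4000
```

The GIL does not by itself make a compound operation like counter += 1 atomic, so the lock is still the safe choice, and it is required on interpreters without a GIL such as JRuby.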
Fibers

• It’s a coroutine, dammit!
 • “... [component to] generalize
    subroutines to allow multiple entry
    points for suspending and resuming
    execution at certain locations.”
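file_iterator = Fiber.new do
  file = File.open('stuff.csv', 'r')
  while line = file.gets
    Fiber.yield line # suspend here and hand the line to the caller
  end
  file.close
end

# Each resume runs the fiber up to its next Fiber.yield and returns that line
3.times{ puts file_iterator.resume }

# => line 1
# => line 2
# => line 3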
Fibers

• Cooperative Scheduling
• Very lightweight
• Maintains state
• Great for: cooperative tasks, iterators,
  infinite lists and pipes
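require 'fiber' # needed for Fiber#alive? on older Rubies

# Pipes one fiber into another: pulls raw lines, yields parsed fields
def interpret_csv( csv_source )
  Fiber.new do
    while csv_source.alive?
      str = csv_source.resume
      # The source's last resume returns the value of file.close (nil), not a line
      break if str.nil?
      Fiber.yield str.split(',').map(&:strip)
    end
  end
end

# Yields one line of the file per resume
def file_iterator(file_name)
  Fiber.new do
    file = File.open(file_name, 'r')
    while line = file.gets
      Fiber.yield line
    end
    file.close
  end
end

interpret_csv( file_iterator('stuff.csv') ).resume

# => [...]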
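
The "infinite lists" use case from the Fibers slide can be sketched in the same style. The helper name fibonacci_fiber is hypothetical; the point is that the loop never ends, yet the fiber only computes a value each time the caller resumes it.

```ruby
# Hypothetical helper: an endless Fibonacci sequence wrapped in a Fiber.
def fibonacci_fiber
  Fiber.new do
    a, b = 0, 1
    loop do
      Fiber.yield a      # hand one value out, then pause until resumed
      a, b = b, a + b
    end
  end
end

fib = fibonacci_fiber
first_ten = Array.new(10) { fib.resume }
p first_ten # prints [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The fiber keeps a and b alive between calls, which is the "maintains state" bullet in practice.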
Reactor Pattern

(Diagram: Clients, IO Stream, Demultiplexer, Event Dispatcher, Event Handlers A to D)
Benefits

• Non blocking IO
• Coarse grain concurrency
• No threads!
• Single Process
Limitations

• Hard to debug
• Dispatch to event handlers is synchronous
• Demultiplexer polling limits
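
A toy sketch of the pattern, assuming IO.select as the demultiplexer and a hypothetical TinyReactor class. It is not how EventMachine is implemented; it only shows handlers being registered per IO and dispatched synchronously from a single loop.

```ruby
# Toy reactor (hypothetical class, not EventMachine's implementation).
class TinyReactor
  def initialize
    @handlers = {} # IO object => handler block
  end

  # Register an event handler for a readable IO.
  def register(io, &handler)
    @handlers[io] = handler
  end

  def unregister(io)
    @handlers.delete(io)
  end

  # Demultiplex with IO.select, then dispatch to handlers one at a time.
  def run
    until @handlers.empty?
      readable, = IO.select(@handlers.keys)
      readable.each do |io|
        handler = @handlers[io]
        handler.call(io) if handler
      end
    end
  end
end

def demo_reactor
  received = []
  reader, writer = IO.pipe
  writer.write("hello\n")
  writer.close

  reactor = TinyReactor.new
  reactor.register(reader) do |io|
    line = io.gets
    if line
      received << line.chomp
    else # EOF: stop watching this stream
      reactor.unregister(io)
      io.close
    end
  end
  reactor.run
  received
end

p demo_reactor # prints ["hello"]
```

Because dispatch is synchronous, a slow handler here would hold up every other registered IO, which is the limitation listed above.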
EventMachine
require 'eventmachine'

class EchoServer < EM::Connection
  def post_init
    puts "New connection"
  end

  def unbind
    puts "Connection closed"
  end

  def receive_data(data) # unbuffered!!
    puts "<< #{data}"
    send_data ">> #{data}"
  end
end

EM.run do
  EM.start_server('127.0.0.1', 9000, EchoServer)
  puts "Started server at 127.0.0.1:9000"
end # Runs till EM.stop called
#   $ telnet localhost 9000
#   Hello
#   >> Hello
#   Bye
#   >> Bye
Primitives

• run loop
• next_tick; add_timer; add_periodic_timer
• EM.defer
• EM::Queue
• EM::Channel
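
To make next_tick and add_timer less abstract, here is a toy run loop in plain Ruby. ToyLoop and its methods are hypothetical and only mimic the idea; EventMachine's real loop is implemented differently.

```ruby
# Toy run loop (hypothetical; it only mimics the idea behind
# next_tick and add_timer, it is not EventMachine's code).
class ToyLoop
  def initialize
    @ticks = []   # blocks to run on the next loop iteration
    @timers = []  # [due_time, block] pairs
    @running = false
  end

  def next_tick(&blk)
    @ticks << blk
  end

  def add_timer(seconds, &blk)
    @timers << [now + seconds, blk]
  end

  def stop
    @running = false
  end

  def run
    @running = true
    yield self if block_given?
    while @running
      # Blocks queued during this pass wait for the next iteration
      pending, @ticks = @ticks, []
      pending.each(&:call)

      # Fire timers whose due time has passed, earliest first
      due, @timers = @timers.partition { |time, _| time <= now }
      due.sort_by(&:first).each { |_, blk| blk.call }

      sleep 0.001 if @ticks.empty? # avoid a busy spin
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

def demo_loop
  order = []
  ToyLoop.new.run do |lp|
    lp.add_timer(0.05) { order << :timer; lp.stop }
    lp.next_tick { order << :tick }
    order << :setup
  end
  order
end

p demo_loop # prints [:setup, :tick, :timer]
```

The setup block finishes first, the next_tick block runs on the first pass of the loop, and the timer fires once its delay has elapsed.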
Defer
EM.run do
  operation = Proc.new do
    puts 'MapMap!!'
    sleep 3
    puts 'Done collecting data'
    [1, 1, 2, 3, 5, 8, 13]
  end

  callback = Proc.new do |arr|
    puts 'Reducing...'
    sleep 1
    puts 'Reduced'
    puts arr.inject(:+)
    EM.stop
  end

  EM.defer(operation, callback)
end
#   MapMap!!
#   Done collecting data
#   Reducing...
#   Reduced
#   33
Queue
EM.run do
  queue = EM::Queue.new

  EM.defer do
    sleep 2; queue.push 'Mail 1'
    sleep 3; queue.push 'Mail 2'
    sleep 4; queue.push 'Mail 3'
  end

  mail_sender = Proc.new do |mail|
    puts "Sending #{mail}"
    EM.next_tick{ queue.pop(&mail_sender)}
  end

  queue.pop(&mail_sender)

end
Channel
EM.run do
  channel = EM::Channel.new

  EM.defer do
    channel.subscribe do |msg|
      puts "Received #{msg}"
    end
  end

  EM.add_periodic_timer(1) do
    channel << Time.now
  end
end
Deferrable
class Mailer
  include EM::Deferrable

  def initialize
    callback do
      sleep 1
      puts 'Updated statistics!'
    end

    errback{ puts 'retrying mail'}
  end

  def send
    rand >= 0.5 ? succeed : fail
  end
end

EM.run do
  5.times do
    mailer = Mailer.new
    EM.add_timer(rand * 5){ mailer.send}
  end
end
# Sample output (varies per run, since send uses rand):
# Updated statistics!
# Updated statistics!
# retrying mail
Stacked callbacks
class Mailer
  include EM::Deferrable

  def add_mailing(val)
    callback{
      sleep 1
      puts "Sent #{val}"
    }
  end

  def connection_open!
    puts 'Open connection'
    succeed
  end

  def connection_lost!
    puts 'Lost connection'
    set_deferred_status nil
  end
end

EM.run do
  m = Mailer.new
  m.add_mailing(1)
  m.add_mailing(2)
  m.connection_open!

  EM.add_timer(1) do
    m.connection_lost!
    EM.add_timer(2) do
      m.add_mailing(3)
      m.add_mailing(4)
      m.connection_open!
    end
  end
end
#   Open connection
#   Sent 1
#   Sent 2
#   Lost connection
#   Open connection
#   Sent 3
#   Sent 4
Gotchas

• Synchronous code will slow it down
 • Use/Write libraries for EM
• Everything in the event loop must be async!
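
A rough sketch of why blocking work belongs off the loop, assuming a hypothetical run_deferred helper. It illustrates the idea of handing a blocking operation to a worker thread and running the callback back on the calling thread; it is not how EM.defer is implemented.

```ruby
# Hypothetical helper: run a blocking operation on a worker thread and
# hand its result to a callback on the calling ("loop") thread.
def run_deferred(operation, callback)
  worker = Thread.new { operation.call }

  ticks = 0
  while worker.alive? # stand-in for the event loop doing other work
    ticks += 1
    sleep 0.01
  end

  # Thread#value returns the block's result (and re-raises its errors)
  callback.call(worker.value)
  ticks
end

def demo_deferred
  total = nil
  ticks = run_deferred(
    -> { sleep 0.1; [1, 1, 2, 3, 5, 8, 13] }, # blocking operation
    ->(arr) { total = arr.inject(:+) }        # callback on the main thread
  )
  [total, ticks]
end

total, ticks = demo_deferred
puts total # prints 33
puts "loop ticked #{ticks} times while the operation blocked"
```

Had the sleep run directly inside the loop, the tick counter would not have advanced at all during that time.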
Summary

• It’s a blocking world!
• Alternative concurrency implementations
• Start playing with EM
Worth checking out

• EM-Synchrony:
  https://github.com/igrigorik/em-synchrony
• Goliath:
  https://github.com/postrank-labs/goliath
Baie Dankie! (Thank you very much!)
Questions?
Links

• http://www.mikeperham.com
• http://www.igvita.com (!!)
• http://rubyeventmachine.com/

Editor's Notes

  • #10 Pros: Lots of threads; Cheap to create, execute & cleanup
    Cons: Kernel doesn't know about threads; Blocking
    e.g. new green thread for every http request that comes in...
  • #11 Pros: Non blocking; Multi core systems; Shared memory
    Cons: Expensive to create; complex context switching; far fewer threads
  • #12 Pros: Best of both worlds: Multiple CPUs; Not all threads blocked by system calls; Cheap creation, execution & cleanup
    Cons: Green threads blocking on IO can block other Green threads in kernel thread; Hard; Kernel and User scheduler need to work together
  • #13-#15 Resource utilization
    Async IO
  • #17 Ruby has a legacy of being thread unsafe (e.g. Rails only became thread safe 2.2'ish)
    1.9 Ruby code does not execute on more than one thread concurrently!
  • #26 Fast and cheap to set up
  • #31 Inverted flow control (callback hell)
    ... which limit concurrency
  • #32 Toolkit for creating evented apps
  • #33 EM interchangeable with EventMachine
  • #34 next_tick -> run code at the next opportunity (always run in main thread)
    defer -> defer work to run on a thread (green) - 20 by default
    Queue -> data
    Channel -> comms