Modeling concurrency in Ruby and beyond
The world of concurrent computation is a complicated one. We have to think about the hardware, the runtime, and even choose between half a dozen different models and primitives: fork/wait, threads, shared memory, message passing, semaphores, and transactions just to name a few. And that's only the beginning.

What's the state of the art for dealing with concurrency & parallelism in Ruby? We'll take a quick look at the available runtimes, what they offer, and their limitations. Then, we'll dive into the concurrency models and ask: are threads really the best we can do to design, model, and test our software? What are the alternatives, and is Ruby the right language to tackle these problems?

Spoiler: out with the threads. Seriously.

    Presentation Transcript

    • Modeling concurrency in Ruby and beyond
      what is an advanced concurrency model?
      Ilya Grigorik
      @igrigorik
    • “Concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other.”
    • Threads!
      No. Events!
      Neither. You need them both.
      and neither is enough…
    • Hardware Parallelism
      maximizing resource utilization
      2 GHz CPU: ~0.5 ns per cycle
      Cache access: ~7 ns
      RAM access: ~100 ns, roughly 200 wasted cycles!
      The hardware hides that latency with:
      • Prefetching
      • Branch prediction
      • Instruction pipelining
      • Hyperthreading
      • Speculative execution
      http://bit.ly/cSKKVb
    • A quick poll
      which is faster?
      1. if (cond1 && cond2) {
           System.err.println("Am I faster yet?");
         }
      2. if (cond1 || cond2) {
           System.err.println("Am I fast yet?");
         }
      Turns out, we don’t know.
    • Hardware Parallelism
      Software Parallelism
      (Processes, Threads, Events)
      pthreads, lwkt, epoll, kqueue, …
      C / C++, Java, Ruby, ….
      The “concurrency API”
      a bolt-on systems component for any language
    • Bruce: if you could go back in time, what is the one thing you would change?
      Matz: “I would remove the thread and add actors or some other more advanced concurrency features”
      More advanced concurrency features?
    • Hardware Parallelism
      Software Parallelism
      (Processes, Threads, Events)
      pthreads, lwkt, epoll, kqueue, …
      New!
      “Advanced concurrency model”
      C / C++, Java, Ruby, ….
    • Dataflow
      Petri-nets
      Actor Model
      Transactional Memory
      Pi-calculus / CSP

      http://bit.ly/fMLJR8
    • The value of a tool / model is in:
      what it enables you to do
      the constraints it imposes
      • Provide a way to express a behavior
      • Dictate a structure
      • Dictate a style
      • Disallow unwanted behavior
      • Implicitly “make the right choice”
      • Eliminate a class of errors
    • “A Universal Modular Actor Formalism for Artificial Intelligence”
      Carl Hewitt, Peter Bishop and Richard Steiger (1973)
      “Semantics of Communicating Parallel Processes”
      Irene Greif (MIT EECS Doctoral Dissertation, August 1975)

      Erlang (1986), Scala (2003), Kilim, …
      The history: actor model
      Let’s rewind back to 1973…
    • Give every process a name
      Give every process a “mailbox”
      Communicate via messages
      • A --> B
      Enables:
      • Message centric view
      • Communication between: threads, processes, machines
      • Distributed programming
      Constraints:
      • No side-effects
      • No race conditions
      • No mutexes, no semaphores
      Actor Model
      The 50k foot view…
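The mailbox idea maps onto plain Ruby fairly directly. The sketch below is not the API of any actor library (TinyActor, REGISTRY, and deliver are made-up names for illustration): each actor gets its own thread, a thread-safe Queue as its mailbox, and a name that senders use to address it.

    # Hypothetical sketch: one thread per actor, a Queue as its mailbox.
    # TinyActor, REGISTRY and #deliver are illustrative names, not a real API.
    class TinyActor
      REGISTRY = {}                      # name => actor, so senders address actors by name

      def initialize(name, &handler)
        @mailbox = Queue.new             # Ruby's thread-safe Queue plays the mailbox
        REGISTRY[name] = self
        Thread.new do
          loop { handler.call(@mailbox.pop) }   # process one message at a time, in order
        end
      end

      def deliver(message)
        @mailbox << message              # communication happens only via messages
      end
    end

    TinyActor.new(:printer) { |msg| puts "printer got: #{msg}" }
    TinyActor::REGISTRY[:printer].deliver("hello")
    sleep 0.1                            # give the actor thread a moment before the script exits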
    • “Communicating Sequential Processes”
      Hoare, C.A.R. (1978)
      CCS, pi-calculus, …

      Limbo (1995), Go (2009), CSP++, PyCSP…
      The history: CSP model
      Let’s rewind back to 1978…
    • Processes are anonymous
      Give every channel a name
      Processes communicate over named channels
      • Think UNIX pipes…
      Enables:
      • Message centric view
      • Communication between: threads, processes, machines
      • Distributed programming
      Constraints:
      • No side-effects
      • No race conditions
      • No mutexes, no semaphores
      CSP / Pi-calculus
      The 50k foot view…
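The “think UNIX pipes” analogy can be made literal in a few lines of plain Ruby on a Unix-like system (this is just the operating-system pipe, not a CSP library): the two processes never name each other, they only share the pipe.

    # Plain Ruby / UNIX: the processes stay anonymous, the pipe is what they share.
    reader, writer = IO.pipe

    pid = fork do                        # child process: uses only the write end
      reader.close
      writer.puts "hello over the pipe"
      writer.close
    end

    writer.close                         # parent: uses only the read end
    puts reader.gets                     # => "hello over the pipe"
    Process.wait(pid)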
    • Multiple workers can share a channel
      Workers are mobile! Delegate the channel to someone else
      Send a “response” channel to another process
    • gem install agent
      let’s get hands on…
    • Producer / Consumer
      look, no threads!

      # named, typed channel
      c = Agent::Channel.new(name: 'incr', type: Integer)

      # spawn the worker
      go(c) do |c, i = 0|
        loop { c << i += 1 }
      end

      # consume the results
      p c.receive # => 1
      p c.receive # => 2
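There is a thread in there, of course; it is just not part of the program's vocabulary. As a rough sketch of the idea (not the agent gem's implementation; go_sketch and the SizedQueue stand-in are made up for illustration), a go-style helper can be little more than a hidden thread:

    # Illustrative only: a go-style helper as a hidden thread, with a bounded,
    # thread-safe SizedQueue standing in for the channel.
    def go_sketch(*args, &blk)
      Thread.new { blk.call(*args) }     # spawn the worker; the caller never touches the thread
    end

    chan = SizedQueue.new(1)             # producer blocks whenever the buffer is full

    go_sketch(chan) do |c, i = 0|
      loop { c << i += 1 }
    end

    p chan.pop  # => 1
    p chan.pop  # => 2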
    • A “multi-threaded” server!
      where’s the synchronization?

      # the “Request” type
      Request = Struct.new(:args, :resultChan)

      clientRequests = Agent::Channel.new(name: :clientRequests, type: Request, size: 2)

      worker = Proc.new do |reqs|
        loop do
          req = reqs.receive                                      # wait for work
          sleep 1.0                                               # sleep, increment, add a timestamp
          req.resultChan << [Time.now, req.args + 1].join(' : ')
        end
      end

      # start two workers; both listen on the same channel
      go(clientRequests, &worker)
      go(clientRequests, &worker)

      # create two requests, each with a return channel of type String
      req1 = Request.new(1, Agent::Channel.new(:name => "resultChan-1", :type => String))
      req2 = Request.new(2, Agent::Channel.new(:name => "resultChan-2", :type => String))

      # dispatch both requests
      clientRequests << req1
      clientRequests << req2

      # collect the results
      puts req1.resultChan.receive # => 2010-11-28 23:31:08 -0500 : 2
      puts req2.resultChan.receive # => 2010-11-28 23:31:08 -0500 : 3
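To answer the rhetorical question: the synchronization lives inside the channel. For contrast, a plain-threads version of the same fan-out has to do that bookkeeping by hand; the generic sketch below (not code from the talk, with made-up names like jobs and results) shows the shared state and Mutex the channel hides:

    # Plain threads and a lock: the coordination the channel keeps under the hood.
    jobs    = [1, 2]
    results = []
    lock    = Mutex.new                  # protects both shared arrays

    workers = 2.times.map do
      Thread.new do
        while (args = lock.synchronize { jobs.shift })   # pull work under the lock
          sleep 1.0
          value = [Time.now, args + 1].join(' : ')
          lock.synchronize { results << value }          # publish under the lock
        end
      end
    end

    workers.each(&:join)
    puts results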
    • So, Ruby?
      JRuby, RBX, MacRuby, MRI, …
    • JRuby:
      • No GIL
      • JVM threads
      • Existing libraries & frameworks: Akka, Kilim, etc
      • Great platform for experiments
      Rubinius:
      • Hydra branch: no GIL
      • Built in Channel / Actor primitives
      • Great platform to experiment with new language features
      MRI:
      • GIL
      • Research work on MVM
      • ... agent?
      MacRuby:
      • Grand Central Dispatch
      • MacRuby + IOS?
      • GCD + higher level API?
      The many Rubies…
      for your concurrency experiments
    • Io:
      • Small, compact, easy to learn
      • Actor based concurrency
      • http://iolanguage.com/
      Go:
      • Released by Google in ’09
      • CSP + channels
      • http://golang.org/
      Clojure:
      • JVM + Functional programming
      • Transactional memory
      • http://clojure.org/
      Scala:
      • JVM
      • Actor based concurrency
      • http://www.scala-lang.org/
      … and many others …
      Pick up & experiment with other runtimes!
      learn what works, find what resonates…
    • Hardware Parallelism
      Software Parallelism
      (Processes, Threads, Events)
      pthreads, lwkt, epoll, kqueue, …
      CSP / Actor / Dataflow / Transactional Memory
      In Summary:
      • We need threads; we need events; we need locks; we need shared memory; …
      • Are threads, events, etc., the right API for modeling concurrency? Likely not.
      • Threads, events, etc., should belong under the hood.
    • Multi-core, Threads & Message Passing:
      http://www.igvita.com/2010/08/18/multi-core-threads-message-passing/
      Concurrency with Actors, Goroutines & Ruby
      http://www.igvita.com/2010/12/02/concurrency-with-actors-goroutines-ruby/
      gem install agent
      https://github.com/igrigorik/agent/
      https://github.com/igrigorik/agent/tree/master/spec/
      Phew, time for questions?
      hope this convinced you to explore the area further…