Ruby's Concurrency
Management:
Now and Future
Koichi Sasada
ko1@cookpad.com
Today’s talk
•Supported features
•Process
•Thread
•Fiber
•Features under consideration
•Guild
•Auto-Fiber
Today’s talk
Process Guild Thread Auto-Fiber Fiber
Available Yes No Yes No Yes
Switch on time Yes Yes Yes No No
Switch on I/O Auto Auto Auto Auto No
Next target Auto Auto Auto Auto Specify
Parallel run Yes Yes No (on MRI) No No
Shared data N/A (mostly) N/A Everything Everything Everything
Comm. Hard Maybe Easy Easy Easy Easy
Programming
difficulty
Hard Easy Difficult Easy Easy
Debugging
difficulty
Easy? Maybe Easy Hard Maybe hard Easy
Koichi Sasada
http://atdot.net/~ko1/
•A programmer
•2006-2012 Faculty
•2012-2017 Heroku, Inc.
•2017- Cookpad Inc.
•Job: MRI development
•MRI: Matz Ruby Interpreter
•Core parts
• VM, Threads, GC, etc
Ruby interpreter
Ruby (Rails) app
RubyGems/Bundler
So many gems
such as Rails, pry, thin, … and so on.
Normal Ruby developer’s view
i gigantum umeris insidentes
Standing on the shoulders of giants
Interpret on RubyVM
Normal MRI developer’s view
6
Ruby
script
Parse
Compile
(codegen)
Ruby
Bytecode
Object management
Garbage collectorThreading
Embedded
classes and methods
Bundled
Libraries
Evaluator
Gem
Libraries
AST
Abstract Syntax Tree
Ruby interpreter
Ruby (Rails) app
RubyGems/Bundler
So many gems
such as Rails, pry, thin, … and so on.
Koichi’s job
Koichi
Ruby3: Ruby3 has 3 goals
•Static type checking
•Just-in-Time (JIT) compilation
•Parallel execution w/ highly
abstract concurrent model
Ruby3: Ruby3 has 3 goals
•For productivity
•Static checking
•For performance
•Just-in-Time (JIT) compilation
•Parallel execution w/ highly abstract
concurrent model
Concurrency
“In computer science, concurrency is the decomposability
property of a program, algorithm, or problem into order-
independent or partially-ordered components or units.[1] This
means that even if the concurrent units of the program,
algorithm, or problem are executed out-of-order or in partial
order, the final outcome will remain the same. This allows for
parallel execution of the concurrent units, which can
significantly improve overall speed of the execution in multi-
processor and multi-core systems.”
https://en.wikipedia.org/wiki/Concurrency_(computer_science)
Concurrent and Parallel execution
Concurrent
execution
Logical concept
Parallel
(and concurrent)
execution
Physical concept
Ruby (MRI) support only concurrency
Task A
Task B
Task A
Task B
Concurrency
Why needed?
•Easy to write some kind of programs
•Download files simultaneously
•Process web requests simultaneously
•Agent simulation (assume computer games)
•Each agent has its own logics
•Run agents simultaneously
Concurrency
Example: Downloader
Download A
Download B
Download C
wait for receving
wait for receving
wait for receving
post process
post process
post process
We can write this kind of program w/o concurrency support,
but not simple, not easy
Downloader example
With concurrency support (Thread)
ts = URLs.map do |url|
Thread.new(url) do |u|
data = download(u)
File.write(u.to_fname, data)
end
end.each{|th| th.join} # wait
Downloader example
Without concurrency support
# Serial execution
URLs.each do |u|
data = download(u)
File.write(u.to_fname, data)
end
Concurrency
Not concurrent case
Download A
Download B
Download C
wait for receving
wait for receving
post process
post process
… and download C after that
Downloader example
Without concurrency support
# Use select. Not so SIMPLE!!
fds = URLs.map do |u|
download_fd(u)
end
while ready_fds = select(fds)
ready_fds.each{|fd|
File.write(…, read(fd))}
end
Existing concurrency supports on
Ruby (MRI)
Supported features by Ruby/MRI
•Process
•Thread
•Fiber
Process
Traditional concurrency support
Process
•Use OS multi-process
• Use fork on Unix-like systems
•Shared-nothing
• Communicate with IPC (pipe, etc) such as IO.pipe
•Programming
• Difficult to manage processes and IPC
•Debugging
• Easy because a few synchronization bugs
Inter-process communication
Ruby process Ruby process
pipe
file
obj
string
Serialize
obj
string
Deserialize
Inter-process communication
Example code
# Traditional multi-process example
r, w = IO.pipe
fork do
result_str = work_something.to_s
w.write result_str
w.close
end
puts r.read # wait for a result
Sophisticated libraries/frameworks for
process programming
•dRuby: Distributed object for Ruby
•parallel gem: Parallel programming with
processes
•unicorn: Process based web application
server (master – worker model w/ processes)
Thread
Ruby’s native concurrency support
Thread
•Use Ruby managed threads
• Thread.new do … end
•Shared-everything
• Communication is very easy
•Programming
• Easy to make, easy to communicate (at a glance)
• Difficult to make completely safe program
•Debugging
• Hard because of synchronization
MRI: Thread with Giant Lock (GIL)
•Only a thread keeping the GIL can run (can’t
run in parallel)
Thread 1
Thread 2
Thread 1
CPU 1
CPU 2
IDLE
OS Thread 1
OS Thread 2
Lock
Lock
Inter-thread communication
Easy to share objects
Ruby process
objThread Thread
We can share objects directly
between threads very easily
Inter-thread communication
v = Object.new
$g = Object.new
Thread.new do
p [v, $g]
end
p [v, $g]
Thread programming
Synchronization is required
•Reading/writing data simultaneously w/o
synchronization will cause serious problem
•Race condition
•Data race
Mutate shared objects
Lucky case
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘ko1’
@gender = ‘male’
(1) Thread A tries to change the Speaker
to “Yuki” (female)
Note: Yuki is my wife.
Mutate shared objects
Lucky case
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘Yuki’
@gender = ‘male’
(2) A changes the name to “Yuki”
write
Mutate shared objects
Lucky case
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘Yuki’
@gender = ‘female’
(3) A changes the gender to “female”
write
Mutate shared objects
Lucky case
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘Yuki’
@gender = ‘female’
(4) Complete.
A and B can read correct speaker.
read read
Mutate shared objects
Problematic case
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘ko1’
@gender = ‘male’
(1) Thread A tries to change the Speaker
to “Yuki” (female)
Mutate shared objects
Problematic case
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘Yuki’
@gender = ‘male’
(2) A changes the name to “Yuki”
write
Mutate shared objects
Problematic case
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘Yuki’
@gender = ‘male’
(3) Before the changing,
B read incorrect data!!
Note: Yuki should be female.
read
Inter-thread communication
Synchronization
•Require synchronization for shared data
•Mutex, Queue and so on
•Usually Queue is enough
•To prohibit simultaneous mutation
•We need to keep consistency for each objects
Mutate shared objects
With lock
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘ko1’
@gender = ‘male’
(1) Thread A tries to change the Speaker
to “Yuki” (female). Lock an obj.
Locked by A
Mutate shared objects
With lock
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘Yuki’
@gender = ‘male’
(2) A changes the name to “Yuki”
Locked by A
Mutate shared objects
With lock
Ruby process
Thread
A
Thread
B
SpeakerObject
@name = ‘Yuki’
@gender = ‘male’
(3) Before complete the changing,
B tries to read, but prohibited by a lock
Locked by A
read
Thread programming
Easy to share data: Good and Bad
•Good: Easy to communicate with threads
•Bad: Too easy. Difficult to manage all of
them
•Mutation for shared data requires correct
synchronization
•Sometimes objects are shared implicitly
•Otherwise, it causes serious problems
“Why Threads Are A Bad Idea
(for most purposes)”
• Quoted from John Ousterhout, 1995 (about 20 years ago ☺)
Compare Process with Thread
Process Thread
Available Yes Yes
Switch on time Yes Yes
Switch on I/O Auto Auto
Next target Auto Auto
Parallel run Yes No (on MRI)
Shared data N/A Everything
Communication Hard (high-overhead) Easy (lightweight)
Programming difficulty Hard Difficult
Debugging difficulty Easy? Hard
Fiber
User-defined context switching
Fiber example
Infinite generator
fib = Fiber.new do
Fiber.yield a = b = 1
loop{ a, b = b, a+b
Fiber.yield a }
end
10.times{ p fib.resume }
Fiber example
Infinite generator
fib = Fiber.new do
Fiber.yield a = b = 1
loop{ a, b = b, a+b
Fiber.yield a }
end
10.times{ p fib.resume }
1. Fiber creation
2. Resume Fiber
3. Return to the
parent fiber
4. Resume fiber
(again)
5. Return to the
parent fiber
6. Resume fiber
(again2)
Fiber example
Infinite generator
fib = Fiber.new do
Fiber.yield a = b = 1
loop{ a, b = b, a+b
Fiber.yield a }
end
10.times{ p fib.resume }
1. Fiber creation
2. Resume Fiber
3. Return to the
parent fiber
4. Resume fiber
(again)
5. Return to the
parent fiber
6. Resume fiber
(again2)
Not a Proc?
a = 0; b = 1
fib = Proc.new{
a, b = b, a+b
a
}
p fib.call #=> 1
p fib.call #=> 1
p fib.call #=> 2
p fib.call #=> 3
p fib.call #=> 5
Proc can’t restart from
the middle of block
Proc (method) v.s. Fiber
Proc (method) Fiber
Start OK: call OK: Fiber#resume
Parameters OK: block (method) parameters OK: block parameters
Return OK: exit Proc/method OK: exit Proc/method
Suspend NG: N/A OK: Fiber.yield
Continue NG: N/A OK: Fiber#resume
Fiber example
Inner iterator to external iterator
f1 = Fiber.new do
2.times{|i| Fiber.yield i}
end
p f1.resume #=> 0
p f1.resume #=> 1
p f1.resume #=> 2 # return value of #times
p f1.resume #=> dead fiber called
(FiberError)
Fiber example
Inner iterator to external iterator
etc_passwd_ex_iter = Fiber.new do
open('/etc/passwd').each_line{|line|
Fiber.yield line
}
end
p etc_passwd_ex_iter.resume #=> 1st line
p etc_passwd_ex_iter.resume #=> 2nd line
…
Fiber example
Inner iterator to external iterator
# make Enumerator
iter = open('/etc/passwd').each_line
# Enumerator#next use Fiber implicitly
p iter.next #=> 1st line
p iter.next #=> 2nd line
…
Fiber example
Agent simulation
characters << Fiber.new{
loop{cat.move_up; Fiber.yield}}
characters << Fiber.new{
loop{dog.move_left; Fiber.yield}}
…
loop{cs.each{|e| e.resume}; redraw}
Fiber example
Agent simulation
characters << Fiber.new{
# you can specify complex rule for chars
loop{
cow.move_up; Fiber.yield
cow.move_right; Fiber.yield
cow.move_down; Fiber.yield
cow.move_left; Fiber.yield
}
}
Fiber example
Non-blocking IO scheduler
Wait multiple IO ops with
traditional “select” or
modern “poll”, “epoll” interface
Fiber
Programming difficulty
•Good
•Synchronization for shared data is not required
because of no unexpected switching
•Lightweight than Processes and Threads
•Bad
•We need to switch explicitly. For example,
“Blocking operations” (I/O blocking, etc) stop all
fibers
Comparison of existing supports
Process Thread Fiber
Available Yes Yes Yes
Switch on time Yes Yes No
Switch on I/O Auto Auto No
Next target Auto Auto Specify
Parallel run Yes No (on MRI) No
Shared data N/A Everything Everything
Comm. Hard Easy Easy
Programming difficulty Hard Difficult Easy
Debugging difficulty Easy? Hard Easy
Fiber: Brief history
•2007/05/23 cont.c (for callcc)
•2007/05/25 Fiber impl. [ruby-dev:30827]
•2007/05/28 Fiber introduced into cont.c
•2007/08/25 Fix Fiber spec
•2017 is 10th anniversary I introduced ☺
Proposed concurrency
features
Guild
Auto-Fiber
Guild
Proposed concurrency support for Ruby 3
Key idea
Problem of multi-thread programming:
Easy to share mutable objects
Idea:
Prohibit sharing mutable objects
Our goal for Ruby 3
•We need to keep compatibility with Ruby 2.
•We can make parallel program.
•We shouldn’t consider locks any more.
•We can share objects with copy, but copy
operation should be fast.
•We should share immutable objects if we can.
•We can provide special objects to share
mutable objects like Clojure if we really need
speed.
Guild: New concurrency abstraction
•Guild has at least one thread (and a thread
has at least one fiber)
Guild
Thread
Fiber
Guild
Thread
Fiber
Guild
Thread
Fiber
Fiber
Thread
Fiber
Threads in different guilds can run in
Parallel
•Threads in different guilds can run in parallel
•Threads in a same guild can not run in parallel
because of GVL (or GGL: Giant Guild Lock)
G1:T1
G1:T2
G2:T3
Acquire GGL
Acquire GGL
Important rule:
Mutable Objects have a membership
•All of mutable objects should belong to only
one Guild exclusively
•Guild can not touch objects belong to other
guildsGuild 1 Guild 2
obj
obj
obj
obj
obj
Can’t access
(read/write)
NG!!
Object membership
Only one guild can access mutable object
→ We don’t need to consider locks
(if Guild has only one thread)
Inter-guild communication
•“Guild::Channel” to communicate each guilds
•Two communication methods
1. Copy
2. Move (transfer_membership)
Copy using Channel
Guild1 Guild2
o2
o3
o1
channel
o2
o3
o1
COPY
channel.transfer(o1) o1 = channel.receive
O2:Data
O3:Data
O2:Data
O3:Data
Move using Channel
Guild1 Guild2
o2
o3
o1
channel
MOVE
channel.transfer_membership(o1) o1 = channel.receive
O2:Data
O3:Data
Move using Channel
Guild1 Guild2
channel
o2
o3
o1
MOVE
channel.transfer_membership(o1) o1 = channel.receive
O2:Data
O3:Data
-
-
-
From Guild1 perspective,
transferred objects are invalidated
Sharing immutable objects
We can share reference to immutable objects
Guild1 Guild2
o2
o3
o1
channel
channel.transfer(o1) o1 = channel.receive
O2:Data O3:Data
Ref to
o1
If o1 is immutable, any Guild can read o1
read
Ref to
o1
read
Use-case 1: master – worker type
def fib(n) ... end
g_fib = Guild.new(script: %q{
ch = Guild.default_channel
while n, return_ch = ch.receive
return_ch.transfer fib(n)
end
})
ch = Guild::Channel.new
g_fib.transfer([3, ch])
p ch.receive
Main
Guild
Fibonacci
Guild
ch
return_ch
n, return_ch
Answer of fib(n)
NOTE: Making other Fibonacci guilds,
you can compute fib(n) in parallel
Use-case 2: pipeline
result_ch = Guild::Channel.new
g_pipe3 = Guild.new(script: %q{
while obj = Guild.default_channel.receive
obj = modify_obj3(obj)
Guild.argv[0].transfer_membership(obj)
end
}, argv: [result_ch])
g_pipe2 = Guild.new(script: %q{
while obj = Guild.default_channel.receive
obj = modify_obj2(obj)
Guild.argv[0].transfer_membership(obj)
end
}, argv: [g_pipe3])
g_pipe1 = Guild.new(script: %q{
while obj = Guild.default_channel.receive
obj = modify_obj1(obj)
Guild.argv[0].transfer_membership(obj)
end
}, argv: [g_pipe2])
obj = SomeClass.new
g_pipe1.transfer_membership(obj)
obj = result_ch.receive
Main
Guild
Pipe 1
Guild
obj
Obj’
Move
and modify
Pipe 2
Guild
Obj’’
Move
and modify
Pipe 3
Guild
Obj’’’
Obj’’’
Move
and modify
Move
Compare with Process, Guild, Thread
Process Guild Thread
Available Yes No Yes
Switch on time Yes Yes Yes
Switch on I/O Auto Auto Auto
Next target Auto Auto Auto
Parallel run Yes Yes No (on MRI)
Shared data N/A (mostly) N/A Everything
Comm. Hard Maybe Easy Easy
Programming difficulty Hard Easy Difficult
Debugging difficulty Easy? Maybe Easy Hard
Auto Fiber
Another proposed concurrency support for Ruby 3
Problem of Fiber
Requires explicit switching
•“Fiber” enables writing scheduler by
programmer
→ Programmers need to write own scheduler
•We need to manage blocking operations like
I/O blocking
Auto Fiber proposal
https://bugs.ruby-lang.org/issues/13618
Auto Fiber proposal
Automatic schedule on I/O blocking
•Support Fiber scheduler
natively
• Don’t need to return
scheduler
•Switch Fibers on all blocking
I/O (and other ops)
• No need to change existing
programs
Advantage and Disadvantage
•Advantage
•Don’t need to modify existing programs
•Lightweight as a Fiber
•Safer than Threads (no preemption)
•Disadvantage
•Introduce “non-deterministic” dangers same as
Thread programs
• Non atomic operations can intercept accidentally.
Change the name…?
Compare w/ Thread and (auto-)Fiber
Thread Auto-Fiber Fiber
Available Yes No Yes
Switch on time Yes No No
Switch on I/O Auto Auto No
Next target Auto Auto Specify
Parallel run No (on MRI) No No
Shared data Everything Everything Everything
Comm. Easy Easy Easy
Programming difficulty Difficult Easy Easy
Debugging difficulty Hard Maybe hard Easy
Today’s talk
•Supported features
•Process
•Thread
•Fiber
•Features under consideration
•Guild
•Auto-Fiber
Today’s talk
Process Guild Thread Auto-Fiber Fiber
Available Yes No Yes No Yes
Switch on time Yes Yes Yes No No
Switch on I/O Auto Auto Auto Auto No
Next target Auto Auto Auto Auto Specify
Parallel run Yes Yes No (on MRI) No No
Shared data N/A (mostly) N/A Everything Everything Everything
Comm. Hard Maybe Easy Easy Easy Easy
Programming
difficulty
Hard Easy Difficult Easy Easy
Debugging
difficulty
Easy? Maybe Easy Hard Maybe hard Easy
References
•Fiber: RubyKaigi 2017 http://rubykaigi.org/2017/presentations/ko1.html
•Guild: RubyConf 2016 https://www.youtube.com/watch?v=mjzmUUQWqco
•Auto-fiber: Feature #13618 https://bugs.ruby-lang.org/issues/13618
Thank you for your attention
Koichi Sasada
<ko1@cookpad.com>

Ruby's Concurrency Management: Now and Future

  • 1.
    Ruby's Concurrency Management: Now andFuture Koichi Sasada ko1@cookpad.com
  • 2.
  • 3.
    Today’s talk Process GuildThread Auto-Fiber Fiber Available Yes No Yes No Yes Switch on time Yes Yes Yes No No Switch on I/O Auto Auto Auto Auto No Next target Auto Auto Auto Auto Specify Parallel run Yes Yes No (on MRI) No No Shared data N/A (mostly) N/A Everything Everything Everything Comm. Hard Maybe Easy Easy Easy Easy Programming difficulty Hard Easy Difficult Easy Easy Debugging difficulty Easy? Maybe Easy Hard Maybe hard Easy
  • 4.
    Koichi Sasada http://atdot.net/~ko1/ •A programmer •2006-2012Faculty •2012-2017 Heroku, Inc. •2017- Cookpad Inc. •Job: MRI development •MRI: Matz Ruby Interpreter •Core parts • VM, Threads, GC, etc
  • 5.
    Ruby interpreter Ruby (Rails)app RubyGems/Bundler So many gems such as Rails, pry, thin, … and so on. Normal Ruby developer’s view i gigantum umeris insidentes Standing on the shoulders of giants
  • 6.
    Interpret on RubyVM NormalMRI developer’s view 6 Ruby script Parse Compile (codegen) Ruby Bytecode Object management Garbage collectorThreading Embedded classes and methods Bundled Libraries Evaluator Gem Libraries AST Abstract Syntax Tree
  • 7.
    Ruby interpreter Ruby (Rails)app RubyGems/Bundler So many gems such as Rails, pry, thin, … and so on. Koichi’s job Koichi
  • 8.
    Ruby3: Ruby3 has3 goals •Static type checking •Just-in-Time (JIT) compilation •Parallel execution w/ highly abstract concurrent model
  • 9.
    Ruby3: Ruby3 has3 goals •For productivity •Static checking •For performance •Just-in-Time (JIT) compilation •Parallel execution w/ highly abstract concurrent model
  • 10.
    Concurrency “In computer science,concurrency is the decomposability property of a program, algorithm, or problem into order- independent or partially-ordered components or units.[1] This means that even if the concurrent units of the program, algorithm, or problem are executed out-of-order or in partial order, the final outcome will remain the same. This allows for parallel execution of the concurrent units, which can significantly improve overall speed of the execution in multi- processor and multi-core systems.” https://en.wikipedia.org/wiki/Concurrency_(computer_science)
  • 11.
    Concurrent and Parallelexecution Concurrent execution Logical concept Parallel (and concurrent) execution Physical concept Ruby (MRI) support only concurrency Task A Task B Task A Task B
  • 12.
    Concurrency Why needed? •Easy towrite some kind of programs •Download files simultaneously •Process web requests simultaneously •Agent simulation (assume computer games) •Each agent has its own logics •Run agents simultaneously
  • 13.
    Concurrency Example: Downloader Download A DownloadB Download C wait for receving wait for receving wait for receving post process post process post process We can write this kind of program w/o concurrency support, but not simple, not easy
  • 14.
    Downloader example With concurrencysupport (Thread) ts = URLs.map do |url| Thread.new(url) do |u| data = download(u) File.write(u.to_fname, data) end end.each{|th| th.join} # wait
  • 15.
    Downloader example Without concurrencysupport # Serial execution URLs.each do |u| data = download(u) File.write(u.to_fname, data) end
  • 16.
    Concurrency Not concurrent case DownloadA Download B Download C wait for receving wait for receving post process post process … and download C after that
  • 17.
    Downloader example Without concurrencysupport # Use select. Not so SIMPLE!! fds = URLs.map do |u| download_fd(u) end while ready_fds = select(fds) ready_fds.each{|fd| File.write(…, read(fd))} end
  • 18.
  • 19.
    Supported features byRuby/MRI •Process •Thread •Fiber
  • 20.
  • 21.
    Process •Use OS multi-process •Use fork on Unix-like systems •Shared-nothing • Communicate with IPC (pipe, etc) such as IO.pipe •Programming • Difficult to manage processes and IPC •Debugging • Easy because a few synchronization bugs
  • 22.
    Inter-process communication Ruby processRuby process pipe file obj string Serialize obj string Deserialize
  • 23.
    Inter-process communication Example code #Traditional multi-process example r, w = IO.pipe fork do result_str = work_something.to_s w.write result_str w.close end puts r.read # wait for a result
  • 24.
    Sophisticated libraries/frameworks for processprogramming •dRuby: Distributed object for Ruby •parallel gem: Parallel programming with processes •unicorn: Process based web application server (master – worker model w/ processes)
  • 25.
  • 26.
    Thread •Use Ruby managedthreads • Thread.new do … end •Shared-everything • Communication is very easy •Programming • Easy to make, easy to communicate (at a glance) • Difficult to make completely safe program •Debugging • Hard because of synchronization
  • 27.
    MRI: Thread withGiant Lock (GIL) •Only a thread keeping the GIL can run (can’t run in parallel) Thread 1 Thread 2 Thread 1 CPU 1 CPU 2 IDLE OS Thread 1 OS Thread 2 Lock Lock
  • 28.
    Inter-thread communication Easy toshare objects Ruby process objThread Thread We can share objects directly between threads very easily
  • 29.
    Inter-thread communication v =Object.new $g = Object.new Thread.new do p [v, $g] end p [v, $g]
  • 30.
    Thread programming Synchronization isrequired •Reading/writing data simultaneously w/o synchronization will cause serious problem •Race condition •Data race
  • 31.
    Mutate shared objects Luckycase Ruby process Thread A Thread B SpeakerObject @name = ‘ko1’ @gender = ‘male’ (1) Thread A tries to change the Speaker to “Yuki” (female) Note: Yuki is my wife.
  • 32.
    Mutate shared objects Luckycase Ruby process Thread A Thread B SpeakerObject @name = ‘Yuki’ @gender = ‘male’ (2) A changes the name to “Yuki” write
  • 33.
    Mutate shared objects Luckycase Ruby process Thread A Thread B SpeakerObject @name = ‘Yuki’ @gender = ‘female’ (3) A changes the gender to “female” write
  • 34.
    Mutate shared objects Luckycase Ruby process Thread A Thread B SpeakerObject @name = ‘Yuki’ @gender = ‘female’ (4) Complete. A and B can read correct speaker. read read
  • 35.
    Mutate shared objects Problematiccase Ruby process Thread A Thread B SpeakerObject @name = ‘ko1’ @gender = ‘male’ (1) Thread A tries to change the Speaker to “Yuki” (female)
  • 36.
    Mutate shared objects Problematiccase Ruby process Thread A Thread B SpeakerObject @name = ‘Yuki’ @gender = ‘male’ (2) A changes the name to “Yuki” write
  • 37.
    Mutate shared objects Problematiccase Ruby process Thread A Thread B SpeakerObject @name = ‘Yuki’ @gender = ‘male’ (3) Before the changing, B read incorrect data!! Note: Yuki should be female. read
  • 38.
    Inter-thread communication Synchronization •Require synchronizationfor shared data •Mutex, Queue and so on •Usually Queue is enough •To prohibit simultaneous mutation •We need to keep consistency for each objects
  • 39.
    Mutate shared objects Withlock Ruby process Thread A Thread B SpeakerObject @name = ‘ko1’ @gender = ‘male’ (1) Thread A tries to change the Speaker to “Yuki” (female). Lock an obj. Locked by A
  • 40.
    Mutate shared objects Withlock Ruby process Thread A Thread B SpeakerObject @name = ‘Yuki’ @gender = ‘male’ (2) A changes the name to “Yuki” Locked by A
  • 41.
    Mutate shared objects Withlock Ruby process Thread A Thread B SpeakerObject @name = ‘Yuki’ @gender = ‘male’ (3) Before complete the changing, B tries to read, but prohibited by a lock Locked by A read
  • 42.
    Thread programming Easy toshare data: Good and Bad •Good: Easy to communicate with threads •Bad: Too easy. Difficult to manage all of them •Mutation for shared data requires correct synchronization •Sometimes objects are shared implicitly •Otherwise, it causes serious problems
  • 43.
    “Why Threads AreA Bad Idea (for most purposes)” • Quoted from John Ousterhout, 1995 (about 20 years ago ☺)
  • 44.
    Compare Process withThread Process Thread Available Yes Yes Switch on time Yes Yes Switch on I/O Auto Auto Next target Auto Auto Parallel run Yes No (on MRI) Shared data N/A Everything Communication Hard (high-overhead) Easy (lightweight) Programming difficulty Hard Difficult Debugging difficulty Easy? Hard
  • 45.
  • 46.
    Fiber example Infinite generator fib= Fiber.new do Fiber.yield a = b = 1 loop{ a, b = b, a+b Fiber.yield a } end 10.times{ p fib.resume }
  • 47.
    Fiber example Infinite generator fib= Fiber.new do Fiber.yield a = b = 1 loop{ a, b = b, a+b Fiber.yield a } end 10.times{ p fib.resume } 1. Fiber creation 2. Resume Fiber 3. Return to the parent fiber 4. Resume fiber (again) 5. Return to the parent fiber 6. Resume fiber (again2)
  • 48.
    Fiber example Infinite generator fib= Fiber.new do Fiber.yield a = b = 1 loop{ a, b = b, a+b Fiber.yield a } end 10.times{ p fib.resume } 1. Fiber creation 2. Resume Fiber 3. Return to the parent fiber 4. Resume fiber (again) 5. Return to the parent fiber 6. Resume fiber (again2)
  • 49.
    Not a Proc? a= 0; b = 1 fib = Proc.new{ a, b = b, a+b a } p fib.call #=> 1 p fib.call #=> 1 p fib.call #=> 2 p fib.call #=> 3 p fib.call #=> 5 Proc can’t restart from the middle of block
  • 50.
    Proc (method) v.s.Fiber Proc (method) Fiber Start OK: call OK: Fiber#resume Parameters OK: block (method) parameters OK: block parameters Return OK: exit Proc/method OK: exit Proc/method Suspend NG: N/A OK: Fiber.yield Continue NG: N/A OK: Fiber#resume
  • 51.
    Fiber example Inner iteratorto external iterator f1 = Fiber.new do 2.times{|i| Fiber.yield i} end p f1.resume #=> 0 p f1.resume #=> 1 p f1.resume #=> 2 # return value of #times p f1.resume #=> dead fiber called (FiberError)
  • 52.
    Fiber example Inner iteratorto external iterator etc_passwd_ex_iter = Fiber.new do open('/etc/passwd').each_line{|line| Fiber.yield line } end p etc_passwd_ex_iter.resume #=> 1st line p etc_passwd_ex_iter.resume #=> 2nd line …
  • 53.
    Fiber example Inner iteratorto external iterator # make Enumerator iter = open('/etc/passwd').each_line # Enumerator#next use Fiber implicitly p iter.next #=> 1st line p iter.next #=> 2nd line …
  • 54.
    Fiber example Agent simulation characters<< Fiber.new{ loop{cat.move_up; Fiber.yield}} characters << Fiber.new{ loop{dog.move_left; Fiber.yield}} … loop{cs.each{|e| e.resume}; redraw}
  • 55.
    Fiber example Agent simulation characters<< Fiber.new{ # you can specify complex rule for chars loop{ cow.move_up; Fiber.yield cow.move_right; Fiber.yield cow.move_down; Fiber.yield cow.move_left; Fiber.yield } }
  • 56.
    Fiber example Non-blocking IOscheduler Wait multiple IO ops with traditional “select” or modern “poll”, “epoll” interface
  • 57.
    Fiber Programming difficulty •Good •Synchronization forshared data is not required because of no unexpected switching •Lightweight than Processes and Threads •Bad •We need to switch explicitly. For example, “Blocking operations” (I/O blocking, etc) stop all fibers
  • 58.
    Comparison of existingsupports Process Thread Fiber Available Yes Yes Yes Switch on time Yes Yes No Switch on I/O Auto Auto No Next target Auto Auto Specify Parallel run Yes No (on MRI) No Shared data N/A Everything Everything Comm. Hard Easy Easy Programming difficulty Hard Difficult Easy Debugging difficulty Easy? Hard Easy
  • 59.
    Fiber: Brief history •2007/05/23cont.c (for callcc) •2007/05/25 Fiber impl. [ruby-dev:30827] •2007/05/28 Fiber introduced into cont.c •2007/08/25 Fix Fiber spec •2017 is 10th anniversary I introduced ☺
  • 60.
  • 61.
  • 62.
    Key idea Problem ofmulti-thread programming: Easy to share mutable objects Idea: Prohibit sharing mutable objects
  • 63.
    Our goal forRuby 3 •We need to keep compatibility with Ruby 2. •We can make parallel program. •We shouldn’t consider locks any more. •We can share objects with copy, but copy operation should be fast. •We should share immutable objects if we can. •We can provide special objects to share mutable objects like Clojure if we really need speed.
  • 64.
    Guild: New concurrencyabstraction •Guild has at least one thread (and a thread has at least one fiber) Guild Thread Fiber Guild Thread Fiber Guild Thread Fiber Fiber Thread Fiber
  • 65.
    Threads in differentguilds can run in Parallel •Threads in different guilds can run in parallel •Threads in a same guild can not run in parallel because of GVL (or GGL: Giant Guild Lock) G1:T1 G1:T2 G2:T3 Acquire GGL Acquire GGL
  • 66.
    Important rule: Mutable Objectshave a membership •All of mutable objects should belong to only one Guild exclusively •Guild can not touch objects belong to other guildsGuild 1 Guild 2 obj obj obj obj obj Can’t access (read/write) NG!!
  • 67.
    Object membership Only oneguild can access mutable object → We don’t need to consider locks (if Guild has only one thread)
  • 68.
    Inter-guild communication •“Guild::Channel” tocommunicate each guilds •Two communication methods 1. Copy 2. Move (transfer_membership)
  • 69.
    Copy using Channel Guild1Guild2 o2 o3 o1 channel o2 o3 o1 COPY channel.transfer(o1) o1 = channel.receive O2:Data O3:Data O2:Data O3:Data
  • 70.
    Move using Channel Guild1Guild2 o2 o3 o1 channel MOVE channel.transfer_membership(o1) o1 = channel.receive O2:Data O3:Data
  • 71.
    Move using Channel Guild1Guild2 channel o2 o3 o1 MOVE channel.transfer_membership(o1) o1 = channel.receive O2:Data O3:Data - - - From Guild1 perspective, transferred objects are invalidated
  • 72.
    Sharing immutable objects Wecan share reference to immutable objects Guild1 Guild2 o2 o3 o1 channel channel.transfer(o1) o1 = channel.receive O2:Data O3:Data Ref to o1 If o1 is immutable, any Guild can read o1 read Ref to o1 read
  • 73.
    Use-case 1: master– worker type def fib(n) ... end g_fib = Guild.new(script: %q{ ch = Guild.default_channel while n, return_ch = ch.receive return_ch.transfer fib(n) end }) ch = Guild::Channel.new g_fib.transfer([3, ch]) p ch.receive Main Guild Fibonacci Guild ch return_ch n, return_ch Answer of fib(n) NOTE: Making other Fibonacci guilds, you can compute fib(n) in parallel
  • 74.
    Use-case 2: pipeline result_ch= Guild::Channel.new g_pipe3 = Guild.new(script: %q{ while obj = Guild.default_channel.receive obj = modify_obj3(obj) Guild.argv[0].transfer_membership(obj) end }, argv: [result_ch]) g_pipe2 = Guild.new(script: %q{ while obj = Guild.default_channel.receive obj = modify_obj2(obj) Guild.argv[0].transfer_membership(obj) end }, argv: [g_pipe3]) g_pipe1 = Guild.new(script: %q{ while obj = Guild.default_channel.receive obj = modify_obj1(obj) Guild.argv[0].transfer_membership(obj) end }, argv: [g_pipe2]) obj = SomeClass.new g_pipe1.transfer_membership(obj) obj = result_ch.receive Main Guild Pipe 1 Guild obj Obj’ Move and modify Pipe 2 Guild Obj’’ Move and modify Pipe 3 Guild Obj’’’ Obj’’’ Move and modify Move
  • 75.
    Compare with Process,Guild, Thread Process Guild Thread Available Yes No Yes Switch on time Yes Yes Yes Switch on I/O Auto Auto Auto Next target Auto Auto Auto Parallel run Yes Yes No (on MRI) Shared data N/A (mostly) N/A Everything Comm. Hard Maybe Easy Easy Programming difficulty Hard Easy Difficult Debugging difficulty Easy? Maybe Easy Hard
  • 76.
    Auto Fiber Another proposedconcurrency support for Ruby 3
  • 77.
    Problem of Fiber Requiresexplicit switching •“Fiber” enables writing scheduler by programmer → Programmers need to write own scheduler •We need to manage blocking operations like I/O blocking
  • 78.
  • 79.
    Auto Fiber proposal Automaticschedule on I/O blocking •Support Fiber scheduler natively • Don’t need to return scheduler •Switch Fibers on all blocking I/O (and other ops) • No need to change existing programs
  • 80.
    Advantage and Disadvantage •Advantage •Don’tneed to modify existing programs •Lightweight as a Fiber •Safer than Threads (no preemption) •Disadvantage •Introduce “non-deterministic” dangers same as Thread programs • Non atomic operations can intercept accidentally. Change the name…?
  • 81.
    Compare w/ Threadand (auto-)Fiber Thread Auto-Fiber Fiber Available Yes No Yes Switch on time Yes No No Switch on I/O Auto Auto No Next target Auto Auto Specify Parallel run No (on MRI) No No Shared data Everything Everything Everything Comm. Easy Easy Easy Programming difficulty Difficult Easy Easy Debugging difficulty Hard Maybe hard Easy
  • 82.
  • 83.
    Today’s talk Process GuildThread Auto-Fiber Fiber Available Yes No Yes No Yes Switch on time Yes Yes Yes No No Switch on I/O Auto Auto Auto Auto No Next target Auto Auto Auto Auto Specify Parallel run Yes Yes No (on MRI) No No Shared data N/A (mostly) N/A Everything Everything Everything Comm. Hard Maybe Easy Easy Easy Easy Programming difficulty Hard Easy Difficult Easy Easy Debugging difficulty Easy? Maybe Easy Hard Maybe hard Easy
  • 84.
    References •Fiber: RubyKaigi 2017http://rubykaigi.org/2017/presentations/ko1.html •Guild: RubyConf 2016 https://www.youtube.com/watch?v=mjzmUUQWqco •Auto-fiber: Feature #13618 https://bugs.ruby-lang.org/issues/13618
  • 85.
    Thank you foryour attention Koichi Sasada <ko1@cookpad.com>