What is a thread?

A thread is just a set of execution state. This state usually includes:
- instruction & stack pointers
- scheduling priority
- other CPU state
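Ruby exposes some of this per-thread state directly. A quick sketch using the core Thread API:

```ruby
t = Thread.new { sleep }
sleep 0.01 until t.status == "sleep"  # wait for the thread to park

t.priority   # scheduling priority (an Integer, 0 by default)
t.backtrace  # where it is executing: instruction-pointer-ish state
t.status     # current scheduling state, here "sleep"

t.kill
t.join
```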
Green Threads (1:N)

"Green" because they are lightweight. The kernel doesn't know they exist;
the implementation is entirely in userland.

Pros:
- Create lots of them cheaply (10,000s)
- Switch between them cheaply (Ruby doesn't)
- Schedule them however you want

Cons:
- A blocking call in one blocks ALL of them
- Kernel doesn't know about them
- Can't take advantage of SMP
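The "blocking call blocks ALL" con can be sketched with Ruby's fibers, which behave like green threads multiplexed on one native thread. This is a toy round-robin pass, not Ruby 1.8's actual scheduler:

```ruby
order = []
fibers = [
  # the first "green thread" makes a blocking call (sleep)
  Fiber.new { order << :a_start; sleep 0.05; order << :a_done },
  Fiber.new { order << :b_done }
]
fibers.each(&:resume)  # naive scheduler pass
order  # => [:a_start, :a_done, :b_done]
```

`:b_done` comes last: the whole process (and every fiber in it) had to wait out the blocking sleep.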
Native Threads (1:1)

The kernel knows they exist; only some userland code is involved
(libpthread).

Pros:
- Take advantage of SMP
- Shared memory
- Blocking in one thread doesn't block everyone
- Don't have to write a scheduler

Cons:
- Overhead limits how many you can create
- Bugs (glibc: more threads = slower creation time)
- No fine-grained scheduling control
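The contrast with green threads is easy to see with Ruby's Thread class on a 1:1 implementation: a blocking call only delays its own thread. A minimal sketch:

```ruby
order = Queue.new
slow = Thread.new { sleep 0.2; order << :slow }  # blocking call
fast = Thread.new { order << :fast }             # runs immediately
[slow, fast].each(&:join)
order.pop  # => :fast
```

`:fast` arrives first even though `slow` was created first; the blocking sleep did not stall anyone else.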
Hybrid Threads (M:N)

Almost the best of both worlds.

Pros:
- Take advantage of SMP
- Cheap setup and teardown
- Blocking in one thread doesn't block everyone

Cons:
- Need 2 schedulers (userland + kernel)
- Need to make them actually work together
- All green threads backed by the same native thread can be blocked
Preemptive Multitasking

An outside event (a timer) signals the end of a CPU slice.
- Handle important events quickly
- Can help ensure everyone gets to execute

But..
- Need to build a smart scheduler
- Can yield non-deterministic execution order
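MRI's preemptive timer means a busy thread cannot starve the others; no explicit yield is needed. A small sketch (assumes the interpreter's timer is running, as it always is once a thread exists):

```ruby
done = false
t = Thread.new { done = true }  # gets a CPU slice via preemption

# busy-wait without ever yielding explicitly; the timer
# still switches away so t can run
100_000.times { } until done
t.join
done  # => true
```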
Cooperative Multitasking

Threads voluntarily release the CPU.
- Give up the CPU when it is "optimal"
- Can guarantee deterministic execution order
- Very simple "scheduler"

But..
- Badly written code can hang all threads
So, what is a fiber?

In Ruby, fibers are green threads with cooperative multitasking.
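Cooperative scheduling with Fiber in a nutshell: each fiber runs until it explicitly hands control back, so execution order is fully deterministic:

```ruby
f = Fiber.new do
  Fiber.yield :first  # voluntarily give up the CPU
  :second
end

f.resume  # => :first  (runs until Fiber.yield)
f.resume  # => :second (runs to completion)
```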
So what's the deal with ruby threads?

Tools for finding out:
- strace
- google-perftools
- ltrace
- gdb
strace -cp <pid>
-c
Count time, calls, and errors for each system call and report a
summary on program exit.
-p pid
Attach to the process with the process ID pid and begin tracing.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
50.39 0.000064 0 1197 592 read
34.65 0.000044 0 609 writev
14.96 0.000019 0 1226 epoll_ctl
0.00 0.000000 0 4 close
0.00 0.000000 0 1 select
0.00 0.000000 0 4 socket
0.00 0.000000 0 4 4 connect
0.00 0.000000 0 1057 epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00 0.000127 4134 596 total
strace -ttTp <pid> -o <file>
-t
Prefix each line of the trace with the time of day.
-tt
If given twice, the time printed will include the microseconds.
-T
Show the time spent in system calls. This records the time
difference between the beginning and the end of each system call.
-o filename
Write the trace output to the file filename rather than to stderr.
01:09:11.266949 epoll_wait(9, {{EPOLLIN, {u32=68841296, u64=68841296}}}, 4096, 50) = 1 <0.033109>
01:09:11.300102 accept(10, {sa_family=AF_INET, sin_port=38313, sin_addr="127.0.0.1"}, [1226]) = 22 <0.000014>
01:09:11.300190 fcntl(22, F_GETFL) = 0x2 (flags O_RDWR) <0.000007>
01:09:11.300237 fcntl(22, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008>
01:09:11.300277 setsockopt(22, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000008>
01:09:11.300489 accept(10, 0x7fff5d9c07d0, [1226]) = -1 EAGAIN <0.000014>
01:09:11.300547 epoll_ctl(9, EPOLL_CTL_ADD, 22, {EPOLLIN, {u32=108750368, u64=108750368}}) = 0 <0.000009>
01:09:11.300593 epoll_wait(9, {{EPOLLIN, {u32=108750368, u64=108750368}}}, 4096, 50) = 1 <0.000007>
01:09:11.300633 read(22, "GET / HTTP/1.1\r"..., 16384) = 772 <0.000012>
01:09:11.301727 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000007>
01:09:11.302095 poll([{fd=5, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) <0.000008>
01:09:11.302144 write(5, "1000000-0003SELECT * FROM `table`"..., 56) = 56 <0.000023>
01:09:11.302221 read(5, "25101,20x234m"..., 16384) = 284 <1.300897>
ruby uses setitimer and signals to schedule green threads*

The first time a new thread is created, ruby calls:
- setitimer(ITIMER_VIRTUAL, 10ms): tell the kernel to send the process
  a SIGVTALRM every 10ms
- posix_signal(SIGVTALRM, catch_timer): bind the catch_timer function
  to the signal

* when compiled without --enable-pthread
static void
catch_timer(sig)
    int sig;
{
    if (!rb_thread_critical) {
        rb_thread_pending = 1;
    }
    /* cause EINTR */
}

void
rb_thread_start_timer()
{
    struct itimerval tval;

    if (!thread_init) return;
    tval.it_interval.tv_sec = 0;
    tval.it_interval.tv_usec = 10000;
    tval.it_value = tval.it_interval;
    setitimer(ITIMER_VIRTUAL, &tval, NULL);
}

static VALUE
rb_thread_start_0(fn, arg, th)
    VALUE (*fn)();
    void *arg;
    rb_thread_t th;
{
    if (!thread_init) {
        thread_init = 1;
        posix_signal(SIGVTALRM, catch_timer);
        rb_thread_start_timer();
    }
    /* ... */
}
But I'm not using threads!

  begin
    # require 'net/http'
    # Net::HTTP.new(host, port).request(...)         # uses timeout
    # require 'net/smtp'
    # Net::SMTP.new('localhost').send_message(...)
    require 'timeout'
    Timeout.timeout(0.1) do                          # uses threads
      1+2*3/4 while true
    end
  rescue Timeout::Error
  end

  500_000_000.times{ |i| i * 2 }

Thread.new, Timeout.timeout and Net::* all use threads and start the
thread timer. Once the timer is started, it will interrupt your
process every 10ms, even if all threads are killed.
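The snippet's timeout behavior, runnable on its own: Timeout's helper thread raises into the busy loop, so even pure computation gets interrupted:

```ruby
require 'timeout'

result = begin
           Timeout.timeout(0.1) { 1 + 2 * 3 / 4 while true }
           :finished
         rescue Timeout::Error
           :timed_out
         end
result  # => :timed_out
```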
PATCH: stop the thread timer

@@ -10518,6 +10520,15 @@ rb_thread_remove(th)
     rb_thread_die(th);
     th->prev->next = th->next;
     th->next->prev = th->prev;
+
+    /* if this is the last ruby thread, stop timer signals */
+    if (th->next == th->prev && th->next == main_thread) {
+        rb_thread_stop_timer();
+        thread_init = 0;
+    }
 }
ucontext?

ruby can use either setjmp/longjmp or setcontext/getcontext in its
threading implementation and for exception handling.

setjmp/longjmp save and restore the current cpu registers.

setcontext/getcontext are an advanced version of setjmp/longjmp, but
they also call sigprocmask to save/restore the signal mask before each
jump.
PATCH: --disable-ucontext

--- a/configure.in
+++ b/configure.in
@@ -368,6 +368,10 @@
+AC_ARG_ENABLE(ucontext,
+       [  --disable-ucontext      do not use getcontext()/setcontext().],
+       [disable_ucontext=yes], [disable_ucontext=no])
+
 AC_ARG_ENABLE(pthread,
        [  --enable-pthread        use pthread library.],
        [enable_pthread=$enableval], [enable_pthread=no])
@@ -1038,7 +1042,8 @@
-if test x"$ac_cv_header_ucontext_h" = xyes; then
+if test x"$ac_cv_header_ucontext_h" = xyes && test x"$disable_ucontext" = xno; then
   if test x"$rb_with_pthread" = xyes; then
     AC_CHECK_FUNCS(getcontext setcontext)
   fi

./configure --enable-pthread --disable-ucontext

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
   nan    0.000000           0        13           read
   nan    0.000000           0        21        10 open
   nan    0.000000           0        11           close
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                    45        10 total
EventMachine + threads = slow??

EventMachine allocates large buffers on the stack to read/write from
the network. Using threads with EM made ruby extremely slow..

A C extension that reproduces the large stack frame:

  #include "ruby.h"

  VALUE bigstack(VALUE self)
  {
    char buffer[ 50 * 1024 ]; /* large stack frame */
    if (rb_block_given_p()) rb_yield(Qnil);
    return Qnil;
  }

  void Init_cext()
  {
    VALUE CExt = rb_define_module("CExt");
    rb_define_singleton_method(CExt, "bigstack", bigstack, 0);
  }

And the Ruby driver that switches threads inside it:

  require 'cext'

  (1..2).map{
    Thread.new{
      CExt.bigstack{
        100_000.times{
          1*2+3/4
          Thread.pass
        }
      }
    }
  }.map{ |t| t.join }

...profile?
Stack vs. Heap

Stack:
- Storage for local vars
- Only valid while the stack frame is on the stack!
- Keeps track of function calls

Heap:
- Storage for vars that persist across function calls
- Managed by malloc

As each function is called, a frame is pushed onto the stack; heap
allocations live until freed:

  void func3() {
    char buffer[8];            /* 8 byte stack frame */
  }

  void func2() {
    char *string = malloc(10); /* 4 byte frame + 10 bytes on the heap */
    func3();
  }

  void func1() {
    void *data;                /* 4 byte stack frame */
    func2();
  }

When func3 returns, its 8-byte frame is popped; func2's frame (and the
10 heap bytes) remain until func2 returns and the memory is freed.
171. gdb walkthrough
% gdb ./test-it
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
172. gdb walkthrough
% gdb ./test-it start gdb
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
173. gdb walkthrough
% gdb ./test-it
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
174. gdb walkthrough
% gdb ./test-it
(gdb) b average set breakpoint on function named average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
175. gdb walkthrough
% gdb ./test-it
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
176. gdb walkthrough
% gdb ./test-it
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run run program
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
177. gdb walkthrough
% gdb ./test-it
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
178. gdb walkthrough
% gdb ./test-it
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3 hit breakpoint!
3 int sum = x + y;
(gdb) bt
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg
$1 = 5.5
(gdb) p sum
$2 = 11
179. gdb walkthrough
% gdb ./test-it
(gdb) b average
Breakpoint 1 at 0x1f8e: file test-it.c, line 3.
(gdb) run
Starting program: /Users/joe/test-it
Reading symbols for shared libraries ++. done
Breakpoint 1, average (x=5, y=6) at test-it.c:3
3 int sum = x + y;
(gdb) bt show backtrace (the function stack)
#0 average (x=5, y=6) at test-it.c:3
#1 0x00001fec in main () at test-it.c:12
(gdb) s single step
4 double avg = sum / 2.0;
(gdb) s
5 return avg;
(gdb) p avg print variables
$1 = 5.5
(gdb) p sum
$2 = 11
188. What’s on the ruby stack?
(gdb) where
#0 0x0002a55e in rb_call (klass=1386800, recv=5056455, mid=42, argc=1, argv=0xbfffe5c0,
scope=0, self=1403220) at eval.c:6125
#1 0x000226ef in rb_eval (self=1403220, n=0x1461e4) at eval.c:3493
#2 0x00026d01 in rb_yield_0 (val=5056455, self=1403220, klass=0, flags=0, avalue=0) at
eval.c:5083
#3 0x000270e8 in rb_yield (val=5056455) at eval.c:5168
#4 0x0005c30c in int_dotimes (num=1000000001) at numeric.c:2946
#5 0x00029be3 in call_cfunc (func=0x5c2a0 <int_dotimes>, recv=1000000001, len=0, argc=0,
argv=0x0) at eval.c:5759
#6 0x00028fd4 in rb_call0 (klass=1387580, recv=1000000001, id=5785, oid=5785, argc=0,
argv=0x0, body=0x152b24, flags=0) at eval.c:5911
#7 0x0002a7a7 in rb_call (klass=1387580, recv=1000000001, mid=5785, argc=0, argv=0x0,
scope=0, self=1403220) at eval.c:6158
#8 0x000226ef in rb_eval (self=1403220, n=0x146284) at eval.c:3493
#9 0x000213e3 in rb_eval (self=1403220, n=0x1461a8) at eval.c:3223
#10 0x0001ceea in eval_node (self=1403220, node=0x1461a8) at eval.c:1437
#11 0x0001d60f in ruby_exec_internal () at eval.c:1642
#12 0x0001d660 in ruby_exec () at eval.c:1662
#13 0x0001d68e in ruby_run () at eval.c:1672
#14 0x000023dc in main (argc=2, argv=0xbffff7c4, envp=0xbffff7d0) at main.c:48
rb_eval recursively executes ruby code in 1.8
199. How big is the stack?
#8 rb_eval at eval.c:3493
(gdb) p $ebp - $esp base - stack ptr = frame size
$1 = 968 each rb_eval stack frame is almost 1k!
#0 rb_thread_save_context at eval.c:10597
(gdb) p (void*)rb_gc_stack_start - $esp
$1 = 10572 10.5k stack will be memcpy()’d
50 method calls * 1k ≈ 50k stack
206. Recap: How do Ruby threads work?
Each thread has its own execution context:
saved cpu registers (setjmp/longjmp)
copy of vm globals (current frame, scope, block)
stack (memcpy)
Editor's Notes
this talk is going to get technical, so feel free to interrupt if you have any questions
differs by OS and platform, but usually includes..
each one has pros and cons, different use cases where they make sense.
i'll show pictures for each one.
let's dive into the differences
solaris older than version 9 used hybrid threads too
switch to aman
syscalls are calls to kernel functions
numbered functions
switches from usermode to kernel mode
doesn't show userland functions, but you can look for gaps
look for system calls that took a while
look for gaps that indicate userland activity
lots of other options, trace network related or fd related calls, etc
so what's the deal with ruby threads? let's strace to find out
straced a production ruby.. lots of vtalrms. wtf?
ruby uses setitimer and signals to schedule green threads
setitimer tells the kernel to send a VTALRM signal every 10ms. signal interrupts the process and invokes catch_timer to set rb_thread_pending, which lets the interpreter know it needs to switch threads.
rb_thread_start uses thread_init to keep track of whether it needs to start the timer or not.
rb_thread_start calls rb_thread_start_timer (.. or pthread_create later)
but our code isn't using threads!
turns out Net::HTTP and Net::SMTP use timeout, which uses threads. and the first time a thread is spawned, the timer is started.. and it never stops!
let's fix it.
remember the thread_init variable from before?
thread_remove() removes the thread from the linked list. if only the main_thread is left, we simply stop the timer, and make sure to set thread_init=0 so the timer is started up again next time a new thread is spawned.
switch over to JOE. talk about running debian ruby in production
we noticed ruby on debian is pretty slow
we googled debian ruby issues, and it turns out sigprocmask is related to enable pthread
using a pthread for timing doesn't make it slower.. what does?
let's see what ./configure --enable-pthread actually does. diff'ed the generated config.h.
hmm, getcontext/setcontext??
turns out you don't really need ucontext to use pthreads (maybe on some obscure platforms?)
let's strace it!
.. 3.5 million sigprocmask are gone! ruby is 30% faster!
switch to aman
two threads
each allocates large stack frame (50kb)
does some computation, then calls thread pass to switch to the other thread
really.. memcpy? let's make sure
ok, it's calling memcpy. what is it copying?
it's copying the thread stacks to the heap.
let's take a step back and talk about the difference between stacks and heaps
func3() has an 8-byte stack frame, twice as big as the other two
the bigger the stack frames, the more it has to memcpy and the longer it takes.
starts out with main() like any C program
calls ruby_run right away to start the ruby vm
int_dotimes in numeric.c, this code calls 5000.times{}
rb_yield is yielding to the block
but, the most common stack frame is rb_eval. 1.8's vm represents ruby code using nodes, and nodes are evaluated using rb_eval. also notice that rb_eval is recursive.. rails for instance would show many dozens of nested rb_eval
each rb_eval stack frame is almost 1k!
(mention mbari patches)
switch to joe
rb_thread_start allocates a new heap, sets the stack pointer using assembly
then thread_save/restore just call setjmp and longjmp like normal, which takes care of saving and restoring where the stack pointer was pointing!
normally the kernel extends the stack automatically
mmap is an alternative to malloc that gives you a big region of memory
each thread decrements number and then pauses itself. basically tests 50 million thread context switches across 500 threads with 20 ruby method frames in each thread stack