2. Who are you?
> Masahiro Nakagawa
> github/twitter: @repeatedly
> Treasure Data, Inc.
> Senior Software Engineer
> Fluentd / td-agent developer
> I love OSS :)
> D language - Phobos committer
> Fluentd - Main maintainer
> MessagePack / RPC - D and Python (only RPC)
> The organizer of Presto Source Code Reading
> etc…
4. Ruby is not only for web apps!
> System programs
• Chef - server configuration management tool
• Serverspec - spec framework for servers
• Apache Deltacloud - IaaS API abstraction library
> Network servers
• Starling - distributed message queue server
• Unicorn - multiprocess HTTP server
> Log servers
• Fluentd - extensible data collection tool
5. Problem: server programming is hard
Server programs should support:
> multi-process or multi-thread
> robust error handling
> log rotation
> signal handling
> dynamic reconfiguration
> metrics collection
> etc...
6. Solution: Use a framework
ServerEngine: a framework for server programming in Ruby
github.com/fluent/serverengine
7. What’s ServerEngine?
With ServerEngine, we can easily write multi-process server programs like Unicorn.
All we need to write are two modules: a Worker module and a Server module.
Everything else, including daemonizing, logging, dynamic reconfiguration, and multi-processing, is done by ServerEngine.
8. Hello world in ServerEngine
require 'serverengine'

# Worker module
module MyWorker
  def run
    until @stop
      logger.info "Hello world!"
      sleep 1
    end
  end

  def stop
    @stop = true
  end
end

# Arguments: Server module (nil here), Worker module, Config
se = ServerEngine.create(nil, MyWorker, {
  log: 'myserver.log',
  pid_path: 'myserver.pid',
})
se.run
9. How does ServerEngine work?
1. Robust process management (supervisor)
2. Multi-process and multi-threading
3. Dynamic configuration reloading
4. Log rotation
5. Signal handling
6. Live restart
7. “sigdump”
10. 1. Robust process management
Supervisor -> Server -> Workers
> Heartbeat via pipe & auto-restart
> Dynamic reconfiguration & live restart support
> Workers run as multi-process or multi-thread
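The heartbeat-via-pipe idea can be sketched in plain Ruby. This is a minimal illustration under assumptions, not ServerEngine's implementation: `spawn_with_heartbeat` and `heartbeat?` are hypothetical helper names, and the sketch assumes a platform where `fork` is available.

```ruby
# Hypothetical helpers illustrating a heartbeat over a pipe (not ServerEngine's code).

# Forks a child process and hands it the write end of a pipe.
# Returns the child's pid and the read end for the monitor.
def spawn_with_heartbeat
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    yield writer          # the child writes heartbeat bytes here
    exit!(0)
  end
  writer.close            # the monitor only reads
  [pid, reader]
end

# True if a heartbeat arrives within `timeout` seconds.
# False on timeout or when the pipe is closed (the child is gone).
def heartbeat?(reader, timeout)
  return false unless IO.select([reader], nil, nil, timeout)
  reader.read_nonblock(1024, exception: false).is_a?(String)
end
```

A monitor built on this would call `heartbeat?` in a loop and, when it returns false, kill and respawn the child. That is the auto-restart half of the slide.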
11. Each role overview
> Supervisor
• Manage Server: heartbeat, attach / detach, restart
• Disabled by default
• No extension point
> Server
• Manage Worker: monitor, restart
• Some execution types: Embedded, Thread, Process
• Extension point: before_run, after_run, after_restart
> Worker
• Execution unit: implement run method
• Extension point: stop, reload, before_fork, after_start
12. 2. Multi-process & multi-threading
require 'serverengine'

module MyWorker
  def run
    until @stop
      logger.info "Awesome work!"
      sleep 1
    end
  end

  def stop
    @stop = true
  end
end

se = ServerEngine.create(nil, MyWorker, {
  daemonize: true,
  log: 'myserver.log',
  pid_path: 'myserver.pid',
  worker_type: 'process',
  workers: 4,
})
se.run

> 3 server types
• embedded
• process
• thread
> thread example

se = ServerEngine.create(nil, MyWorker, {
  daemonize: true,
  log: 'myserver.log',
  pid_path: 'myserver.pid',
  worker_type: 'thread',
  workers: 8,
})
se.run
13. 2. Multi-process & multi-threading
> embedded
• default mode
• use main thread
• Server runs the Worker directly
> process
• use fork for parallel execution
• does not work on Windows
• Server forks Workers
> thread
• use thread for parallel execution
• for JRuby and Rubinius
• Server starts Workers with Thread.new
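The difference between the three execution types can be sketched with the standard library alone. This is an illustration of the idea, not ServerEngine's code: `run_workers` is a hypothetical helper, the 'process' branch assumes `fork` is available, and the 'embedded' branch ignores `count` because it runs a single worker on the calling thread.

```ruby
# Hypothetical helper: runs the given block as `count` workers of the given type.
# Returns the number of workers that finished successfully.
def run_workers(type, count, &work)
  case type
  when 'embedded'
    work.call(0)                      # one worker, on the calling thread
    1
  when 'thread'
    threads = Array.new(count) { |i| Thread.new { work.call(i) } }
    threads.each(&:join)
    count
  when 'process'
    pids = Array.new(count) do |i|
      fork do
        work.call(i)
        exit!(0)
      end
    end
    pids.count { |pid| Process.wait2(pid).last.success? }
  else
    raise ArgumentError, "unknown worker type: #{type}"
  end
end
```

Threads share memory with the server, so a `Queue` is enough to collect results. Forked workers do not, so they need a pipe or socket to report back.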
14. 3. Dynamic reconfiguration
require 'serverengine'
require 'yaml'

module MyWorker
  def initialize
    reload
  end

  def run
    # …
  end

  # Called again whenever the configuration is reloaded
  def reload
    @message = config["message"] || "default"
    @sleep = config["sleep"] || 1
  end
end

se = ServerEngine.create(nil, MyWorker) do
  YAML.load_file("config.yml").merge({
    :daemonize => true,
    :worker_type => 'process',
  })
end
se.run

> Override methods
• reload in worker
• reload_config in server
> Send USR2 signal to reload
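The block form above suggests that the configuration is produced by a loader that can be evaluated again on reload. A minimal sketch of that pattern, assuming nothing about ServerEngine internals (`ReloadableConfig` is a hypothetical class name):

```ruby
require 'yaml'

# Hypothetical illustration: keeps the loader block and re-runs it on reload.
class ReloadableConfig
  def initialize(&loader)
    @loader = loader
    reload
  end

  # Looks up a key in the most recently loaded configuration.
  def [](key)
    @data[key]
  end

  # Re-evaluates the loader block and replaces the current configuration.
  def reload
    @data = @loader.call
  end
end
```

A process using this could call `reload` from its USR2 handling path, so that editing config.yml and sending the signal picks up the new values without a restart.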
15. 4. Log rotation
> Support useful features
• multi-process aware log rotation
• support “trace” level
> Port to Ruby core
• https://github.com/ruby/ruby/pull/428
se = ServerEngine.create(MyServer, MyWorker, {
  log: 'myserver.log',
  log_level: 'debug',
  log_rotate_age: 5,
  log_rotate_size: 1 * 1024 * 1024,
})
se.run
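Ruby's standard Logger accepts similar rotation parameters, which gives a feel for what an age of 5 and a size of 1 MiB mean. This sketch assumes that log_rotate_age and log_rotate_size correspond to Logger's shift_age and shift_size; `build_rotating_logger` is a hypothetical helper, not part of ServerEngine.

```ruby
require 'logger'

# Hypothetical helper built on the standard library Logger.
# age:  number of old log files to keep
# size: rotate once the current file grows past this many bytes
def build_rotating_logger(path, age: 5, size: 1 * 1024 * 1024, level: Logger::DEBUG)
  logger = Logger.new(path, age, size)
  logger.level = level
  logger
end
```

With these settings the current file stays at `myserver.log`, and the most recently rotated file is renamed to `myserver.log.0`.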
16. 5. Signal handling
> Queue based signal handling
• serialize signal processing
• signal handling is separated from the signal handler to avoid lock issues
SignalThread.new do |st|
  st.trap(:TERM) { server.stop(true) }
  st.trap(:QUIT) { server.stop(false) }
  st.trap(:USR1) { server.restart(true) }
  st.trap(:HUP) { server.restart(false) }
  st.trap(:USR2) { server.reload }
  # ...
end
17. 5. Signal handling - register 1
SignalThread handlers:
• INT { process_int }
Register signal: INT
18. 5. Signal handling - register 2
SignalThread handlers:
• USR1 { process_usr1 }
• INT { process_int }
Register signal: USR1
19. 5. Signal handling - register 3
SignalThread handlers:
• QUIT { process_quit }
• TERM { process_term }
• USR1 { process_usr1 }
• INT { process_int }
Register signal: XXX
20. 5. Signal handling - process 1
SignalThread handlers:
• QUIT { process_quit }
• TERM { process_term }
• USR1 { process_usr1 }
• INT { process_int }
Send signal: USR1
Queue: USR1
SignalThread monitors the queue
21. 5. Signal handling - process 2
SignalThread handlers:
• QUIT { process_quit }
• TERM { process_term }
• USR1 { process_usr1 }
• INT { process_int }
Send signal: QUIT
Queue: QUIT, USR1
22. 5. Signal handling - process 3
SignalThread handlers:
• QUIT { process_quit }
• TERM { process_term }
• USR1 { process_usr1 }
• INT { process_int }
Send signal: XXX
Queue: QUIT
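The register and process steps above can be sketched with a standard Queue and one processing thread. This is a simplified illustration, not ServerEngine's SignalThread: the class name `QueuedSignals` and its methods are hypothetical. The trap handler only enqueues the signal name, and the registered block runs later on the processing thread, one signal at a time.

```ruby
# Hypothetical sketch of queue based signal handling.
class QueuedSignals
  def initialize
    @handlers = {}
    @queue = Queue.new
    @thread = Thread.new { process_loop }
  end

  # Registers a block for a signal. The OS-level handler only enqueues.
  def trap(sig, &block)
    name = sig.to_s
    @handlers[name] = block
    Signal.trap(name) { enqueue(name) }
  end

  # Adds a signal name to the queue; the processing thread picks it up.
  def enqueue(sig)
    @queue << sig.to_s
  end

  # Drains the queue, then stops the processing thread.
  def stop
    @queue << nil
    @thread.join
  end

  private

  # Pops signal names in arrival order and runs the registered block.
  def process_loop
    while (sig = @queue.pop)
      handler = @handlers[sig]
      handler.call if handler
    end
  end
end
```

Because the blocks run on an ordinary thread rather than inside the trap handler, they are free to take locks or write logs.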
23. 6. Live Restart
> Minimize server restart downtime
• via INT signal
• enable_detach and supervisor parameters must be true
> Network servers can’t use live restart
• “Address already in use” occurs because the old server still holds the port
• use “process” workers and USR1 instead, which restarts workers, not the server
24. 6. Live Restart - flow 1
1. start a server
Supervisor -> Server -> Workers
25. 6. Live Restart - flow 2
2. receive SIGINT and wait for the server to shut down
Supervisor -> Server -> Workers
26. 6. Live Restart - flow 3
3. start a new server if the old server doesn’t exit within server_detach_wait
Supervisor -> old Server -> Worker
Supervisor -> new Server -> Workers
27. 7. “sigdump”
> Like JavaVM’s SIGQUIT thread dump, but for Ruby
• https://github.com/frsyuki/sigdump
> Dumps backtraces of running threads and the allocated object list
• for debugging, slow code, deadlocks, …
> ServerEngine traps SIGCONT for sigdump
• The trapped signal is configurable via the “SIGDUMP_SIGNAL” environment variable
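The thread-backtrace part of such a dump can be approximated with core Ruby. This sketch is not sigdump's implementation and does not cover the allocated object list. `thread_dump` and `install_dump_signal` are hypothetical helpers, and the default signal simply mirrors the SIGCONT mentioned above.

```ruby
# Hypothetical helper: builds a text dump of every live thread and its backtrace.
def thread_dump
  Thread.list.map do |t|
    frames = (t.backtrace || []).map { |line| "    #{line}" }
    ["Thread #{t.object_id} status=#{t.status}", *frames].join("\n")
  end.join("\n")
end

# Hypothetical helper: writes the dump to `io` whenever `signal` arrives.
def install_dump_signal(signal = 'CONT', io = $stderr)
  Signal.trap(signal) { io.puts(thread_dump) }
end
```

After calling `install_dump_signal`, sending the chosen signal to the process prints where each thread currently is, which helps when it looks stuck or slow.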
29. Use-case1: Sneakers
> A fast background processing framework for Ruby
• uses ServerEngine and RabbitMQ
• jondot.github.io/sneakers/
> Diagram: RabbitMQ -> Task -> Sneakers (Server -> Workers)
30. Use-case2: Fluentd v1
> Data collector for unified logging layer
• http://www.fluentd.org/
> Improve core features
• Logging
• Signal handling
> New features based on ServerEngine
• Multi-process support
• Zero downtime restart
• etc…
31. Fluentd v1 - Multi-process
Supervisor -> Worker, Worker, Worker

<worker>
  input tail
  output forward
</worker>
<worker>
  input forward
  output webhdfs
</worker>
<worker>
  input foo
  output bar
</worker>

Separate stream pipelines in one instance!
32. Fluentd v1 - Zero downtime restart
> SocketManager shares the resource
Supervisor (TCP)
1. Listen on a TCP socket
33. Fluentd v1 - Zero downtime restart
> SocketManager shares the resource
Supervisor (TCP) -> Worker (TCP), with heartbeat
1. Listen on a TCP socket
2. Pass the socket to the worker
34. Fluentd v1 - Zero downtime restart
> SocketManager shares the resource
Supervisor (TCP) -> restarted Worker (TCP), with heartbeat
1. Listen on a TCP socket
2. Pass the socket to the worker
3. Do the same when a worker restarts, keeping the TCP socket open
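The three steps can be illustrated with a listening socket that outlives its workers. This sketch is a simplification and makes no claim about how SocketManager passes sockets: here the worker simply inherits the listener through `fork`, and `spawn_worker` and `request` are hypothetical helpers.

```ruby
require 'socket'

# Hypothetical helper: forks a worker that inherits the listening socket
# and answers a single client with its own name.
def spawn_worker(listener, name)
  fork do
    client = listener.accept
    client.write("hello from #{name}")
    client.close
    exit!(0)
  end
end

# Hypothetical helper: connects to the port and reads the full reply.
def request(port)
  TCPSocket.open('127.0.0.1', port) { |s| s.read }
end
```

Because the parent never closes the listener, the port stays bound while a worker is replaced. Connections that arrive in between wait in the backlog instead of being refused.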