Zurg part 1

Published in: Technology, Business
  • * Written in scripting languages
  • * Must be done in the child process and passed back to the parent

    1. Zurg, part 1 of N
        Shuo Chen, chenshuo.com, 2012/04
    2. What is it?
        • An example of muduo protorpc
        • A toy C++ project that can be useful
        • https://github.com/chenshuo/muduo-protorpc
        • "Several levels of deployment, monitoring, and process management for distributed systems" (分布式系统部署、监控与进程管理的几重境界)
          http://www.cnblogs.com/Solstice/archive/2011/05/09/2041306.html
        • "Where multithreaded servers are applicable" (多线程服务器的适用场合)
          http://blog.csdn.net/Solstice/article/details/5334243
        • "An engineering approach to developing distributed systems" (分布式系统的工程化开发方法)
          http://blog.csdn.net/solstice/article/details/5950190 (slides)
          http://techparty.org/2010/10/19/2010q4summary/ (video)
    3. Overview
        • Master-slave structure
        • Communicates via bidirectional RPC
        • Command-line tool to change and view status
        • A web frontend in the future, if I have time to learn web development
        • Central configuration of service placement
        • Zurg slave is memory-less; it doesn’t store anything
        • That is different from supervisord
        • Also serves as a name server
        • Master looks like a SPOF, but this can be overcome
    4. Why not just run services as daemons?
        • It’s fine to do so on 5 hosts, but how about 50? 500?
        • Not easy to upgrade apps
        • Usually you need to ssh to every host and restart apps
        • Not transparent
        • Is every application running well?
        • You have to deploy a monitoring system anyway
        • And the notification of an app crash is not real-time
        • Auto-restarting daemons could hide the real problem and confuse the monitoring system
    5. Zurg slave – functionalities
        • Process management
        • Run a command (short-lived child process)
        • Start/stop a service (long-lived child process)
        • Not standard services, but programs written by yourself
        • Detect child death in real time and report it to master
        • Not by polling pids or process names
        • Collect performance metrics
        • Monitor system health
        • Both regular heartbeats and event notifications to master
    6. Zurg slave – design decisions
        • All-in-one single-threaded process
        • Don’t keep running iostat/vmstat/top/netstat/XXXstat
        • Replaces(?) nagios/monit/ganglia/munin/supervisord
        • No plugins, just compile what you need into one binary
        • C++ for efficiency and lower resource usage
        • It runs on every host, so every little bit helps
        • Often the monitoring tools* use too many resources
        • No local configuration, easy to deploy & upgrade
        • Just point it to the master
        • Start it from init.d, and it will take over everything else
    7. Zurg slave – NOT in scope
        • Configuration management
        • System administration
        • Use Puppet instead
        • Deployment of in-house software
        • Although it can be done with ‘wget’ followed by ‘tar xf’
    8. Run a command
        • Start a child process
        • Wait until it finishes (asynchronously, of course)
        • Capture stdout/stderr
        • No other open files in the parent should be leaked to the child; set FD_CLOEXEC on every fd
        • Sounds like reinventing the Python subprocess module?
        • Not exactly!
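The FD_CLOEXEC point can be illustrated with a minimal sketch. The helper names `setCloseOnExec` and `isCloseOnExec` are hypothetical and are not taken from the zurg source; only fcntl(2) with F_GETFD/F_SETFD is assumed.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: mark fd close-on-exec so that it is closed
// automatically when the child calls exec*(). Returns true on success.
bool setCloseOnExec(int fd)
{
  int flags = ::fcntl(fd, F_GETFD);
  if (flags < 0)
    return false;
  return ::fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == 0;
}

// Hypothetical helper: report whether FD_CLOEXEC is currently set on fd.
bool isCloseOnExec(int fd)
{
  int flags = ::fcntl(fd, F_GETFD);
  return flags >= 0 && (flags & FD_CLOEXEC) != 0;
}
```

On Linux the flag can also be requested when the descriptor is created, for example with pipe2(2) and O_CLOEXEC, which leaves no window between creating the fd and marking it.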
    9. The easy part of process management
        • Start a new process
        • fork(2)/exec*(2)
        • How to get errno if exec() fails? It’s in the child process
        • “The self-pipe trick” http://cr.yp.to/docs/selfpipe.html
        • Get a notification when a child terminates
        • SIGCHLD, with either signalfd(2) or a legacy signal handler
        • Signals are not reliable, so run wait(2) periodically (nb)
        • Get the exit status of a terminated child process
        • wait4(2) tells everything, incl. memory/CPU usage
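One possible way to get the child's errno back to the parent is a close-on-exec pipe: a successful exec closes the write end (the parent reads EOF), while a failed exec writes errno into it. The sketch below assumes Linux with pipe2(2); the function name `spawn` and its parameters are hypothetical, and the short blocking read stands in for what an event loop would do asynchronously.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork and exec argv[0] (searched in PATH).
// Returns the child's pid on success, or -1 with *execErrno set to the
// errno that pipe2()/fork()/execvp() failed with.
pid_t spawn(char* const argv[], int* execErrno)
{
  *execErrno = 0;
  int fds[2];
  if (::pipe2(fds, O_CLOEXEC) < 0)
  {
    *execErrno = errno;
    return -1;
  }

  pid_t pid = ::fork();
  if (pid < 0)
  {
    *execErrno = errno;
    ::close(fds[0]);
    ::close(fds[1]);
    return -1;
  }

  if (pid == 0)
  {
    // Child: if execvp() succeeds it never returns, and the kernel closes
    // fds[1] because of O_CLOEXEC. Otherwise report errno to the parent.
    ::close(fds[0]);
    ::execvp(argv[0], argv);
    int err = errno;
    ssize_t ignored = ::write(fds[1], &err, sizeof err);
    (void)ignored;
    ::_exit(127);
  }

  // Parent: EOF means exec succeeded; sizeof(int) bytes carry the errno.
  ::close(fds[1]);
  int err = 0;
  ssize_t n;
  do
  {
    n = ::read(fds[0], &err, sizeof err);
  } while (n < 0 && errno == EINTR);
  ::close(fds[0]);

  if (n == static_cast<ssize_t>(sizeof err))
  {
    ::waitpid(pid, nullptr, 0);  // reap the child that failed to exec
    *execErrno = err;
    return -1;
  }
  return pid;
}
```

The same pipe could carry other data the child must report before exec, since both ends disappear automatically once the new program image is loaded.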
    10. A simple challenge
        • Limit the runtime of a command, not its CPU time
        • Typical timeout of 60 seconds
        • Remember the pid when the command starts running
        • Set up a timer, kill(2) the process on timeout
        • How do you know that the process you are going to kill is the one that you created for the command?
        • Set a timer to kill pid 9527, 60 seconds later
        • What if process 9527 dies just before the timer event, and a new process is created with the same pid (?!)
    11. A pid is unique, but not always
        • Pids wrap (in minutes or seconds)
        • A pid is unique within a snapshot of all processes
        • But it is not unique as time moves on
        • The range of possible pid values is small (1~32767)
        • /proc/sys/kernel/pid_max default 32768
        • /proc/loadavg lastpid 3387
        • /proc/stat processes 423666
        • There is a tiny time window between the timer wakeup and kill(2); anything could happen in between
        • And there is no mutex or lock for this race condition
    12. How to kill a child properly?
        • So it is not safe to kill by pid; you may kill someone else’s child process by mistake
        • How about checking ppid first?
        • You may kill your own new child, if another RunCommand reuses the pid just before the timer fires
        • The pid + start_time combination is unique in space and time
        • Start time is in /proc/pid/stat, in jiffies since boot
        • Remember the start time after fork()ing a child*
        • Check the start time before killing the child
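A minimal sketch of the pid + start time check, assuming the Linux /proc/<pid>/stat layout in which starttime is field 22 and the comm field (2) is wrapped in parentheses. The names `parseStartTime`, `getStartTime`, and `killIfSame` are hypothetical; this is not the zurg implementation.

```cpp
#include <fstream>
#include <iterator>
#include <signal.h>
#include <sstream>
#include <string>
#include <sys/types.h>

// Extract starttime (field 22) from the contents of /proc/<pid>/stat.
// The comm field may itself contain spaces and ')', so parsing starts
// after the last ')'. Returns -1 on malformed input.
long long parseStartTime(const std::string& stat)
{
  std::string::size_type rparen = stat.rfind(')');
  if (rparen == std::string::npos)
    return -1;
  std::istringstream in(stat.substr(rparen + 1));
  std::string token;
  // Skip fields 3..21; the next token is field 22.
  for (int i = 0; i < 19; ++i)
  {
    if (!(in >> token))
      return -1;
  }
  long long startTime = -1;
  if (!(in >> startTime))
    return -1;
  return startTime;
}

// Read the start time of a live process, or -1 if it cannot be read.
long long getStartTime(pid_t pid)
{
  std::ifstream file("/proc/" + std::to_string(pid) + "/stat");
  if (!file)
    return -1;
  std::string content((std::istreambuf_iterator<char>(file)),
                      std::istreambuf_iterator<char>());
  return parseStartTime(content);
}

// Send sig only if pid still refers to the process whose start time was
// remembered earlier. Returns true if the signal was sent.
bool killIfSame(pid_t pid, long long expectedStartTime, int sig)
{
  if (getStartTime(pid) != expectedStartTime)
    return false;
  return ::kill(pid, sig) == 0;
}
```

The check and the kill(2) are still two separate steps; the sketch relies on the caller being the only process that reaps this child, so the pid cannot be recycled in between.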
    13. Why is it safe?
        • If two processes start at almost the same time, their pids must be different
        • If two processes happen to have the same pid, their start times must be different
        • It takes seconds to wrap pids, and start time is monotonic
        • Since zurg slave is single-threaded, there is no race condition between checking and killing
        • Don’t run zurg slave as root (it quits if euid == 0)
        • Don’t run two zurg slaves with the same uid on one box
    14. Capture stdout & stderr, simple?
        • Two pipes are needed: dup2() the write fds to 1 and 2 in the child, and read the other ends of the two pipes in the parent
        • Keep the data in memory and send it back when the command finishes
        • The command ‘cat /dev/zero’ would blow up zurg slave
        • We must limit the size of stdout and stderr
        • The default limit is 1024 KiB
        • Two approaches when the size exceeds the limit:
        • Stop reading, i.e. block the writer, and wait until timeout
        • Close the read side of the pipe, i.e. kill the child with SIGPIPE
        • Directly sending a SIGPIPE signal doesn’t work
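The size limit could look roughly like the sketch below. It uses a blocking read loop for brevity, whereas an event-driven slave would read on readiness notifications; `DrainResult` and `drain` are hypothetical names. On `kLimitReached` the caller would close the read end, so the child's next write fails with EPIPE or kills it with SIGPIPE.

```cpp
#include <algorithm>
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Outcome of draining one pipe.
enum class DrainResult { kEof, kLimitReached, kError };

// Read from fd until EOF, keeping at most maxBytes bytes in *out.
// Returns kLimitReached as soon as the child has produced more than
// maxBytes; the excess is discarded.
DrainResult drain(int fd, std::size_t maxBytes, std::string* out)
{
  char buf[4096];
  for (;;)
  {
    ssize_t n = ::read(fd, buf, sizeof buf);
    if (n == 0)
      return DrainResult::kEof;
    if (n < 0)
    {
      if (errno == EINTR)
        continue;
      return DrainResult::kError;
    }
    std::size_t got = static_cast<std::size_t>(n);
    std::size_t room = out->size() < maxBytes ? maxBytes - out->size() : 0;
    out->append(buf, std::min(got, room));
    if (got > room)
      return DrainResult::kLimitReached;
  }
}
```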
    15. Race condition at process exit
        • When a child exits, all its open fds are closed
        • The parent’s read(2) will return 0; it should then close the fd, otherwise POLLHUP will cause a busy loop
        • A child could also close them on purpose before dying
        • The “process exited” and “std{out,err} fds closed” events could arrive in any order
        • Is there any in-flight data that has not been received?
        • The lifetime management of Process/Pipe objects is also subtle, as fds are reused so aggressively
        • Read the code to find out how to do it correctly
    16. Run Command Request
        message RunCommandRequest {
          required string command = 1;
          optional string cwd = 2 [default = "/tmp"];
          repeated string args = 3;
          repeated string envs = 4;
          optional bool envs_only = 5 [default = false];
          optional int32 max_stdout = 6 [default = 1048576];
          optional int32 max_stderr = 7 [default = 1048576];
          optional int32 timeout = 8 [default = 60];
          optional int32 max_memory_mb = 9 [default = 32768];
        }
    17. Run Command Response
        message RunCommandResponse {
          required int32 error_code = 1;
          optional int32 pid = 2;
          optional int32 status = 3;
          optional bytes std_output = 4;
          optional bytes std_error = 5;
          optional int64 start_time_us = 16;
          optional int64 finish_time_us = 17;
          optional float user_time = 18;
          optional float system_time = 19;
          optional int64 memory_maxrss_kb = 20;
          // optional int64 ctxsw = 21;
          optional int32 exit_status = 30 [default = 0];
          optional int32 signaled = 31 [default = 0];
          optional bool coredump = 32 [default = false];
        }
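Several of the response fields map naturally onto what wait4(2) returns. The sketch below shows one plausible mapping using a plain struct instead of the generated protobuf class; `ExitInfo`, `decodeExit`, and `waitChild` are hypothetical names, and treating `signaled` as the terminating signal number is an assumption, since the slides do not define it.

```cpp
#include <cerrno>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

// Hypothetical plain struct mirroring a few RunCommandResponse fields.
struct ExitInfo
{
  int status = 0;           // raw wait status
  int exitStatus = 0;       // meaningful when the child exited normally
  int signaled = 0;         // assumption: terminating signal number, 0 if none
  bool coredump = false;
  double userTime = 0.0;    // seconds
  double systemTime = 0.0;  // seconds
  long maxRssKb = 0;        // ru_maxrss is reported in kilobytes on Linux
};

static double toSeconds(const struct timeval& tv)
{
  return static_cast<double>(tv.tv_sec) + tv.tv_usec / 1e6;
}

// Translate a wait status and rusage into the struct above.
ExitInfo decodeExit(int status, const struct rusage& ru)
{
  ExitInfo info;
  info.status = status;
  if (WIFEXITED(status))
  {
    info.exitStatus = WEXITSTATUS(status);
  }
  else if (WIFSIGNALED(status))
  {
    info.signaled = WTERMSIG(status);
#ifdef WCOREDUMP
    info.coredump = WCOREDUMP(status) != 0;
#endif
  }
  info.userTime = toSeconds(ru.ru_utime);
  info.systemTime = toSeconds(ru.ru_stime);
  info.maxRssKb = ru.ru_maxrss;
  return info;
}

// Blocking wait for one specific child; returns false if wait4() fails.
bool waitChild(pid_t pid, ExitInfo* info)
{
  int status = 0;
  struct rusage ru;
  pid_t ret;
  do
  {
    ret = ::wait4(pid, &status, 0, &ru);
  } while (ret < 0 && errno == EINTR);
  if (ret != pid)
    return false;
  *info = decodeExit(status, ru);
  return true;
}
```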
    18. Run Script
        • RunCommand with the script file content provided in the request
        • A programmatic way to run slightly different scripts on many hosts
    19. Application management
        • Start/monitor/stop applications
        • Applications, a.k.a. services: long-running processes
        • Apps can be written in C++/Java/Python/etc.
        • Shares most functionality with RunCommand
        • stdout/stderr redirected to files, not captured
        • No timeout
        • Intrusive vs. non-intrusive
        • Can zurg_slave manage any application?
        • Should the managed application follow some rules?
    20. How to detect an app exiting
        • Polling (pid and start time)
        • Not real-time, always with a poll interval
        • How do you know which process is the application?
        • SIGCHLD
        • Not 100% reliable, so call wait(2) periodically
        • Pipe: leave the write side in the child process and read in zurg_slave; when the app exits, read(2) returns 0
        • Reliable and prompt
        • The application must not close the fd* (intrusive!)
    21. What if zurg_slave crashes?
        • How to prevent starting duplicate services
        • SIGCHLD and pipe(2) are nonrenewable
        • Sockets? The app reconnects to the localhost zurg slave
        • i.e. a heartbeat between the app and zurg slave
        • Even more intrusive: retry logic in all languages
        • Other thoughts?
        • Another layer of indirection?
    22. To be continued
        • Collecting health & performance data
        • Periodic heartbeats to master
        • Process status, performance metrics
        • Zurg slave is 50% done as of the end of April 2012
    23. Zurg Master
        • A multithreaded program
        • All of its status is retrievable from outside
        • Easy to build Web/GUI frontends
        • Coding has not started yet
