3. C100K
• The C100K problem is the problem of optimising
network sockets to handle a large number of clients
at the same time. The name C100K is a numeronym
for concurrently handling one hundred thousand
connections.
4. C100K: Key points
Memory
• the less, the better
Scheduler
• the faster, the better
Complexity
• the simpler, the better
5. Solutions
• Serve one client with each server thread, use blocking I/O
• Serve many clients with each thread, use nonblocking I/O and
level-triggered readiness notification
• Serve many clients with each thread, use nonblocking I/O and
readiness change notification
• Serve many clients with each thread, use asynchronous I/O
• Build the server code into the kernel
• Others …
6. • Serve one client with each server thread, use blocking I/O
• Simple
• Memory
• ~1MB*100K = 100GB
• Scheduler
• the kernel has to maintain 100K kernel threads in an RB tree (using NPTL)
• a thread switch costs ~30 us (http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)
Solution
7. • Serve many clients with each thread, use nonblocking I/O
• Complexity
• using epoll: it cannot handle file I/O
• use a thread pool or an async API to handle file I/O
• Memory
• Low: numCPU * 1MB
• Scheduler
• Low
8. • Serve many clients with each thread, use asynchronous I/O
• Complexity
• Highest
• Limitation: only supports direct I/O
• Memory
• Low: numCPU * 1MB
• Scheduler
• Low
14. Go vs Nginx
From: https://groups.google.com/forum/#!topic/golang-nuts/YIlpcoUPY_M
15. Go
• Go is an open source programming language that
makes it easy to build simple, reliable, and efficient
software.
19. Goroutine
def: A goroutine is a lightweight
thread managed by the Go runtime
Feature: each goroutine has its own stack
Feature: goroutines are multiplexed onto system threads
(M:N, M goroutines on N threads, M >> N)
20. Go stack
• Tiny stack: ~2KB
• Stack grows as needed (doubles each time)
• when?
• how?
23. Go stack
• How
• split stack (v1.0 ~ v1.2),
also called segmented stack. In this approach, stacks are discontiguous and grow
incrementally. Each stack starts with a single segment. When the stack needs to
grow, another segment is allocated and linked to the previous one, and so forth.
Each stack is effectively a doubly linked list of one or more segments.
26. Go stack
• contiguous stack (v1.3 ~ ),
also called copying stack. When a stack needs to grow, instead of allocating a new
segment the runtime will:
1. create a new, somewhat larger stack
2. copy the contents of the old stack to the new stack
3. re-adjust every copied pointer to point to the new addresses
4. destroy the old stack
34. Syscall
• Enter syscall
• detach M from P (p.m = nil), but M still points to P (m.p = p)
• P's state changes to PSyscall (P still has some Gs)
35. Syscall
• Exit syscall
• if m.p is still waiting, put G back on that P and run it
• else find an idle P and run G on it
• else put G in the global queue and sleep
36. Syscall
• P state during syscall
• if the PSyscall state lasts over 10ms, sysmon will take over
• P's state becomes PIdle
• sysmon will get a new idle M to run the Gs on P
37. Channel r/w
• if G
• reads from an empty channel (a := <-c)
• writes to a full channel (c <- false)
• G will be scheduled out
• another runnable G will be picked and run
38. Preemption
• sysmon checks every P's last schedule time
• if now - lastsched > 10ms, force the G running on P to be rescheduled
• set G.preempt = true and
• set G.stackguard0 = StackPreempt (-1314)
• the next time G makes a function call, the stack overflow check fires, bingo!
• while handling the new-stack request, the runtime reschedules
39. Get runnable G
• from the global run queue, for fairness
• from P's local runnable G queue
• poll for Gs that were blocked on I/O events (network)
• steal from another P
43. Implementation (Linux)
• Uses epoll on Linux
• goroutines blocked on r/w are scheduled out
• goroutines are woken up when their socket has an I/O event
• sysmon also checks for I/O events on sockets
• ……
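From the programmer's side this looks like ordinary blocking code, one goroutine per connection, while the runtime parks and wakes goroutines underneath. A minimal echo server sketch using only the standard `net` and `io` packages; `serveEcho` and `echoOnce` are illustrative names, and the loopback address is chosen only so the example is self-contained.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// serveEcho accepts connections and echoes data back, one goroutine per
// connection. Reads and writes look blocking, but only the goroutine is
// parked while it waits for the socket.
func serveEcho(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			io.Copy(c, c)
		}(conn)
	}
}

// echoOnce starts a server on a free loopback port, sends msg, and
// returns what the server echoed back.
func echoOnce(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serveEcho(ln)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	reply, err := echoOnce("ping")
	fmt.Println(reply, err)
}
```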
49. wake up
• for each socket with an I/O event, the goroutine associated with it
is woken up
50. socket timeout
• Go's timeout is different from the OS timeout
• OS timeout: how long the app waits if no data is transferred
• Go timeout: the point in time at which this r/w fails
51. socket timeout
• Go's timeout is different from the OS timeout
• OS timeout: set once, takes effect on every call
• Go timeout: set each time before r/w
• Go timeout: more precise than the OS timeout!
• suppose 100KB of data is to be read, 10KB each time