Developing high-performance network servers in Lisp

Overview of current high-performance Common Lisp web servers and implementation techniques, and description of a new hybrid approach to asynchronous I/O based on separate racing accept() and epoll() thread pools.

Published in: Technology
Transcript of "Developing high-performance network servers in Lisp"

Developing high-performance network servers in Lisp
Vladimir Sedach <vsedach@gmail.com>

Overview
- Servers and protocols
- Strings, character encoding, and memory
- Concurrency and asynchrony
- Overview of Common Lisp HTTP servers employing these techniques

(Almost) no new ideas
- Basic considerations described by Jeff Darcy: http://pl.atyp.us/content/tech/servers.html

What's a server, anyway?
- Get input
- Process input
- Write output

How is that different from any other program?
- Many clients
- High-latency connections
- Need to serve each client quickly
- Need to serve many clients simultaneously

Protocols
- The "Process" part of Input-Process-Output is application-specific
- The server is focused on I/O
- I/O is protocol-specific

HTTP
- 1.0: one request-response per TCP connection
- 1.1: persistent connections, possibly multiple (pipelined) request-responses per connection
- Comet: server waits to respond until some event occurs (faking two-way communication)

Cache is King
- Memory access is expensive
- Memory copying is expensive

Character encoding
- How does your Lisp do it?
- SBCL: 32 bits per character
- read() returns a vector of bytes

UTF-8
- Use UTF-8 everywhere
- Do everything with byte vectors (DIY character encoding)
- No need to copy/re-encode the result of read()
- Encode output at compile time with (compiler) macros
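The compile-time encoding idea is not Lisp-specific. Here is a minimal Python sketch (all names are illustrative, and Python has no compiler macros, so module-load time stands in for compile time): the static output fragments are encoded to UTF-8 byte vectors once, up front, so per request only the dynamic part gets encoded.

```python
# Static fragments encoded to UTF-8 exactly once, at load time.
# The slides do this at compile time with Lisp compiler macros.
HEADER = "HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\n\r\n".encode("utf-8")
PREFIX = "<html><body><p>Héllo, ".encode("utf-8")
SUFFIX = "!</p></body></html>".encode("utf-8")

def render(name: str) -> bytes:
    # Only the dynamic part is encoded per request; the static
    # fragments are already byte vectors and are never re-encoded.
    return b"".join((HEADER, PREFIX, name.encode("utf-8"), SUFFIX))
```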
Vectored I/O
- Memory access and copying are slow, so concatenation is slow
- System calls are also slow
- Use writev() to output a list of byte vectors in one system call
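A minimal sketch of vectored output, using Python's `os.writev` wrapper over the syscall: the kernel is handed several byte vectors in one call, so nothing is concatenated (i.e. copied) in user space first.

```python
import os
import socket

# A connected socket pair stands in for a client connection.
a, b = socket.socketpair()

header = b"HTTP/1.1 200 OK\r\n\r\n"
body = b"hello"

# One system call, two separate buffers, no user-space concatenation.
sent = os.writev(a.fileno(), [header, body])
received = b.recv(4096)
```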
Input
- Vectored input is a pain, not worth it for HTTP
- Reuse input buffers!
- In some Common Lisp implementations (SBCL), consider declaring inline and ftype for FFI read() etc. input functions to cut down on consing (small overhead, but it adds up)
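The buffer-reuse point can be sketched in Python with `recv_into`: one buffer is allocated per connection and filled in place on every read, instead of letting each plain `recv()` allocate a fresh bytes object.

```python
import socket

a, b = socket.socketpair()
a.sendall(b"GET / HTTP/1.1\r\n\r\n")

buf = bytearray(4096)      # allocated once, reused for every read
n = b.recv_into(buf)       # fills buf in place, no new allocation
request = bytes(buf[:n])   # copy out only when the data must be kept
```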
Demultiplexing
- Threads
- Asynchronous I/O

Threads
- Thread Per Connection (TPC)
- A thread blocks and sleeps when waiting for input
- Threads have much more memory overhead than continuations
- Passing information between threads is expensive: synchronization and a context switch
- But we can (almost) pretend we're servicing one client at a time
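A minimal thread-per-connection echo server sketch, to show the "one client at a time" illusion: each handler thread simply blocks in `recv()`, so it reads like single-client code.

```python
import socket
import threading

def handle(conn):
    # Blocking reads: this code pretends it serves exactly one client.
    with conn:
        while (data := conn.recv(4096)):
            conn.sendall(data)

def serve(server_sock):
    # One new thread per accepted connection.
    while True:
        conn, _ = server_sock.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))   # ephemeral port for the example
srv.listen()
threading.Thread(target=serve, args=(srv,), daemon=True).start()

c = socket.create_connection(srv.getsockname())
c.sendall(b"ping")
reply = c.recv(4096)         # echoed back by the handler thread
```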
Asynchronous I/O
- I/O syscalls don't block
- When you cannot read from one client connection, try reading from another
- Basically coroutines
- The epoll/kqueue interfaces on Linux/BSD provide an efficient readiness-notification mechanism for many fds
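A small readiness-notification sketch using Python's `selectors` module, which picks the best mechanism for the platform (epoll on Linux, kqueue on BSD/macOS): non-blocking fds are registered once, and the kernel reports which are readable.

```python
import selectors
import socket

a, b = socket.socketpair()
b.setblocking(False)         # reads on b must never block

sel = selectors.DefaultSelector()   # epoll on Linux, kqueue on BSD
sel.register(b, selectors.EVENT_READ)

idle = sel.select(timeout=0)        # nothing readable yet -> []

a.sendall(b"hi")
events = sel.select(timeout=1)      # b reported readable once data arrives
ready = [key.fileobj for key, _ in events]
data = b.recv(4096)
sel.close()
```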
(Not So) Asynchronous I/O
- Most AIO HTTP servers aren't really AIO
- They just loop polling/reading fds until a complete HTTP request has been received, then hand it off to a monolithic process_request() function
- What happens when process_request() blocks on DB/file access?
- What happens with more complicated protocols, where input and processing are intertwined?

Handling AIO Events
- Without call/cc you have to resort to state machines
- For true AIO, state-handling code has to be involved with application code (even for HTTP, for things like DB access)
- Not feasible with state machines
- Trivial with first-class continuations
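The point can be illustrated with Python generators, a limited stand-in for first-class continuations: the parser below (a hypothetical minimal one) suspends itself wherever it runs out of input, so there is no explicit state machine — the "state" is simply where the generator is suspended.

```python
def http_request_parser():
    # Accumulate bytes until a full HTTP request head has arrived,
    # suspending (yield) whenever more input is needed.
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = yield None           # suspend until the event loop has bytes
        buf += chunk
    head, _, _rest = buf.partition(b"\r\n\r\n")
    yield head.split(b"\r\n")[0]     # deliver the request line

parser = http_request_parser()
next(parser)                          # prime the generator
partial = parser.send(b"GET / HT")    # incomplete: parser stays suspended
request_line = parser.send(b"TP/1.1\r\n\r\n")
```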
Thread pools
- Typical design: one thread demultiplexes (accept()) and puts work in a task queue for worker threads
- Context switch before any work can get done
- Synchronization costs on the task queue
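A sketch of that queue-based design: one producer (standing in for the accept() thread) enqueues work, workers dequeue it. Every task pays for queue synchronization plus a context switch before any processing happens.

```python
import queue
import threading

tasks = queue.Queue()     # the synchronized hand-off point
results = queue.Queue()

def worker():
    while True:
        item = tasks.get()
        if item is None:              # sentinel: shut this worker down
            break
        results.put(item * 2)         # stand-in for process_request()

workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()

for i in range(10):                   # the "accept() thread" producing work
    tasks.put(i)
for _ in workers:                     # one sentinel per worker
    tasks.put(None)
for w in workers:
    w.join()

total = sum(results.get() for _ in range(10))
```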
Leader-Follower (LF)
- http://www.kircher-schwanninger.de/michael/publications/lf.pdf
- Each pool thread runs:

    (let (fd)
      (loop (with-lock-held (server-lock)
              (setf fd (accept server-socket)))
            (process-connection fd)))

- On Linux and Windows, accept() is thread-safe, so there is no need to take server-lock
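The same loop, sketched in Python (names illustrative): every pool thread runs an identical accept-then-process loop, relying on accept() being thread-safe, so there is no task queue and no hand-off between threads.

```python
import socket
import threading

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen()

def lf_loop():
    # Leader-Follower: the thread blocked in accept() is the leader;
    # after accepting it processes the connection itself, while some
    # other thread in the pool becomes the new leader.
    while True:
        conn, _ = srv.accept()   # thread-safe on Linux: no explicit lock
        with conn:
            data = conn.recv(4096)
            conn.sendall(data.upper())

pool = [threading.Thread(target=lf_loop, daemon=True) for _ in range(4)]
for t in pool:
    t.start()

c = socket.create_connection(srv.getsockname())
c.sendall(b"ping")
reply = c.recv(4096)
```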
Which one is better?
- Paul Tyma (Mailinator, Google) says thread-per-connection handles more requests: http://paultyma.blogspot.com/2008/03/writing-java-multithreaded-servers.html
- For many simultaneous connections, AIO consumes much less memory while still making progress on input
- In SBCL, AIO seems to win

Antiweb
- http://hoytech.com/antiweb/
- Async (predates both nginx and lighttpd)
- Connections managed by a state machine
- Multi-process, single-threaded
- Doesn't use Lisp strings (UTF-8 in byte vectors)
- Vectored I/O (including support for HTTP pipelining: send multiple responses in one writev())

TPD2
- http://common-lisp.net/project/teepeedee2/
- Async
- Connections managed by continuations (code transformed with cl-cont)
- Single-process, single-threaded
- Doesn't use Lisp strings (UTF-8 in byte vectors)
- Vectored I/O

TPD2 goodies
- Comes with a mechanism for vectored I/O (sendbuf)
- Comes with macros for UTF-8 encoding at compile time (including HTML-generation macros)
- Really fast on John Fremlin's benchmark: 11k requests/sec vs. 8k requests/sec for nginx and embedded Perl (http://john.freml.in/teepeedee2-c10k)

Rethinking web servers
- Look at HTTP as a protocol
- We want to respond to the initial request quickly
- For persistent connections, there may or may not be another request coming, but the connections themselves pile up because of long keepalive times
- Comet and long polling assume a huge number of connections sitting idle, waiting for something to happen (e.g., Web 2.0 chat)

HTTP DOHC
- http://github.com/vsedach/HTTP-DOHC
- (Needs the repository version of IOLib with my patches, available on the IOLib mailing list)
- Hybrid TPC/AIO architecture (new)
- Single-process, multi-threaded
- Doesn't use Lisp strings (UTF-8 in byte vectors)
- Not finished yet
- Not as fast as TPD2, but faster than nginx

HTTP DOHC architecture
- An LF thread pool demultiplexes accept() and immediately handles the initial HTTP request
- If the HTTP connection is persistent, it is handed off to a second LF thread pool that demultiplexes epoll/kqueue to handle the (possible) second and subsequent requests
- For now the second thread pool still does TPC; it could do AIO without loss of generality (WOLOG)
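As a rough illustration only (HTTP DOHC itself is Common Lisp on IOLib; every name below is invented for this sketch, and each pool is shown as a single thread), the hand-off between the two pools might look like:

```python
import selectors
import socket
import threading

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen()

sel = selectors.DefaultSelector()
# Self-pipe trick: lets the accept pool interrupt a blocked select()
# so the event pool notices freshly handed-off connections.
wakeup_r, wakeup_w = socket.socketpair()
sel.register(wakeup_r, selectors.EVENT_READ)

def handle_request(conn):
    # Stand-in for real HTTP request processing.
    data = conn.recv(4096)
    if data:
        conn.sendall(b"echo:" + data)
    return bool(data)

def accept_loop():
    # First pool: accept and serve the initial request immediately,
    # then hand the (persistent) connection to the second pool.
    while True:
        conn, _ = srv.accept()
        handle_request(conn)
        sel.register(conn, selectors.EVENT_READ)
        wakeup_w.sendall(b"x")

def event_loop():
    # Second pool: wait for readiness events on handed-off connections
    # and serve their subsequent requests.
    while True:
        for key, _ in sel.select():
            if key.fileobj is wakeup_r:
                wakeup_r.recv(1)
            elif not handle_request(key.fileobj):
                sel.unregister(key.fileobj)
                key.fileobj.close()

threading.Thread(target=accept_loop, daemon=True).start()
threading.Thread(target=event_loop, daemon=True).start()

c = socket.create_connection(srv.getsockname())
c.sendall(b"one")
first = c.recv(4096)    # served by the accept pool
c.sendall(b"two")
second = c.recv(4096)   # served by the event-loop pool
```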
Why separate LF pools?
- Combining multiple event sources into one results in an inverse leader/task-queue/worker pattern
- Planning to look at SEDA for ideas on moving threads between pools to handle load

HTTP DOHC TODO
- Plan to make the interface as compatible as is reasonable with Hunchentoot 1.0
- A planned third LF thread pool would demultiplex synthetic server-side events to handle Comet requests
- File handling
- SSL
- Gzip encoding

Conclusions
- Good server design depends on the protocol