Developing high-performance network servers in Lisp

Overview of current high-performance Common Lisp web servers and implementation techniques, and description of a new hybrid approach to asynchronous I/O based on separate racing accept() and epoll() thread pools.

Transcript

  • 1. Developing high-performance network servers in Lisp Vladimir Sedach <vsedach@gmail.com>
  • 2. Overview
    • Servers and protocols
    • 3. Strings, character encoding, and memory
    • 4. Concurrency and asynchrony
    • 5. Overview of Common Lisp HTTP servers employing these techniques
  • 6. (Almost) no new ideas
    • Basic considerations described by Jeff Darcy:
    • 7. http://pl.atyp.us/content/tech/servers.html
  • 8. What's a server, anyway?
  • 11. How is that different from any other program?
    • Many clients
    • 12. High-latency connections
    • 13. Need to serve each client quickly
    • 14. Need to serve many clients simultaneously
  • 15. Protocols
    • "Process" part of Input-Process-Output is application-specific
    • 16. Server is focused on I/O
    • 17. I/O is protocol-specific
  • 18. HTTP
    • 1.0: One request-response per TCP connection
    • 19. 1.1: Persistent connection, possibly multiple (pipelined) request-responses per connection
    • 20. Comet: Server waits to respond until some event occurs (faking two-way communication)
  • 21. Cache is King
    • Memory access is expensive
    • 22. Memory copying is expensive
  • 23. Character encoding
    • How does your Lisp do it?
    • 24. SBCL: 32 bits per character
    • 25. read() returns vector of bytes
  • 26. UTF-8
    • Use UTF-8 everywhere
    • 27. Do everything with byte vectors (DIY character encoding)
    • 28. No need to copy/re-encode result of read()
    • 29. Do encoding of output at compile-time with (compiler) macros
  • 30. Vectored I/O
    • Memory access and copying are slow, so concatenation is slow
    • 31. System calls also slow
    • 32. Use writev() to output a list of byte vectors in one system call
  • 33. Input
    • Vectored input a pain, not worth it for HTTP
    • 34. Reuse input buffers!
    • 35. In some Common Lisp implementations (SBCL), consider declaring inline and ftype for FFI read() and other input functions to cut down on consing (a small overhead, but it adds up)
  • 36. Demultiplexing
    • Threads
    • 37. Asynchronous IO
  • 38. Threads
    • Thread Per Connection (TPC)
    • 39. Thread blocks and sleeps when waiting for input
    • 40. Threads have a lot more memory overhead than continuations
    • 41. Passing information between threads expensive – synchronization and context switch
    • 42. But can (almost) pretend we're servicing one client at a time
  • 43. Asynchronous I/O
    • I/O syscalls don't block
    • 44. When cannot read from one client connection, try reading from another
    • 45. Basically coroutines
    • 46. The epoll/kqueue interfaces on Linux/BSD provide an efficient readiness-notification mechanism for many fds
  • 47. (Not So) Asynchronous I/O
    • Most AIO HTTP servers not really AIO
    • 48. Just loop polling/reading fds until a complete HTTP request has been received, then hand it off to a monolithic process_request() function
    • 49. What happens when process_request() blocks on DB/file access?
    • 50. What happens with more complicated protocols where input and processing are intertwined?
  • 51. Handling AIO Events
    • Have to resort to state machines if you don't have call/cc
    • 52. For true AIO, need to involve state handling code with application code (even for HTTP for things like DB access)
    • 53. Not feasible with state machines
    • 54. Trivial with first-class continuations
  • 55. Thread pools
    • Typical design – one thread demultiplexes (accept()), puts work in a task queue for worker threads
    • 56. Context switch before any work can get done
    • 57. Synchronization costs on task queue
  • 58. Leader-Follower (LF)
    • http://www.kircher-schwanninger.de/michael/publications/lf.pdf
    • 59. (let (fd)
            (loop (with-lock-held (server-lock)
                    (setf fd (accept server-socket)))
                  (process-connection fd)))
    • 63. On Linux and Windows, accept() is thread-safe – no need to take server-lock
  • 64. Which one is better?
    • Paul Tyma (Mailinator, Google) says thread-per-connection handles more requests
    • 65. http://paultyma.blogspot.com/2008/03/writing-java-multithreaded-servers.html
    • 66. For many simultaneous connections, AIO consumes much less memory while making progress on input
    • 67. In SBCL, AIO seems to win
  • 68. Antiweb
    • http://hoytech.com/antiweb/
    • 69. Async (predates both nginx and lighttpd)
    • 70. Connections managed by state machine
    • 71. Multi-process, single-threaded
    • 72. Doesn't use Lisp strings (UTF-8 in byte vectors)
    • 73. Vectored I/O (incl support for HTTP pipelining – send multiple responses in one writev())
  • 74. TPD2
    • http://common-lisp.net/project/teepeedee2/
    • 75. Async
    • 76. Connections managed by continuations (code transformed with cl-cont)
    • 77. Single-process, single-threaded
    • 78. Doesn't use Lisp strings (UTF-8 in byte vectors)
    • 79. Vectored I/O
  • 80. TPD2 goodies
    • Comes with mechanism for vectored I/O (sendbuf)
    • 81. Comes with macros for UTF-8 encoding at compile-time (incl. HTML generation macros)
    • 82. Really fast on John Fremlin's benchmark – 11k requests/sec vs 8k r/s for nginx and embedded Perl (http://john.freml.in/teepeedee2-c10k)
  • 83. Rethinking web servers
    • Look at HTTP as a protocol
    • 84. Want to respond to initial request quickly
    • 85. For persistent connections, there may or may not be another request coming, but the connections themselves pile up because of long keepalive times
    • 86. Comet and long polling – assume huge number of connections sitting idle waiting for something to happen (ex - Web2.0 chat)
  • 87. HTTP DOHC
    • http://github.com/vsedach/HTTP-DOHC
    • 88. (Needs repository version of IOLib with my patches, available on IOLib mailing list)
    • 89. Hybrid TPC/AIO architecture (new)
    • 90. Single process, multi-threaded
    • 91. Doesn't use Lisp strings (UTF-8 in byte vectors)
    • 92. Not finished yet
    • 93. Not as fast as TPD2, faster than nginx
  • 94. HTTP DOHC architecture
    • LF thread pool demultiplexing accept() and immediately handling the initial HTTP request
    • 95. If HTTP connection is persistent, hand it off to second LF thread pool demultiplexing epoll/kqueue to handle (possible) second and subsequent requests
    • 96. For now the second thread pool still does TPC. Can make it do AIO WOLOG
  • 97. Why separate LF pools?
    • Combining multiple event sources into one results in an inverse leader/task-queue/worker pattern
    • 98. Planning to look at SEDA for ideas on moving threads around between pools to handle load
  • 99. HTTP DOHC TODO
    • Plan to make interface as compatible as reasonable with Hunchentoot 1.0
    • 100. Planned third LF thread pool to demultiplex synthetic server-side events to handle Comet requests
    • 101. File handling
    • 102. SSL
    • 103. Gzip encoding
  • 104. Conclusions
    • Good server design depends on protocol