Writing a fast HTTP parser


At Lisp Meetup #22

Published in: Technology

  1. Lisp Meetup #22 • Eitaro Fukamachi • Writing a fast HTTP parser
  2. Thank you for coming.
  3. I’m Eitaro Fukamachi @nitro_idiot fukamachi
  4. (and 'web-application-developer 'common-lisper)
  5. We’re hiring! Tell @Rudolph_Miller.
  6. fast-http • HTTP request/response parser • Written in portable Common Lisp • Fast • Chunked body parser
  7. fast-http Benchmarked with SBCL 1.2.5 / GCC v6.0.0
  8. Let me tell you why I had to write a fast HTTP parser.
  9. Wookie is slower than Node.js • Wookie is 2 times slower than Node.js • Profiling showed that “WOOKIE:READ-DATA” was pretty slow. • It was only calling “http-parse”. • “http-parse” is the HTTP parser Wookie uses.
  10. The bottleneck was HTTP parsing.
  11. Wookie is slower than Node.js • Node.js’s HTTP parser is “http-parser”. • Written in C. • A generalized version of Nginx’s HTTP parser. • Is it possible to beat it with Common Lisp?
  12. Today, I’m talking about what I did to write a fast Common Lisp program.
  13. 5 important things • Architecture • Reducing memory allocation • Choosing the right data types • Benchmark & Profile • Type declarations
  14. 5 important things • Architecture • Reducing memory allocation • Choosing the right data types • Benchmark & Profile • Type declarations
  15. A brief introduction to HTTP
  16. An HTTP request looks like… GET /media HTTP/1.1↵ Host: somewrite.jp↵ Connection: keep-alive↵ Accept: */*↵ ↵
  17. An HTTP request looks like… GET /media HTTP/1.1↵ Host: somewrite.jp↵ Connection: keep-alive↵ Accept: */*↵ ↵ • First line: GET /media HTTP/1.1 • Headers: Host, Connection, Accept • Body: empty, in this case
  18. An HTTP request looks like… GET /media HTTP/1.1↵ Host: somewrite.jp↵ Connection: keep-alive↵ Accept: */*↵ ↵ • Each ↵ is CR + LF • CRLF × 2 marks the end of the headers
  19. An HTTP response looks like… HTTP/1.1 200 OK↵ Cache-Control: max-age=0↵ Content-Type: text/html↵ Date: Wed, 26 Nov 2014 04:52:55 GMT↵ ↵ <html> …
  20. An HTTP response looks like… HTTP/1.1 200 OK↵ Cache-Control: max-age=0↵ Content-Type: text/html↵ Date: Wed, 26 Nov 2014 04:52:55 GMT↵ ↵ <html> … • Status line: HTTP/1.1 200 OK • Headers: Cache-Control, Content-Type, Date • Body: <html> …
  21. HTTP is… • A text-based protocol (not binary) • Lines are terminated with CRLF • Very lenient • Ignores multiple spaces • Allows header values that continue onto the next line
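The framing rule from the slides above (every line ends in CRLF, and two CRLFs in a row end the headers) can be sketched in Common Lisp as follows. This is only an illustration: `find-header-end` is a hypothetical helper invented here, not a function from fast-http.

```lisp
;; Hypothetical helper for illustration; not part of fast-http's API.
(defun find-header-end (octets &key (start 0) (end (length octets)))
  "Return the index just past the first CR LF CR LF in OCTETS, or NIL if absent."
  (loop for i from start to (- end 4)
        when (and (= (aref octets i) 13)         ; CR
                  (= (aref octets (+ i 1)) 10)   ; LF
                  (= (aref octets (+ i 2)) 13)   ; CR
                  (= (aref octets (+ i 3)) 10))  ; LF
          return (+ i 4)))

;; Build the request from the slides as a vector of octets.
(defparameter *example-crlf*
  (coerce (list (code-char 13) (code-char 10)) 'string))

(defparameter *example-request*
  (map '(vector (unsigned-byte 8)) #'char-code
       (concatenate 'string
                    "GET /media HTTP/1.1" *example-crlf*
                    "Host: somewrite.jp" *example-crlf*
                    *example-crlf*)))
```

A NIL result means the blank line has not arrived yet, which is exactly the incomplete-message case the next slides deal with.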
  22. And, there’s another difficulty.
  23. HTTP messages are sent over a network.
  24. Which means, we need to think about long & incomplete HTTP messages.
  25. There are 2 ways to solve this problem.
  26. Approach 1: Stateful (http-parser)
  27. http-parser (used in Node.js) • https://github.com/joyent/http-parser • Written in C • Ported from Nginx’s HTTP parser • Written as Node.js’s HTTP parser • Stateful
  28. http-parser (used in Node.js) for (p=data; p != data + len; p++) { … switch (parser->state) { case s_dead: … case s_start_req_or_res: … case s_res_or_resp_H: … } }
  29. http-parser (used in Node.js) for (p=data; p != data + len; p++) { … switch (parser->state) { case s_dead: … case s_start_req_or_res: … case s_res_or_resp_H: … } } • The for loop processes the input char by char • The switch does something for each state
  30. Approach 2: Stateless (PicoHTTPParser)
  31. PicoHTTPParser (used in H2O) • https://github.com/h2o/picohttpparser • Written in C • Stateless • Reparses when the data is incomplete • Most HTTP requests are small
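The stateless idea (keep no parser state, and reparse from the beginning whenever more data arrives) might look roughly like this in Common Lisp. It is a sketch of the strategy only, not of how PicoHTTPParser is implemented; `parse-head-stateless` and `feed-chunks` are names made up for this example, and the code works on strings and copies freely for clarity.

```lisp
(defparameter *crlf*
  (coerce (list (code-char 13) (code-char 10)) 'string))

;; Hypothetical function: always starts from index 0 and remembers nothing.
(defun parse-head-stateless (buffer)
  "Return (values :incomplete nil) until the blank line that ends the headers
has arrived, then (values :complete first-line)."
  (let ((head-end (search (concatenate 'string *crlf* *crlf*) buffer)))
    (if (null head-end)
        (values :incomplete nil)
        (values :complete (subseq buffer 0 (search *crlf* buffer))))))

;; Hypothetical driver: simulates data arriving from the network in pieces.
(defun feed-chunks (chunks)
  "Append each chunk to a growing buffer and reparse from scratch each time."
  (let ((buffer ""))
    (dolist (chunk chunks (values :incomplete nil))
      (setf buffer (concatenate 'string buffer chunk))
      (multiple-value-bind (status first-line) (parse-head-stateless buffer)
        (when (eq status :complete)
          (return (values status first-line)))))))
```

The repeated work is wasted only when a message is split across reads, which is why the slide's point that most requests are small matters.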
  32. And fast-http is…
  33. fast-http is in the middle • Does not track state for every character • Sets state for every line • This makes the program simple • And easy to optimize
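A minimal sketch of the "state per line" idea, assuming the caller has already split the input into complete lines. The `line-parser` structure and `feed-line` function are hypothetical and do not reflect fast-http's real internals; they only show that the state changes once per line rather than once per character.

```lisp
;; Hypothetical parser object: the state advances only at line boundaries.
(defstruct line-parser
  (state :first-line :type keyword)
  (first-line nil)
  (headers '() :type list))

(defun feed-line (parser line)
  "Consume one complete LINE (without its CRLF) and return the new state."
  (ecase (line-parser-state parser)
    (:first-line
     (setf (line-parser-first-line parser) line
           (line-parser-state parser) :headers))
    (:headers
     (if (zerop (length line))
         ;; A blank line ends the headers.
         (setf (line-parser-state parser) :body)
         (let ((colon (or (position #\: line)
                          (error "Malformed header line: ~S" line))))
           (push (cons (subseq line 0 colon)
                       ;; HTTP is lenient about spaces after the colon.
                       (string-left-trim " " (subseq line (1+ colon))))
                 (line-parser-headers parser)))))
    (:body
     nil))
  (line-parser-state parser))
```

If a read ends in the middle of a line, only that partial line has to be kept and retried, not the whole message.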
  34. 5 important things • Architecture • Reducing memory allocation • Choosing the right data types • Benchmark & Profile • Type declarations
  35. Memory allocation is slow • (in general) • Make sure not to allocate memory during processing • cons, make-instance, make-array… • subseq, append, copy-seq
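One common way to avoid `subseq` while parsing is to pass start and end indices into the original buffer instead of copying a piece out of it. The function below is a hypothetical example of that pattern, not code from fast-http.

```lisp
;; Hypothetical helper: compares a region of the buffer in place.
(defun octets-match-p (octets start end token)
  "True when OCTETS[START, END) spells the ASCII string TOKEN. Allocates nothing."
  (and (= (- end start) (length token))
       (loop for i from start below end
             for j from 0
             always (= (aref octets i) (char-code (char token j))))))

;; The allocating alternative would copy the region first, for example:
;;   (equalp (subseq octets 0 3) expected-octets)
;; which creates a fresh vector on every call.
```

For the request line `GET /media HTTP/1.1`, `(octets-match-p buffer 0 3 "GET")` checks the method without creating a new vector.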
  36. 5 important things • Architecture • Reducing memory allocation • Choosing the right data types • Benchmark & Profile • Type declarations
  37. Data types • The wrong data type makes your program slow. • List or Vector • Hash Table or Structure or Class
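As an illustration of the "Hash Table or Structure" choice, here is the same small record written both ways. The slot names are invented for this sketch. A structure with typed slots gives the compiler fixed offsets and known types, while a hash table has to hash the key on every access.

```lisp
;; Hypothetical record describing a parsed request line.
(defstruct request-info
  (http-method nil :type symbol)
  (major-version 1 :type fixnum)
  (minor-version 1 :type fixnum))

;; The same data in a hash table: flexible, but every read is a lookup
;; and the compiler knows nothing about the value types.
(defun make-request-table (http-method major minor)
  (let ((table (make-hash-table :test 'eq)))
    (setf (gethash :http-method table) http-method
          (gethash :major-version table) major
          (gethash :minor-version table) minor)
    table))
```

Which one is faster in a given program should still be measured rather than assumed.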
  38. 5 important things • Architecture • Reducing memory allocation • Choosing the right data types • Benchmark & Profile • Type declarations
  39. Benchmark is quite important • “Don’t guess, measure!” • Check if your changes improve the performance. • Benchmarking also keeps you motivated.
  40. Profiling • SBCL has a built-in profiler • (sb-profile:profile "FAST-HTTP" …) • (sb-profile:report)
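For the "measure, don't guess" advice, a very small portable timing helper can be enough to compare a change before and after. `run-benchmark` is a hypothetical name, and this sketch uses only standard Common Lisp rather than the SBCL profiler mentioned on the slide.

```lisp
;; Hypothetical helper built on standard CL timing functions.
(defun run-benchmark (function &key (iterations 100000))
  "Call FUNCTION ITERATIONS times and return the elapsed wall-clock seconds."
  (let ((start (get-internal-real-time)))
    (dotimes (i iterations)
      (funcall function))
    (float (/ (- (get-internal-real-time) start)
              internal-time-units-per-second))))
```

Calling it as `(run-benchmark (lambda () (parse-something input)))`, where `parse-something` stands for whatever function is being tuned, before and after a change gives two numbers to compare. The resolution depends on the implementation's `internal-time-units-per-second`, so short runs may need more iterations.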
  41. 5 important things • Architecture • Reducing memory allocation • Choosing the right data types • Benchmark & Profile • Type declarations
  42. Type declaration • Common Lisp has (optional) type declarations • (declare (type <type> <variable symbol>)) • It’s a hint for your Lisp compiler • (declare (optimize (speed 3) (safety 0))) • It’s a request to your Lisp compiler • See also: “Cより高速なCommon Lispコードを書く” (“Writing Common Lisp code faster than C”)
  43. (safety 0) • (safety 0) means “don’t check types & array indexes at run time”. • Fast & unsafe (like C) • Is fixnum enough? • What do you do when someone passes a bignum to the function?
  44. (safety 0) • fast-http has 2 layers • Low-level API • (speed 3) (safety 0) • High-level API (safer) • Checks the variable types • (speed 3) (safety 2)
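The two-layer arrangement can be sketched as an unchecked inner function wrapped by a checked outer one. The example below parses a run of ASCII digits. Both function names are hypothetical, and the 8-digit limit is an assumption chosen here so the result fits in a fixnum even on implementations with small fixnums; none of this is taken from fast-http's source.

```lisp
;; Low-level layer: trusts its arguments completely. With (safety 0) a wrong
;; type, a bad index, or a fixnum overflow is NOT detected.
(defun %parse-decimal (octets start end)
  (declare (optimize (speed 3) (safety 0))
           (type (simple-array (unsigned-byte 8) (*)) octets)
           (type fixnum start end))
  (let ((n 0))
    (declare (type fixnum n))
    (loop for i of-type fixnum from start below end
          do (setf n (+ (* n 10) (- (aref octets i) 48))))
    n))

;; High-level layer: validates everything the low-level layer assumes.
(defun parse-decimal (octets &key (start 0) (end (length octets)))
  (declare (optimize (speed 3) (safety 2)))
  (check-type octets (simple-array (unsigned-byte 8) (*)))
  (check-type start fixnum)
  (check-type end fixnum)
  (unless (<= 0 start end (length octets))
    (error "Invalid range [~D, ~D)" start end))
  ;; Assumed limit: 8 digits always fit in a fixnum, so no overflow below.
  (unless (<= 1 (- end start) 8)
    (error "Expected 1 to 8 digits, got ~D" (- end start)))
  (unless (loop for i from start below end
                always (<= 48 (aref octets i) 57))
    (error "Non-digit octet in input"))
  (%parse-decimal octets start end))
```

The digit-count check is one possible answer to the slide's "Is fixnum enough?" question: the safe layer refuses any input that could overflow before the unsafe layer ever sees it.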
  45. Attitude
  46. Attitude • Write carefully. • It’s possible to beat a C program • (if the program is complicated enough) • Don’t give up easily • Safety is more important than speed
  47. Thanks.
  48. EITARO FUKAMACHI 8arrow.org @nitro_idiot fukamachi
