Rhebok, High Performance Rack Handler / Rubykaigi 2015

Published in: Technology

  1. 1. Rhebok, High performance Rack Handler Masahiro Nagano @kazeburo RubyKaigi 2015
  2. 2. Me •Masahiro Nagano •@kazeburo •Principal Site Reliability Engineer at Mercari, Inc.
  3. 3. Mercari •Downloads: 27M (JP+US) •GMV: several billion per month •Items: several hundred thousand or more new items per day •Backend languages: PHP, Go, Lua, etc.
  4. 4. Agenda •Rhebok Overview and Benchmark •How to create a High Performance Rack Handler & Rhebok internals
  5. 5. Rhebok Overview and Benchmark
  6. 6. Rhebok • Rack Handler/Web Server • 1.5x-2x performance when compared to Unicorn • Prefork Architecture same as Unicorn • Rhebok is suitable for running HTTP application servers behind a reverse proxy like nginx • Ruby port of Perl’s Gazelle
  7. 7. What’s Gazelle? • High Performance Plack Handler • Plack is Perl’s Rack • 2x-3x faster than commonly used servers such as Starman and Starlet • Production Ready • Installed on a dozen servers, where it has been shown to reduce CPU usage by 1-3%
  8. 8. https://www.flickr.com/photos/rohit_saxena/9819136626/ https://www.flickr.com/photos/wildlifewanderer/8176461065/
  9. 9. Who should use Rhebok? •Highly optimized, high-traffic websites • Gaming, ad tech, recipe sites, media, or massive-scale SNS • With Rhebok, response speed can be improved even further •Can be applied to any website
  10. 10. [Chart: % in response time taken by SQL, Cache, WAF, Rack Handler, and Ruby, for a general website vs. an optimized website]
  11. 11. Who should not use Rhebok? •Those who want to use WebSocket or streaming •Those who cannot set up a reverse proxy in front of Rhebok
  12. 12. Rhebok Spec •HTTP/1.1 Web Server •Supports full HTTP/1.1 features except for KeepAlive •Supports TCP and Unix Domain Socket •Hot Deployment using start_server •OobGC
  13. 13. Usage $ rackup -s Rhebok --port 8080 -E production -O MaxWorkers=20 -O MaxRequestPerChild=1000 -O OobGC=yes config.ru
  14. 14. Recommended configuration • Client -> HTTP/2 (TCP) -> Reverse Proxy (Nginx, h2o) -> HTTP/1.1 (Unix Domain Socket) -> Rhebok
      http {
        upstream app {
          server unix:/path/to/app.sock;
        }
        server {
          listen 443 ssl http2;
          location / {
            proxy_pass http://app;
          }
          location ~ ^/assets/ {
            root /path/to/webapp/assets;
          }
        }
      }
  15. 15. Hot Deploy
  16. 16. start_server $ start_server --port 8080 -- rackup -s Rhebok -E production -O MaxWorkers=20 -O MaxRequestPerChild=1000 -O OobGC=yes config.ru • perl: https://metacpan.org/release/Server-Starter • golang: https://github.com/lestrrat/go-server-starter
  17. 17. How start_server works • start_server --port 8080 -- rackup holds the listening socket and forks Rhebok, which forks its workers • Rhebok picks up the socket with Socket.for_fd( ENV["SERVER_STARTER_PORT"] )
  18. 18. How start_server works • On SIGHUP, start_server forks a new Rhebok (with new workers), which picks up the same socket with Socket.for_fd( ENV["SERVER_STARTER_PORT"] )
  19. 19. How start_server works • After the SIGHUP, start_server sends SIGTERM to the old Rhebok and its workers; the new Rhebok and its workers keep serving on the socket
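The slides write `Socket.for_fd( ENV["SERVER_STARTER_PORT"] )` as shorthand. As Server::Starter documents it, the variable holds `port=fd` pairs separated by semicolons, so the descriptor number has to be extracted first. A minimal sketch of that step, assuming that format; `parse_server_starter_port` and `inherited_listeners` are hypothetical helper names, not Rhebok's API:

```ruby
require 'socket'

# Hypothetical helper (not part of Rhebok): parse a SERVER_STARTER_PORT
# value such as "8080=3;/tmp/app.sock=4" into { "8080" => 3, ... }.
def parse_server_starter_port(value)
  value.to_s.split(';').each_with_object({}) do |pair, fds|
    name, _, fd = pair.rpartition('=')
    # Skip entries without a name or without a numeric descriptor.
    next if name.empty? || fd !~ /\A\d+\z/
    fds[name] = fd.to_i
  end
end

# Hypothetical helper: wrap every inherited descriptor in a server object.
# Names containing "/" are treated as Unix domain socket paths.
def inherited_listeners(env = ENV)
  parse_server_starter_port(env['SERVER_STARTER_PORT']).map do |name, fd|
    name.include?('/') ? UNIXServer.for_fd(fd) : TCPServer.for_fd(fd)
  end
end
```

Because the listening socket is inherited rather than re-created, the old and new server generations can overlap during a restart without refusing connections.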
  20. 20. Benchmark
  21. 21. Benchmark environment • Amazon EC2 c3.8xlarge • 32 vcpu • Amazon Linux • Ruby 2.2.3 • Unicorn 5.0.0 / rhebok 0.9.0 • patched wrk that supports unix domain socket • https://github.com/kazeburo/wrk/tree/unixdomain2
  22. 22. Benchmark [Chart: req/sec for HelloWorld, sinatra, and rails, Rhebok vs. unicorn; plotted values 5577, 30788, 248094 (one series) and 6151, 34557, 398898 (other series)]
  23. 23. ISUCON benchmark • ISUCON • web application tuning contest • Contestants compete on the scores of a benchmark created by the organizers • The web application used as each ISUCON theme is close to a real-world service
  24. 24. ISUCON 4 Qualifier [Chart: SCORE, unicorn vs. Rhebok; plotted values 43560 and 41175]
  25. 25. How to create a high performance Rack Handler and Rhebok internals
  26. 26. Basics of Rack and Rack Handler
  27. 27. Rack •Rack is a specification • the interface between web servers that support Ruby and Ruby web frameworks •Rack is also an implementation • e.g. Rack::Request, Rack::Response, and middlewares
  28. 28. [Diagram: Web Server (unicorn, thin, puma) | Rack (web server interface) | Framework (Rails, sinatra, Padrino)]
  29. 29. Rack Application app = Proc.new do |env| [ '200', {'Content-Type' => 'text/html'}, ['Hello'] ] end
  30. 30. Rack env hash •A Hash object that contains the request data •CGI keys • REQUEST_METHOD, SCRIPT_NAME, PATH_INFO, QUERY_STRING, HTTP_* variables •Rack specific keys • rack.version, rack.url_scheme, rack.input, rack.errors, rack.multithread, rack.multiprocess, rack.run_once, rack.hijack?
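To make the HTTP_* keys concrete, here is a small sketch of how a handler might map request header names to env keys. The helper names are hypothetical; the one rule taken from the Rack specification is that Content-Type and Content-Length are stored without the HTTP_ prefix, as in CGI.

```ruby
# Hypothetical helper: map an HTTP header name to its Rack env key.
def header_to_env_key(name)
  key = name.upcase.tr('-', '_')
  # Rack (like CGI) stores these two headers without the HTTP_ prefix.
  %w[CONTENT_TYPE CONTENT_LENGTH].include?(key) ? key : "HTTP_#{key}"
end

# Hypothetical helper: build the header part of env from [name, value] pairs.
def headers_to_env(pairs)
  pairs.each_with_object({}) do |(name, value), env|
    env[header_to_env_key(name)] = value
  end
end
```

For example, `header_to_env_key("User-Agent")` gives `"HTTP_USER_AGENT"`, while `header_to_env_key("Content-Type")` gives `"CONTENT_TYPE"`.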
  31. 31. Response Array [ '200', { 'Content-Type' => 'text/html', 'X-Content-Type-Options' => 'nosniff', 'X-Frame-Options' => 'SAMEORIGIN', 'X-XSS-Protection' => '1; mode=block' }, ['Hello','world'] ]
  32. 32. Response body •Response body must respond to each • Array of strings • Application instance • File like object
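Since the body only has to respond to `each`, it does not need to be an Array. A minimal sketch with a hypothetical `CountdownBody` class that yields its chunks lazily:

```ruby
# Hypothetical body class: yields the response in chunks instead of
# building one big array up front.
class CountdownBody
  def initialize(from)
    @from = from
  end

  # Rack only requires that the body respond to each and yield Strings.
  def each
    @from.downto(1) { |n| yield "#{n}\n" }
  end
end

app = proc do |env|
  [200, { 'Content-Type' => 'text/plain' }, CountdownBody.new(3)]
end

# A handler consumes the body by iterating over it.
status, headers, body = app.call({})
chunks = []
body.each { |chunk| chunks << chunk }
```

A handler can write each chunk to the socket as it is yielded, so the application never has to hold the whole response in memory.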
  33. 33. Role of Rack Handler •Create env from an HTTP request sent from a client •Call an application •Create an HTTP response from the returned array and send it back to the client • Flow: HTTP req -> env -> app -> array -> HTTP res
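The last step, turning the returned array into an HTTP response, can be sketched as a pure function. `build_http_response` and the short `REASONS` table are hypothetical and only cover a few status codes; this is an illustration, not Rhebok's code.

```ruby
# Hypothetical, deliberately tiny reason-phrase table.
REASONS = { 200 => 'OK', 404 => 'Not Found', 500 => 'Internal Server Error' }.freeze

# Hypothetical helper: turn a Rack response triplet into raw HTTP/1.0 bytes.
def build_http_response(status, headers, body)
  code = status.to_i
  out = +"HTTP/1.0 #{code} #{REASONS.fetch(code, 'Unknown')}\r\n"
  headers.each { |k, v| out << "#{k}: #{v}\r\n" }
  # Blank line separates the header block from the body.
  out << "Connection: close\r\n\r\n"
  body.each { |chunk| out << chunk }
  out
end

app = proc { |env| ['200', { 'Content-Type' => 'text/html' }, ['Hello']] }
raw = build_http_response(*app.call({}))
```

Calling `to_i` on the status lets the sketch accept both `'200'` and `200`, since applications return either form.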
  34. 34. Create a Rack Handler
  35. 35. shika.rb
      require 'socket'
      require 'stringio'
      require 'rack'

      module Rack
        module Handler
          class Shika
            def self.run(app, options)
              slf = new()
              slf.run_server(app)
            end

            def run_server(app)
              # create socket
              server = TCPServer.new('0.0.0.0', 8080)
              while true
                # accept
                conn = server.accept
                # read request & create env
                buf = ""
                while true
                  buf << conn.sysread(4096)
                  break if buf[-4,4] == "\r\n\r\n"
                end
                reqs = buf.split("\r\n")
                req = reqs.shift.split
                # split the request target into path and query string
                path, query = req[1].split('?', 2)
                env = {
                  'REQUEST_METHOD' => req[0],
                  'SCRIPT_NAME' => '',
                  'PATH_INFO' => path,
                  'QUERY_STRING' => query.to_s,
                  'SERVER_NAME' => '0.0.0.0',
                  'SERVER_PORT' => '8080',
                  'rack.version' => [0,1],
                  'rack.input' => StringIO.new('').set_encoding('BINARY'),
                  'rack.errors' => STDERR,
                  'rack.multithread' => false,
                  'rack.multiprocess' => false,
                  'rack.run_once' => false,
                  'rack.url_scheme' => 'http'
                }
                reqs.each do |header|
                  header = header.split(": ", 2)
                  env["HTTP_"+header[0].upcase.gsub('-','_')] = header[1]
                end
                # run app
                status, headers, body = app.call(env)
                # create response (status may be a String such as '200')
                res_header = "HTTP/1.0 "+status.to_s+" "+Rack::Utils::HTTP_STATUS_CODES[status.to_i].to_s+"\r\n"
                headers.each do |k, v|
                  res_header << "#{k}: #{v}\r\n"
                end
                res_header << "Connection: close\r\n\r\n"
                conn.write(res_header)
                body.each do |chunk|
                  conn.write(chunk)
                end
                conn.close
              end
            end
          end
        end
      end
  36. 36. Run server $ rackup -r ./shika.rb -s Shika -E production config.ru
  37. 37. This rack handler has some problems • Performance problem • Handles only one request at a time • Stops the whole world when one request lags • No TIMEOUT • No HTTP request parser that supports the HTTP/1.1 spec
  38. 38. Increase concurrency • Multi process • simple and easy to scale • Multi thread • lightweight context switch compared to the process • IO Multiplexing • Event driven, can handle many connections
  39. 39. Concurrency strategy • Unicorn • -> multi process • PUMA • -> multi thread + limited event model (+ multi process) • Thin • -> event model (+ multi process)
  40. 40. Prefork Architecture • Manager
  41. 41. Prefork Architecture • Manager: bind, listen
  42. 42. Prefork Architecture • Manager: bind, listen, then fork x4 • Each of the four Workers calls accept
  43. 43. Prefork Architecture • Manager: bind, listen, then fork x4 • Each of the four Workers calls accept • Clients connect to the Workers
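The prefork pattern in these slides can be sketched in plain Ruby with `fork`. This is a toy under stated assumptions, not Rhebok or prefork_engine: it binds to an OS-chosen port on 127.0.0.1, starts two workers, and each worker handles exactly one connection and exits so that the script terminates.

```ruby
require 'socket'

# Manager: bind and listen once, before forking.
# Port 0 asks the OS for any free port (an assumption for this demo).
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

# Fork workers; each inherits the listening socket and calls accept itself.
worker_pids = 2.times.map do
  fork do
    conn = server.accept
    conn.write("handled by worker #{Process.pid}\n")
    conn.close
    exit!(0)
  end
end

# Two clients: the kernel hands each connection to one waiting worker.
replies = 2.times.map do
  TCPSocket.open('127.0.0.1', port) { |sock| sock.read }
end

# Manager reaps its children before shutting down.
worker_pids.each { |pid| Process.wait(pid) }
server.close
```

A real prefork server keeps its workers in an accept loop and restarts any that die, which is the part prefork_engine takes care of.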
  44. 44. prefork_engine •https://github.com/kazeburo/prefork_engine •Ruby port of Perl’s Parallel::Prefork •a simple prefork server framework
  45. 45. prefork_engine
      server = TCPServer.new('0.0.0.0', 8080)
      pe = PreforkEngine.new({
        "max_workers" => 5,
        "trap_signals" => {
          "TERM" => 'TERM',
          "HUP" => 'TERM',
        },
      })
      while !pe.signal_received.match(/^TERM$/)
        pe.start {
          # child
          while true
            conn = server.accept
            ....
          end
        }
      end
      pe.wait_all_children
  46. 46. IO timeout
  47. 47. IO timeout •Unicorn does not have an IO timeout • it sends SIGKILL to a long-running worker process • default timeout: 60 sec
      E, [2015-12-08T03:13:24.863287 #90217] ERROR -- : worker=0 PID: 90243 timeout (61s > 60s), killing
      E, [2015-12-08T03:13:24.865764 #90217] ERROR -- : reaped #<Process::Status: pid 90243 SIGKILL (signal 9)> worker=0
      I, [2015-12-08T03:13:24.866176 #90217] INFO -- : worker=0 spawning...
  48. 48. Using select(2)
      while true
        connection = @server.accept
        buf = self.read_timeout(connection)
        if buf == nil
          connection.close
          next
        end
        parse_http_header(…)
      --
      def read_timeout(conn)
        # wait until conn is readable, or give up after READ_TIMEOUT seconds
        if !IO.select([conn],nil,nil,READ_TIMEOUT)
          return nil
        end
        return conn.sysread(4096)
      end
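The select-based timeout can be tried without a server by using a pipe. This is a sketch with assumptions: the 0.2 second `READ_TIMEOUT` is chosen only to keep the demo short, and `read_nonblock` stands in for the slide's `sysread` so that the read itself can never block.

```ruby
READ_TIMEOUT = 0.2 # seconds; deliberately short for the demo

# Returns the data read, or nil if nothing arrived within the timeout.
def read_timeout(conn, timeout = READ_TIMEOUT)
  return nil unless IO.select([conn], nil, nil, timeout)
  conn.read_nonblock(4096, exception: false)
end

reader, writer = IO.pipe

timed_out = read_timeout(reader)   # nothing written yet, so select gives up
writer.write("GET / HTTP/1.0\r\n\r\n")
data = read_timeout(reader)        # data is ready, returned immediately
```

With this shape a slow client costs one worker at most `READ_TIMEOUT` seconds, instead of blocking it until an external watchdog kills the process.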
  49. 49. Rhebok supports IO timeout •Implement read_timeout in C • avoid strange behavior of nonblock + sysread • use poll(2) instead of select(2) $ rackup -s Rhebok -O Timeout=60 config.ru
  50. 50. Parse HTTP request
  51. 51. HTTP parser • An HTTP parser can easily introduce security issues. It is safer to choose an existing one that is widely used • There are several fast implementations • Mongrel based - Unicorn, PUMA • Node.js based - Passenger 5 • PicoHTTPParser - Rhebok, h2o • pico_http_parser in rubygems • Ruby binding of PicoHTTPParser
  52. 52. pico_http_parser benchmark [Chart: picohttpparser vs. unicorn by # of headers (0, 1, 2, 4, 10); plotted values 80814, 118499, 140395, 153002, 167823 (one series) and 109602, 166615, 203201, 231919, 455188 (other series)]
  53. 53. PicoHTTPParser in Rhebok •uses PicoHTTPParser directly • does not use pico_http_parser.gem •performs both reading and parsing of the HTTP header in a single C function • reduces the overhead of creating a Ruby string that contains the HTTP header
  54. 54. TCP optimization
  55. 55. TCP_NODELAY •When data is written, TCP does not send packets immediately. There is some delay. •By default, TCP uses Nagle’s algorithm to collect small packets and send them all at once •TCP_NODELAY disables it
  56. 56. Nagle’s algorithm • Application: write("foo"), write("bar") • os/kernel: buffering (delay) • client receives "foobar"
  57. 57. TCP_NODELAY • Application: write("foo"), write("bar") • os/kernel: no buffering • client receives "foo", then "bar"
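In Ruby the option is set with `setsockopt`. A minimal sketch on an unconnected socket, just to show the call and read the value back; in a server the same call would be made on each accepted connection:

```ruby
require 'socket'

sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM)

# Disable Nagle's algorithm on this socket.
sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)

# Read the option back; a non-zero value means it is enabled.
nodelay = sock.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).int
sock.close
```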
  58. 58. Problem of TCP_NODELAY • When TCP_NODELAY is enabled, watch out for excessive fragmentation of TCP packets • it increases network latency • To prevent fragmentation • concatenate data in the application • use writev(2)
  59. 59. writev(2)
  60. 60. w/o writev(2) • many syscalls
      char *buf1 = "Hello ";
      char *buf2 = "RubyKaigi";
      char *buf3 = "\r\n";
      write(fd, buf1, strlen(buf1));
      write(fd, buf2, strlen(buf2));
      write(fd, buf3, strlen(buf3));
      /* kernel receives "Hello ", "RubyKaigi", "\r\n" in three calls */
  61. 61. w/o writev(2) • allocate memory • one syscall
      char *buf1 = "Hello ";
      char *buf2 = "RubyKaigi";
      char *buf3 = "\r\n";
      char *buf = (char *)malloc(100);
      buf[0] = '\0'; /* strcat needs a terminated destination */
      strcat(buf, buf1);
      strcat(buf, buf2);
      strcat(buf, buf3);
      write(fd, buf, strlen(buf));
      free(buf);
      /* kernel receives "Hello RubyKaigi\r\n" in one call */
  62. 62. writev(2) • gathering buffers • one syscall
      ssize_t rv;
      char *buf1 = "Hello ";
      char *buf2 = "RubyKaigi";
      char *buf3 = "\r\n";
      struct iovec v[3];
      v[0].iov_base = buf1;
      v[0].iov_len = strlen(buf1);
      ...
      v[2].iov_base = buf3;
      v[2].iov_len = strlen(buf3);
      rv = writev(fd, v, 3);
      /* kernel receives "Hello RubyKaigi\r\n" in one call */
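A rough Ruby-level analogue of the gathered write: since Ruby 2.5, `IO#write` accepts several strings in one call. Whether CRuby hands them to writev(2) depends on the platform and the IO's buffering, so treat that part as an assumption; the sketch only shows that the buffers go out through a single call without the application concatenating them first. It uses a pipe instead of a socket so it runs anywhere.

```ruby
reader, writer = IO.pipe

# One call, three buffers. On Ruby 2.5+ IO#write accepts several strings
# and returns the total number of bytes written.
written = writer.write("Hello ", "RubyKaigi", "\r\n")
writer.close

received = reader.read
reader.close
```

Here `written` is the combined length of the three buffers and `received` holds them joined in order.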
  63. 63. Rhebok internals •Prefork Architecture •Efficient network IO •Ultra Fast HTTP parser •TCP Optimization •Implemented in C
  64. 64. conclusion
  65. 65. conclusion •Rhebok is a High Performance Rack Handler •Rhebok is built on many modern technologies •Please try Rhebok and send me your feedback
  66. 66. end
