Gazelle - Plack Handler for performance freaks #yokohamapm

Gazelle 
Plack Handler for performance freaks 
Yokohama.pm #12 
Masahiro Nagano (kazeburo) 
https://www.flickr.com/photos/ckindel/424610604/
Me 
• 長野雅広 (Masahiro Nagano) 
• @kazeburo 
• CPAN: KAZEBURO / github: kazeburo 
• Lives in Nishi-ku, Yokohama
• ISUCON 2013 and 2014 winner
Gazelle
What is Gazelle?
• Plack Handler / PSGI Server 
• HTTP/1.0 Web Server 
• Preforking Architecture 
• Suitable for running application servers 
behind a reverse proxy 
• Starlet compatible / hot deploy 
• Fast Fast Fast
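As a rough illustration of the preforking model listed above, the following C sketch forks a fixed number of workers that all inherit the same listening descriptor. The names `prefork_run`, `worker_fn`, and `fd_is_open_worker` are hypothetical and are not taken from Gazelle; the stand-in worker only checks the inherited descriptor, where a real worker would loop on accept and serve requests.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker signature: receives the inherited listening fd
 * and the worker's index. */
typedef int (*worker_fn)(int listen_fd, int idx);

/* Stand-in worker: only checks that the inherited descriptor is open.
 * A real worker would loop on accept(listen_fd, ...) and serve requests. */
int fd_is_open_worker(int listen_fd, int idx) {
    (void)idx;
    return fcntl(listen_fd, F_GETFD) == -1 ? 1 : 0;
}

/* Fork nworkers children up front; each child runs the worker and exits
 * with its return value. Returns how many children exited with status 0,
 * or -1 if a fork failed. */
int prefork_run(int listen_fd, int nworkers, worker_fn worker) {
    assert(nworkers > 0 && worker != NULL);
    pid_t pids[nworkers];
    int forked = 0, ok = 0;

    for (int i = 0; i < nworkers; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) _exit(worker(listen_fd, i));  /* child process */
        pids[forked++] = pid;
    }
    /* Parent: reap every child that was started. */
    for (int i = 0; i < forked; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return forked == nworkers ? ok : -1;
}
```

In a long-running server the parent would typically stay alive and respawn workers instead of returning once they exit; that part is left out of this sketch.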
“Hello World” benchmark (req/sec)
• nginx: 127,462
• Gazelle: 106,028
• Starlet: 62,069
• starman: 33,300
3x faster than starman!!
“counter.psgi” benchmark (req/sec), with the hello world result for comparison
• Gazelle: 42,285 (hello world: 106,028)
• Starlet: 28,292 (hello world: 62,069)
• starman: 20,100 (hello world: 33,300)
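ISUCON4 Qualifier
[Chart: score for the qualifying cutoff line, starman, Starlet, and Gazelle. Values shown: 39,776 / 42,813 / 44,764 / 37,808.]
Measured with a slightly modified version of "How to clear the ISUCON4 qualifying cutoff without changing the application"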
Why is Gazelle fast?
• Supports only HTTP/1.0 and does not support KeepAlive, which keeps the code very simple
• Mostly written in XS
• Ultra-fast HTTP processing using picohttpparser
• Uses accept4(2)
• Uses writev(2) to output responses
Simple HTTP/1.0 GET
1. accept4(2)
2. read(2)
3. parse_header; complete?
   • No: poll(2), then read(2) again
   • OK: execute app
4. writev(2); written?
   • No: poll(2), then writev(2) again
   • OK: close(2)
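The request side of this flow (read(2), check whether the header is complete, poll(2) if not) could be sketched in C roughly as follows. This is only an illustration under stated assumptions: `read_request_head` and `set_nonblocking` are hypothetical names, and a plain search for the blank line that ends the header stands in for the real parser (picohttpparser), which is not used here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Read from a non-blocking fd until the end of the HTTP header
 * ("\r\n\r\n") has arrived. Returns the number of bytes read, or -1 on
 * error, EOF, timeout, or a full buffer. buf is NUL-terminated. */
ssize_t read_request_head(int fd, char *buf, size_t cap, int timeout_ms) {
    size_t used = 0;
    assert(buf != NULL && cap > 1);

    while (used + 1 < cap) {
        ssize_t n = read(fd, buf + used, cap - 1 - used);
        if (n > 0) {
            used += (size_t)n;
            buf[used] = '\0';
            /* "complete?" check: stand-in for the real header parser */
            if (strstr(buf, "\r\n\r\n") != NULL) return (ssize_t)used;
            continue;                        /* incomplete: read again */
        }
        if (n == 0) return -1;               /* peer closed early */
        if (errno == EINTR) continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK) return -1;

        /* Nothing to read yet: wait for data or give up on timeout. */
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        if (poll(&pfd, 1, timeout_ms) <= 0) return -1;
    }
    return -1;                               /* header larger than buffer */
}
```

Reading first and polling only after EAGAIN means the common case, where the whole request is already in the socket buffer, costs a single read(2).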
Mostly written in XS
The same flow, with two regions marked XS: the request side (accept4(2), read(2), parse_header, poll(2)) and the response side (writev(2), poll(2), close(2)).
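Perl code using XS
while (1) {
    # accept_psgi (XS): accept the connection, read and parse the request
    if ( my ($fd, $buf, $env) = accept_psgi(
        fileno($listen_sock), $timeout, $listen_sock_is_tcp,
        $host || 0, $port || 0
    ) ) {
        # close the client socket when $guard goes out of scope
        my $guard = guard { close_client($fd) };
        # the application itself runs in Perl
        my $res = Plack::Util::run_app $app, $env;
        my $status_code = $res->[0];
        my $headers = $res->[1];
        my $body = $res->[2];
        # write_psgi_response (XS): send status line, headers, and body
        write_psgi_response($fd, $timeout, $status_code, $headers, $body);
    }
}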
picohttpparser 
• created by kazuho-san 
• used in H2O and HTTP::Parser::XS
Must read
http://blog.kazuhooku.com/2014/11/the-internals-h2o-or-how-to-write-fast.html
accept4(2) 
• Requires Linux >= 2.6.28
• Sets FD_CLOEXEC and O_NONBLOCK in one system call
accept4(2) 
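#define _GNU_SOURCE   /* needed for accept4(2) on glibc */
#include <fcntl.h>
#include <sys/socket.h>

int
_accept(int fileno, struct sockaddr *addr, socklen_t addrlen) {
    int fd;
#ifdef SOCK_NONBLOCK
    /* one system call: accept and set both flags */
    fd = accept4(fileno, addr, &addrlen, SOCK_CLOEXEC|SOCK_NONBLOCK);
#else
    /* fallback: accept(2) followed by three fcntl(2) calls */
    fd = accept(fileno, addr, &addrlen);
    if (fd < 0) return fd;
    fcntl(fd, F_SETFD, FD_CLOEXEC);
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
#endif
    return fd;
}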
accept4(2) 
Without accept4(2):
13:51:26.755628 accept(4, {sa_family=AF_INET, sin_port=htons(42828), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
13:51:27.324951 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
13:51:27.325014 fcntl(5, F_GETFL) = 0x2 (flags O_RDWR)
13:51:27.325067 fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
13:51:27.325117 read(5, "GET / HTTP/1.1\r\nUser-Agent:"..., 16384) = 155
13:51:27.325200 setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
With accept4(2):
13:52:17.946622 accept4(4, {sa_family=AF_INET, sin_port=htons(42835), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_CLOEXEC|SOCK_NONBLOCK) = 5
13:52:18.505428 read(5, "GET / HTTP/1.1\r\nUser-Agent:"..., 16384) = 155
13:52:18.505519 setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
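The effect visible in the trace, both flags already set on the descriptor that accept4(2) returns, can be checked with a small self-contained C program. This is a sketch for Linux with glibc; `accept4_sets_both_flags` is a hypothetical helper name, and the loopback listener on a kernel-chosen port exists only to give accept4(2) a connection to accept.

```c
#define _GNU_SOURCE            /* exposes accept4(2) in glibc */
#include <assert.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a descriptor obtained from accept4 carries FD_CLOEXEC and
 * O_NONBLOCK, 0 if it does not, and -1 if the loopback setup fails. */
int accept4_sets_both_flags(void) {
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd = -1, afd = -1, result = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                    /* let the kernel pick a port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) return -1;

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&addr, &len) == 0 &&
        (cfd = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        /* One system call: accept and set both flags. */
        afd = accept4(lfd, NULL, NULL, SOCK_CLOEXEC | SOCK_NONBLOCK);
        if (afd >= 0) {
            int fdflags = fcntl(afd, F_GETFD);
            int flflags = fcntl(afd, F_GETFL);
            assert(fdflags != -1 && flflags != -1);
            result = (fdflags & FD_CLOEXEC) != 0 &&
                     (flflags & O_NONBLOCK) != 0;
        }
    }
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    close(lfd);
    return result;
}
```

The fcntl(2) calls here only inspect the flags for the check; in the server itself they are exactly the calls that accept4(2) makes unnecessary.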
writev(2)
• Writes multiple buffers to a file descriptor in one system call
• Reduces memory copies and system calls
ex. memory copy
#perl
my $header = ["Server"=>"gazelle","Content-Type"=>"text/plain",...];
#xs
char buf[512];
buf[0] = '\0';
while ( i < av_len(headers) + 1 ) {
    key = SvPV_nolen(*av_fetch(headers,i++,0));
    strcat(buf, key);
    strcat(buf, ": ");
    val = SvPV_nolen(*av_fetch(headers,i++,0));
    strcat(buf, val);
    strcat(buf, "\r\n");
}
write(fd, buf, strlen(buf));
Too many memory copies cause system overhead
ex. write write write
#perl
my $header = ["Server"=>"gazelle","Content-Type"=>"text/plain"];
#xs
while ( i < av_len(headers) + 1 ) {
    key = SvPV(*av_fetch(headers,i++,0),&len);
    write(fd, key, len);
    write(fd, ": ", sizeof(": ") - 1);
    val = SvPV(*av_fetch(headers,i++,0),&len);
    write(fd, val, len);
    write(fd, "\r\n", sizeof("\r\n") - 1);
}
Too many write(2) calls increase network latency
writev(2)
#perl
my $header = ["Server"=>"gazelle","Content-Type"=>"text/plain"];
#xs
struct iovec v[(av_len(headers)+1)*2 + 10];
iovcnt = 0;
while ( i < av_len(headers) + 1 ) {
    key = SvPV(*av_fetch(headers,i++,0),&len);
    v[iovcnt].iov_base = key;               /* no memory copy */
    v[iovcnt].iov_len = len;
    iovcnt++;
    v[iovcnt].iov_base = ": ";
    v[iovcnt].iov_len = sizeof(": ") - 1;
    iovcnt++;
    ...
}
writev(fd, v, iovcnt);                      /* one system call */
writev(2)
plackup -s Gazelle -e 'sub{[200,["Content-Type"=>"text/plain"],["xxx","xxx","yyy","yyy","zzzz","\n"]]}'
writev(5, [{"HTTP/1.0 200 OK\r\nConnection: close\r\nServer: gazelle\r\n", 53}, {"Content-Type", 12}, {": ", 2}, {"text/plain", 10}, {"\r\n", 2}, {"Date: Fri, 28 Nov 2014 04:38:08 GMT\r\n\r\n", 39}, {"xxx", 3}, {"xxx", 3}, {"yyy", 3}, {"yyy", 3}, {"zzzz", 4}, {"\n", 1}], 12) = 135
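The same gather-write pattern can be tried outside XS with plain C strings. The sketch below is an assumption-laden illustration, not Gazelle's code: `write_headers` and `MAX_HEADERS` are hypothetical names, and the header list is a flat key, value, key, value array like the Perl array in the slides. Each string is referenced in place by an iovec entry and everything goes out in one writev(2) call.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_HEADERS 16   /* hypothetical limit for this sketch */

/* kv holds key, value, key, value, ...; n is the number of elements.
 * Returns the result of writev(2): bytes written, or -1 on error. */
ssize_t write_headers(int fd, const char *const *kv, size_t n) {
    struct iovec v[MAX_HEADERS * 4];   /* key, ": ", value, "\r\n" per header */
    int iovcnt = 0;
    assert(n % 2 == 0 && n <= MAX_HEADERS * 2);

    for (size_t i = 0; i < n; i += 2) {
        /* Point at the existing strings; nothing is copied. */
        v[iovcnt].iov_base = (void *)kv[i];
        v[iovcnt].iov_len  = strlen(kv[i]);
        iovcnt++;
        v[iovcnt].iov_base = (void *)": ";
        v[iovcnt].iov_len  = sizeof(": ") - 1;
        iovcnt++;
        v[iovcnt].iov_base = (void *)kv[i + 1];
        v[iovcnt].iov_len  = strlen(kv[i + 1]);
        iovcnt++;
        v[iovcnt].iov_base = (void *)"\r\n";
        v[iovcnt].iov_len  = sizeof("\r\n") - 1;
        iovcnt++;
    }
    return writev(fd, v, iovcnt);      /* one system call for all pieces */
}
```

A real server also has to handle a short write, where writev(2) returns fewer bytes than requested; that is the "written?" check followed by poll(2) in the request flow.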
How to write a fast server
• write XS, minimize Perl code 
• reduce system calls 
• Zero Copy
Do we really need a fast app server?
• ISUCON :)
• Social games, AdTech, SNS
• Highly optimized applications
• Response times from a few msec to a few tens of msec
• Several hundred requests/sec/host
• Services with low profit per page view
Gazelle in production
• livedoor Blog
• 25 million req/day/host (about 290 req/sec on average)
• CPU usage dropped by 1% to 3% after migrating from Starlet
Please give it a try
https://www.flickr.com/photos/superformosa/9057428400/