Design and Implementation of proxy2, the HTTP Proxy Library
inaz2
PyCon JP 2016
2016/09/22
About me
• inaz2
• https://twitter.com/inaz2
• https://github.com/inaz2
• Security engineer & Python programmer
• Weblog: ももいろテクノロジー
• http://inaz2.hatenablog.com/
HTTP Proxy
• There are some proxies for caching or load balancing
• But the “proxy” in this talk is a little different from these
Do you know Proxomitron?
• http://www.proxomitron.info/
• From 1999 to 2003
Local debug proxy
• Intercept and modify the HTTP request/response
[Figure: the proxy sits between client and server, logging and modifying each Request and Response]
Major debugging proxies
• Useful for debugging and security testing
• Burp Proxy
• https://portswigger.net/burp/proxy.html
• Fiddler
• http://www.telerik.com/fiddler
• OWASP ZAP
• https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project
• Charles
• https://www.charlesproxy.com/
• mitmproxy
• https://mitmproxy.org/
These are useful but …
• Not intended for automated translation
• Not intended for large-scale logging and statistics
• Extensible, but not in a handy way
• I need a proxy like tcpdump (or like tail -f)
• I need a proxy that is easy to use with crawlers
• I need a fully customizable proxy
proxy2
• https://github.com/inaz2/proxy2
• Single Python script
• Requires no external modules
• Supports IPv6
• Supports HTTP/1.1 persistent connections
• Supports HTTPS relay/intercept
• Easy to customize with Python!
Demo
Customizing handlers
• Change User-Agent header
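A minimal sketch of this kind of customization. Only the hook name `request_handler(req)` comes from the talk; `Request`, `ProxyHandlerSketch`, `forward()` and the User-Agent string are hypothetical stand-ins for illustration, not proxy2's actual classes.

```python
class Request:
    """Hypothetical request object: method, URL and a dict of headers."""

    def __init__(self, method, url, headers):
        self.method = method
        self.url = url
        self.headers = dict(headers)


class ProxyHandlerSketch:
    """Hypothetical stand-in for the proxy base class."""

    def request_handler(self, req):
        # Default hook: leave the request untouched.
        return req

    def forward(self, req):
        # A real proxy would send the request upstream here. The sketch
        # only runs the hook and returns what would be sent.
        return self.request_handler(req)


class UserAgentRewriter(ProxyHandlerSketch):
    def request_handler(self, req):
        # Overwrite the header before the request leaves the proxy.
        req.headers["User-Agent"] = "MyCrawler/1.0"  # hypothetical value
        return req


req = Request("GET", "http://www.example.com/", {"User-Agent": "curl/7.50"})
sent = UserAgentRewriter().forward(req)
print(sent.headers["User-Agent"])
```

Deriving a class and overriding one method keeps the customization separate from the proxy plumbing.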
Design and implementation
Disclaimer
• This script doesn’t support Python 3 yet …
• Pull requests are welcome (;´Д`)
Design policy
• Keep it simple, with few dependencies
• Single Python script
• Use standard modules only
• Implement it as a base class
• Provide {request,response,save}_handler()
• Users derive the class and override each handler
• Default handlers dump HTTP headers and some useful info
Connection flow and handlers
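[Figure: sequence between client, proxy2 and server. The Request passes from client to server and the Response from server to client, both through proxy2.]
• request_handler(req): modify the request
• response_handler(req, res): modify the response
• save_handler(req, res): task that takes a long time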
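The order of the three hooks can be sketched as follows. This is an illustration under assumptions, not proxy2's code: `FlowSketch`, `fetch` and `send_to_client` are hypothetical names, requests and responses are plain strings, and running `save_handler` after the response has gone back to the client is assumed from the "task that takes a long time" note.

```python
class FlowSketch:
    """Hypothetical model of one request/response cycle through the proxy."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable playing the role of the upstream server
        self.saved = []

    def request_handler(self, req):
        return req

    def response_handler(self, req, res):
        return res

    def save_handler(self, req, res):
        self.saved.append((req, res))

    def handle(self, req, send_to_client):
        req = self.request_handler(req)        # 1. modify the request
        res = self.fetch(req)                  # 2. forward it upstream
        res = self.response_handler(req, res)  # 3. modify the response
        send_to_client(res)                    # 4. answer the client
        self.save_handler(req, res)            # 5. slow work comes last


class Upper(FlowSketch):
    def request_handler(self, req):
        return req.strip()

    def response_handler(self, req, res):
        return res.upper()


sent = []
proxy = Upper(fetch=lambda req: "body for " + req)
proxy.handle("  /index.html ", sent.append)
print(sent, proxy.saved)
```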
Making HTTP server is easy
• Use BaseHTTPServer module
• https://hg.python.org/cpython/file/2.7/Lib/BaseHTTPServer.py
• Server with multi-threading and IPv6 support
• Request handler
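A minimal sketch of the two pieces, written with the Python 3 module names (`http.server` and `socketserver` replace Python 2's `BaseHTTPServer` and `SocketServer`). The class names, the `serve()` helper and the port are assumptions for illustration.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


class ThreadingHTTPServerV6(ThreadingMixIn, HTTPServer):
    """One thread per connection, listening on an IPv6 socket."""
    address_family = socket.AF_INET6
    daemon_threads = True  # worker threads do not block interpreter exit


class HelloHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="::1", port=8080):
    # Hypothetical entry point: blocks until interrupted.
    ThreadingHTTPServerV6((host, port), HelloHandler).serve_forever()
```

Calling `serve()` starts the server on the IPv6 loopback address. A proxy would replace `do_GET` with code that forwards the request upstream.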
Roadblocks on HTTP/1.1 proxy
• HTTP/1.1 Persistent Connection
• Content-Encoding
• Hop-by-hop Headers
HTTP/1.1 Persistent Connection
• Reuse the connection to the same server
• httplib.HTTPConnection()
• Low-level HTTP client
• threading.local()
• Thread-local storage (as the server is multi-threaded)
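One way to combine the two is a per-thread connection cache, sketched below with Python 3's `http.client` (the renamed `httplib`). `get_connection` and the `(host, port)` cache key are assumptions for illustration. The constructor does not open a socket, so nothing touches the network until a request is made.

```python
import threading
from http.client import HTTPConnection

_tls = threading.local()  # each server thread sees its own attributes


def get_connection(host, port, timeout=10):
    """Return this thread's cached connection to (host, port)."""
    if not hasattr(_tls, "conns"):
        _tls.conns = {}
    key = (host, port)
    if key not in _tls.conns:
        _tls.conns[key] = HTTPConnection(host, port, timeout=timeout)
    return _tls.conns[key]


first = get_connection("www.example.com", 80)
second = get_connection("www.example.com", 80)
print(first is second)
```

Keeping the cache thread-local avoids two threads interleaving requests on the same socket.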
Content-Encoding
• Response body can be compressed
• proxy2 decompresses the body for the handlers and re-compresses it afterwards
• gzip and zlib modules (zlib handles deflate)
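The decompress/re-compress round trip can be sketched with the standard `gzip` and `zlib` modules. The function names are hypothetical, and the fallback to raw deflate is an assumption about servers that omit the zlib header.

```python
import gzip
import zlib


def decode_body(data, encoding):
    """Return the plain body for a given Content-Encoding value."""
    if encoding in (None, "", "identity"):
        return data
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(data)
    if encoding == "deflate":
        try:
            return zlib.decompress(data)  # zlib-wrapped deflate
        except zlib.error:
            return zlib.decompress(data, -zlib.MAX_WBITS)  # raw deflate
    raise ValueError("unsupported Content-Encoding: %r" % encoding)


def encode_body(data, encoding):
    """Re-compress the (possibly modified) body with the same encoding."""
    if encoding in (None, "", "identity"):
        return data
    if encoding in ("gzip", "x-gzip"):
        return gzip.compress(data)
    if encoding == "deflate":
        return zlib.compress(data)
    raise ValueError("unsupported Content-Encoding: %r" % encoding)


plain = decode_body(gzip.compress(b"<html>hi</html>"), "gzip")
modified = plain.replace(b"hi", b"hello")
wire = encode_body(modified, "gzip")
```

After re-compressing, the Content-Length header has to be updated to the new length of the body.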
Hop-by-hop Headers
• RFC 2616 (obsoleted by RFC 7230-7235) requires a proxy to remove the following headers:
• Connection, Keep-Alive, Proxy-Authenticate, Proxy-Authorization, TE, Trailers, Transfer-Encoding, Upgrade
• RFC 7230 no longer defines the implicit list
• "hop-by-hop" header fields are required to appear in the Connection header field (A.2)
• http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1771.html
• Even so, proxy2 removes the headers above for compatibility
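A minimal sketch that applies both rules: it drops the fixed RFC 2616 list and also any field named in the Connection header, as RFC 7230 expects. `filter_headers` and the list-of-pairs representation are assumptions for illustration, not proxy2's code.

```python
# The RFC 2616 list from the slide, lower-cased for comparison.
HOP_BY_HOP = frozenset([
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
])


def filter_headers(headers):
    """headers: list of (name, value) pairs. Returns the end-to-end ones."""
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Every token in Connection names a hop-by-hop field.
            listed.update(tok.strip().lower()
                          for tok in value.split(",") if tok.strip())
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]


incoming = [
    ("Host", "www.example.com"),
    ("Connection", "keep-alive, X-Custom"),
    ("Keep-Alive", "timeout=5"),
    ("X-Custom", "1"),
    ("Accept", "*/*"),
]
print(filter_headers(incoming))
```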
Handling HTTPS
• HTTPS = HTTP over SSL/TLS
• When you access “https://www.example.com/”, the client sends the HTTP request:
• CONNECT www.example.com:443 HTTP/1.1
• The proxy returns the HTTP response:
• 200 Connection Established
• After that, the client starts the SSL/TLS handshake and encrypted transmission
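The proxy's side of this exchange can be sketched as parsing the request line and answering with the status line. `parse_connect` and `CONNECT_OK` are hypothetical names for illustration.

```python
def parse_connect(request_line):
    """Split 'CONNECT host:port HTTP/1.1' into (host, port)."""
    method, target, _version = request_line.split()
    if method != "CONNECT":
        raise ValueError("not a CONNECT request: %r" % request_line)
    host, _, port = target.rpartition(":")
    # IPv6 literals arrive in brackets, e.g. [::1]:443.
    return host.strip("[]"), int(port)


# Bytes the proxy writes back before the TLS handshake begins.
CONNECT_OK = b"HTTP/1.1 200 Connection Established\r\n\r\n"

print(parse_connect("CONNECT www.example.com:443 HTTP/1.1"))
```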
HTTPS relay
• Just relay handshakes and encrypted payloads
• proxy2 can’t understand the content
[Figure: the client sends CONNECT and proxy2 answers Connection Established; the handshake and encrypted transmission then pass through proxy2 between client and server]
HTTPS relay
• select.select()
• Pick out the readable sockets from the list
• Receive data and send it to the other socket
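A minimal sketch of such a relay loop. The function name, the idle timeout and the buffer size are assumptions; the demo uses `socket.socketpair()` in place of real client and server connections.

```python
import select
import socket


def relay(a, b, timeout=1.0, bufsize=8192):
    """Copy bytes both ways until a peer closes, errors, or goes idle."""
    socks = [a, b]
    while True:
        readable, _, errored = select.select(socks, [], socks, timeout)
        if errored or not readable:
            break  # socket error, or nothing arrived within `timeout`
        for s in readable:
            data = s.recv(bufsize)
            if not data:
                return  # one side closed the connection
            other = b if s is a else a
            other.sendall(data)


# Demo: two local socket pairs stand in for client<->proxy and proxy<->server.
client, proxy_client_side = socket.socketpair()
proxy_server_side, server = socket.socketpair()
client.sendall(b"ping")
server.sendall(b"pong")
relay(proxy_client_side, proxy_server_side, timeout=0.2)
to_server = server.recv(16)
to_client = client.recv(16)
print(to_server, to_client)
```

The loop never inspects the bytes, which is why it works unchanged for encrypted traffic.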
HTTPS intercept (Man-in-the-Middle)
• The proxy generates the certificate for a requested domain
• And works as an HTTPS server with the generated certificate
[Figure: after CONNECT and Connection Established, proxy2 performs one handshake and transmission with the client and a separate handshake and transmission with the server]
HTTPS intercept (Man-in-the-Middle)
• ssl.wrap_socket()
• Make a socket over SSL/TLS
• with a private key and the corresponding public key’s certificate
• wrap BaseHTTPRequestHandler.connection
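A sketch of the wrapping step in Python 3 terms. The module-level `ssl.wrap_socket()` named above is the Python 2.7 call; later Python 3 releases deprecated and then removed it in favour of `SSLContext.wrap_socket()`, which is used here. The function names and file paths are hypothetical.

```python
import ssl


def make_server_context(certfile, keyfile):
    """Build a server-side TLS context from a certificate and its key."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx


def intercept(client_sock, certfile, keyfile):
    """Wrap the already-accepted client socket so the proxy speaks TLS.

    In a request handler, client_sock would be self.connection, wrapped
    after the 200 Connection Established reply has been sent.
    """
    ctx = make_server_context(certfile, keyfile)
    return ctx.wrap_socket(client_sock, server_side=True)
```

After wrapping, the handler's read and write file objects have to be rebuilt on top of the new socket so that later reads see decrypted HTTP.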
Generating SSL/TLS certificates
• In this case, proxy2 depends on OpenSSL
• As you know, poor implementations cause severe security risks
• OpenSSL makes a Certificate Authority “proxy2 CA” and generates certificates signed by the CA
• The browser can install the CA certificate from “http://proxy2.test/” through proxy2
[Figure: the proxy2 CA signs the generated certificates; the client says “I’ll trust your sign.”]
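The slides do not show the exact commands, so the following is a sketch under assumptions: it drives the `openssl` command-line tool through `subprocess`, creating a signing request for the host name and signing it with the CA key. All file names and function names are hypothetical.

```python
import subprocess


def csr_command(hostname, key_path):
    """openssl command that writes a signing request for hostname to stdout."""
    return ["openssl", "req", "-new", "-key", key_path,
            "-subj", "/CN=" + hostname]


def sign_command(ca_cert, ca_key, out_path, days=3650):
    """openssl command that signs a request read from stdin with the CA."""
    return ["openssl", "x509", "-req", "-days", str(days),
            "-CA", ca_cert, "-CAkey", ca_key, "-CAcreateserial",
            "-out", out_path]


def issue_certificate(hostname, key_path, ca_cert, ca_key, out_path):
    # Pipe the request from the first command into the second.
    csr = subprocess.run(csr_command(hostname, key_path),
                         check=True, stdout=subprocess.PIPE).stdout
    subprocess.run(sign_command(ca_cert, ca_key, out_path),
                   input=csr, check=True)


print(" ".join(csr_command("www.example.com", "cert.key")))
```

`issue_certificate()` needs the `openssl` binary and existing key files, so the sketch only prints one of the commands.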
Recap
• Proxy is fun
• Python’s “batteries” are very powerful
• BaseHTTPServer, httplib, threading, gzip, zlib, select, ssl
• HTTP proxy is easy to understand but not simple
• proxy2 made it simple
References
• proxy2: HTTPS pins and needles
• http://www.slideshare.net/inaz2/20150509-sumidasec-47934674
• RFC 2616 (deprecated)
• https://tools.ietf.org/html/rfc2616
• RFC 7230-7235
• https://tools.ietf.org/html/rfc7230
Thank you!
inaz2
