Abstracting Features Into Custom Reverse Proxies (Or: Making Better Lemonade From Chaos)
Life isn't always simple. We often have to deal with a mishmash of applications, languages, and servers. How can we begin to standardize functionality across this chaos? Custom reverse proxies to the rescue! Using Ruby and EventMachine, learn how you can abstract high-level features and functionality into fast reverse proxies that can improve scalability, save time, and make the world happy.
- See how we've applied this across a diverse set of web service APIs to standardize the implementation of authentication, request throttling, analytics, and more.
- See how this can save development time, eliminate code duplication, make your team happy, make the public happy, and make you a hero.
- See how this can be applied to any TCP-based application for a wide variety of use cases.
- Still think your situation is complicated? Learn about the U.S. Government's plans to standardize API access across the entire federal government. With some reverse proxy magic, this isn't quite as difficult or as foolhardy as it may first sound. It also comes with some nice benefits for both the public audience and government developers.
Sits in front of your server. Does stuff, but is transparent to the user.
Common usage in Ruby community. Nginx’s role: Serve static files. Deal with slow clients.
You can implement your own reverse proxy. Write it in Ruby and implement your own features.
Why we’re using custom reverse proxies. Why you might find them useful. The basics of building one.
About web services. Wanted to expose them to the world.
About silos. Not agriculture, organizational. Also systematic (legacy apps, etc).
NREL. ~2000 employees. Different development teams scattered across lab.
Another way to look at the segmentation. Different groups have expertise in different languages. Even in smaller business, you might deal with segmentation.
Trying to force all groups to make changes is a huge uphill battle. Also time consuming.
Custom reverse proxy slips between the internet and our existing API servers. Proxy deals with common functionality. Existing APIs don’t change. Existing APIs can now just assume that those things are taken care of. Proxy is agnostic to backend technology.
All our APIs are in one place and can be accessed in the same way. Users don’t need to understand our internal group structure to know where to find APIs.
Our API developers don’t need to worry about any of the high level functionality. APIs simply need to exist for them to be put behind the proxy.
Lots of other ways to achieve some of these goals, but nice benefits to the reverse proxy:
- Reduced code: Individual APIs don’t have to implement any code. With other solutions, even if you simplified and abstracted the code, having to do nothing wins.
- Enforced standardization: Individual apps can’t mess up things like authentication with a faulty implementation.
- New features: Everyone benefits. Example: Adding caching layer.
- Scaling: Load balancing between multiple backends.
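The notes mention load balancing between multiple backends without saying how backends are chosen. As a rough illustration only, here is a round-robin picker in plain Ruby. The `RoundRobin` class and the backend addresses are hypothetical and are not taken from the talk or from API Umbrella.

```ruby
# Hypothetical helper: hands out backends in rotation.
# Round-robin is an assumption; the talk does not name a strategy.
class RoundRobin
  def initialize(backends)
    raise ArgumentError, "need at least one backend" if backends.empty?
    @backends = backends
    @index = 0
  end

  # Returns the next backend and advances the rotation.
  def next_backend
    backend = @backends[@index]
    @index = (@index + 1) % @backends.size
    backend
  end
end

# Example addresses are made up for illustration.
balancer = RoundRobin.new([
  { :host => "127.0.0.1", :port => 8081 },
  { :host => "127.0.0.1", :port => 8082 }
])
first_pick = balancer.next_backend
second_pick = balancer.next_backend
third_pick = balancer.next_backend
```

A proxy could call `next_backend` once per incoming connection and hand the result to whatever call it uses to select the upstream server.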
- Basic em-proxy example
Doing something as data is received. Note: You’re operating at a raw HTTP level here. You’re dealing with chunks of data, so you can’t assume you have the full request body.
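Because data arrives in chunks, even the request headers may be split across several callbacks. A minimal sketch of one way to cope, assuming you only need the headers: collect chunks until the blank line that ends the header block. The `HeaderBuffer` class is hypothetical and uses only plain Ruby; a real proxy would more likely use a proper HTTP parser.

```ruby
# Hypothetical helper: accumulates raw chunks until the HTTP header
# block (terminated by a blank line) has fully arrived.
class HeaderBuffer
  HEADER_END = "\r\n\r\n"

  attr_reader :headers

  def initialize
    @buffer = +""
    @headers = nil
  end

  # Feed one chunk as it arrives. Returns true once the complete
  # header block has been seen, false while still waiting.
  def feed(chunk)
    return true if @headers
    @buffer << chunk
    idx = @buffer.index(HEADER_END)
    return false unless idx
    @headers = parse(@buffer[0, idx])
    true
  end

  private

  # Naive parsing: drops the request line and lower-cases header names.
  def parse(block)
    lines = block.split("\r\n")
    lines.shift
    lines.each_with_object({}) do |line, parsed|
      name, value = line.split(":", 2)
      parsed[name.strip.downcase] = value.to_s.strip
    end
  end
end

buf = HeaderBuffer.new
done_after_first = buf.feed("GET / HTTP/1.1\r\nUser-Ag")
done_after_second = buf.feed("ent: curl/8.0\r\nHost: example.test\r\n\r\n")
```

Here the `User-Agent` header is cut in half by the chunk boundary, which is exactly the case a per-chunk `gsub` would miss.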
It’s Ruby! This is the real power: access any arbitrary Ruby library and do custom things.
Remember you’re operating at a TCP level here. If you want to deal with HTTP in an easy way, you’ll need to handle that yourself. Other libraries exist to deal with HTTP.
Why do things at this low level when we’re used to nice high level frameworks like Rails and Sinatra?
Low level proxy makes it much easier to pass along the original request in its complete, unaltered format (headers, HTTP method, body, etc). With higher level frameworks, the raw HTTP request has already been parsed and processed by a web server by the time it hits your app. More difficult and error prone trying to recreate the original request.
EventMachine is fast. Evented systems very suitable for proxies.
em-proxy adds 0.5ms. rack-reverse-proxy adds nearly 3ms. Comparatively a lot, but is 3ms really worth worrying about?
Larger requests, the problem is compounded. em-proxy adds 150ms. rack-reverse-proxy adds nearly 800ms. Why?
- Rack deals with complete request and complete response. Must buffer complete request and response in memory before transferring them. Memory balloons with large file uploads and downloads.
- em-proxy deals with chunks of data. Streams requests and response as fast as it receives them. Memory use much lower for large file uploads and downloads, since only a single chunk is in memory at a time.
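The memory difference between the two approaches can be sketched without any proxy library. The two functions below are hypothetical and use `StringIO` objects as stand-ins for sockets; they only illustrate how much data each style holds at once, not how Rack or em-proxy are actually implemented.

```ruby
require "stringio"

CHUNK_SIZE = 16 * 1024 # arbitrary chunk size chosen for illustration

# Buffered relay: reads the whole payload before writing any of it.
# Returns the largest number of bytes held in memory at once.
def relay_buffered(source, sink)
  payload = source.read
  sink.write(payload)
  payload.bytesize
end

# Streamed relay: forwards each chunk as soon as it has been read.
# Returns the largest number of bytes held in memory at once.
def relay_streamed(source, sink, chunk_size = CHUNK_SIZE)
  peak = 0
  while (chunk = source.read(chunk_size))
    sink.write(chunk)
    peak = chunk.bytesize if chunk.bytesize > peak
  end
  peak
end

payload = "x" * 1_000_000
buffered_sink = StringIO.new
streamed_sink = StringIO.new
buffered_peak = relay_buffered(StringIO.new(payload), buffered_sink)
streamed_peak = relay_streamed(StringIO.new(payload), streamed_sink)
```

Both sinks end up with identical bytes; the buffered version held the full payload at once, while the streamed version never held more than one chunk.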
Low level by default, up to you to implement more. You make decisions like whether to buffer requests and responses. Can be used for non-HTTP things (websockets, any other TCP thing).
- Aside from authentication, rate limiting, and analytics
Imagine a 1GB file upload and 5GB data download. If your proxy layer buffers, there will be delays. Streaming to the server and to the client won’t be possible. Buffering is sometimes desirable (Unicorn & nginx). Other times not. Our API use case means we want the proxy to be as transparent as possible, since we don’t know what all the APIs will want to do.
If you want to modify the response body, it can be a little tricky. Be sure to update the Content-Length header appropriately.
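A minimal sketch of the Content-Length fix-up, assuming the whole response has already been buffered into one string and that the backend sent a Content-Length header rather than chunked encoding. `rewrite_body` is a hypothetical helper, not part of em-proxy.

```ruby
# Hypothetical helper: rewrites the body of a fully buffered raw HTTP
# response and updates Content-Length to match the new body.
# Assumes a Content-Length header is present (no chunked encoding).
def rewrite_body(raw_response)
  head, body = raw_response.split("\r\n\r\n", 2)
  new_body = yield(body.to_s)
  # Content-Length counts bytes, not characters, so use bytesize.
  head = head.sub(/^Content-Length:[ \t]*\d+/i,
                  "Content-Length: #{new_body.bytesize}")
  "#{head}\r\n\r\n#{new_body}"
end

raw = "HTTP/1.1 200 OK\r\n" \
      "Content-Type: text/plain\r\n" \
      "Content-Length: 5\r\n" \
      "\r\n" \
      "hello"
rewritten = rewrite_body(raw) { |body| body + ", world" }
```

The rewritten response carries `Content-Length: 12`, matching the twelve bytes of `hello, world`.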
If you want to modify the response body and the backend returns a gzipped response, you must buffer the response. Decompress after fully received, then modify the body, then re-gzip. Be sure to update the Content-Length header too.
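The decompress, modify, re-gzip sequence can be sketched with Ruby's standard `zlib` library. This assumes the complete gzipped body is already in memory; `rewrite_gzipped_body` is a hypothetical helper and the sample payload is made up.

```ruby
require "zlib"

# Hypothetical helper: takes a complete gzipped body, yields the
# decompressed text for modification, and returns the re-gzipped body
# together with the value the Content-Length header should now carry.
def rewrite_gzipped_body(gzipped_body)
  plain = Zlib.gunzip(gzipped_body)
  modified = yield(plain)
  regzipped = Zlib.gzip(modified)
  [regzipped, regzipped.bytesize]
end

# Made-up backend payload for illustration.
backend_body = Zlib.gzip("hello world")
new_body, new_length = rewrite_gzipped_body(backend_body) do |text|
  text.sub("world", "proxy")
end
```

Note that the new Content-Length is the size of the compressed bytes, not of the decompressed text.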
Digital Strategy for the Federal Government. Involved in the API portion of things.
Heavy push for web services. Expect to see a lot more federal agencies exposing their data and services as web APIs.
With more APIs, we need better organization. We also need to make it easier for agencies.
APIs spread across agencies. Just a bigger version of our internal issues.
Agencies like this model and want something like it. Currently evaluating using our API Umbrella platform, or other platforms that are variations on the same idea. Involved in getting something up and running within 6 months.
Lots of web service action in the federal government over the next year.
Not appropriate for everything. Can be useful for applying global features that can easily be layered.
Custom reverse proxy does fun things. Redis used for rate limiting. MongoDB for authentication and analytics.
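The notes say Redis is used for rate limiting but not which algorithm. As an assumption for illustration, here is a fixed-window counter. A plain Hash stands in for Redis so the sketch runs on its own; with the redis gem, the increment would roughly correspond to `incr` on the same key, with `expire` used to drop old windows. `FixedWindowLimiter` and the key format are hypothetical.

```ruby
# Hypothetical fixed-window rate limiter. The Hash store is a stand-in
# for Redis; unlike Redis with key expiry, it never drops old windows.
class FixedWindowLimiter
  def initialize(limit:, window:, store: Hash.new(0))
    @limit = limit   # maximum requests allowed per window
    @window = window # window length in seconds
    @store = store
  end

  # Counts one request for the client and reports whether it is
  # still within the limit for the current window.
  def allow?(client_id, now = Time.now)
    window_number = now.to_i / @window
    key = "#{client_id}:#{window_number}"
    @store[key] += 1
    @store[key] <= @limit
  end
end

limiter = FixedWindowLimiter.new(limit: 2, window: 60)
t = Time.at(1_000_000)
results = 3.times.map { limiter.allow?("10.0.0.1", t) }
next_window = limiter.allow?("10.0.0.1", t + 60)
other_client = limiter.allow?("10.0.0.2", t)
```

Fixed windows allow short bursts at window boundaries; that trade-off is accepted here for simplicity.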
RubyConf 2012: Custom Reverse Proxies
Abstracting Features Into Custom Reverse Proxies
Or: Making Better Lemonade From Chaos
Nick Muerdter • RubyConf 2012 • November 1, 2012
Photo by Lori Greig http://www.flickr.com/photos/lori_greig/4906180111
Photo by Brian Lane Winfield Moore http://www.flickr.com/photos/doctabu/342220423
Internet Reverse Proxy Web Server Internal Network
RUBY! CUSTOM FEATURES! EVENTMACHINE!
Internet Reverse Proxy Web Server Internal Network
require "em-proxy"

Proxy.start(:host => "0.0.0.0", :port => 80) do |conn|
  # Forward everything to the backend server on port 81.
  conn.server :srv, :host => "127.0.0.1", :port => 81

  conn.on_data do |data|
    # Do something with the incoming data...
    data
  end

  conn.on_response do |backend, resp|
    # Do something with the response...
    resp
  end

  conn.on_finish do |backend, name|
    # Do something when finished...
  end
end
conn.on_data do |data|
  # Modify the User-Agent on the incoming
  # request
  data.gsub(/User-Agent: .*?\r\n/, "User-Agent: em-proxy/0.1\r\n")
end
require "redis"

redis = Redis.new(:host => "127.0.0.1")

conn.on_data do |data|
  # Fun things with Ruby!
  ip = peer
  redis.incr(ip)
  data
end
require "http/parser"

parser = Http::Parser.new
parser.on_headers_complete = proc do |h|
  # Hello, friendlier HTTP headers...
  puts h["User-Agent"]
end

conn.on_data do |data|
  # Feed each raw chunk to the parser, then pass it along unchanged.
  parser << data
  data
end
Photo by Madison Guy http://www.flickr.com/photos/madison_guy/338691904
Transparency Photo by Brett Jordan http://www.flickr.com/photos/x1brett/6126873518
Speed & Efficiency Photo by jamesjustin http://www.flickr.com/photos/jamesjustin/3629097108
Flexibility Photo from The Library of Congress http://www.flickr.com/photos/library_of_congress/217904751
What Else Can You Do? Photo by paul-simpson.org http://www.flickr.com/photos/paulsimpson1976/4039170901
Photo by Keoki Seu http://www.flickr.com/photos/keokiseu/497331463
Buffering Photo from The Library of Congress http://www.flickr.com/photos/library_of_congress/3159321339
Content-Length Photo by Sterlic http://www.flickr.com/photos/sterlic/4299633060
gzip Photo by Kaptain Kobold http://www.flickr.com/photos/kaptainkobold/6930870617
Want Bigger? Photo by elviskennedy http://www.flickr.com/photos/elviskennedy/546541995
Photo by judepics http://www.flickr.com/photos/judepics/159365806
• Reverse Proxies: Fun for the whole family!
• Custom Reverse Proxies: You might be able to implement more functionality at this layer than you realize.
• Think Different: They can provide a different way to architect some features of your apps.
Resources & Support Photo by Musée McCord Museum http://www.flickr.com/photos/museemccordmuseum/5348751435
API Umbrella
• Our full API management solution – Includes custom EventMachine based proxy
• Open source: https://github.com/NREL/api-umbrella (Just recently open sourced, so pardon the current state of things)
Ruby & EventMachine
• em-proxy – https://github.com/igrigorik/em-proxy – Simple and very capable
• ProxyMachine – https://github.com/mojombo/proxymachine – Simpler, but can only act on requests, not responses
• Goliath – https://github.com/postrank-labs/goliath – More of a framework, uses em-synchrony (Fibers)
Other Reverse Proxies
• HAProxy – http://haproxy.1wt.eu/ – General proxy and load balancing awesomeness
• Varnish Cache – https://www.varnish-cache.org/ – Proxy caching layer coolness
• nginx – http://nginx.org/ – Web server powerhouse and nice proxy
Renewable Energy APIs
• http://developer.nrel.gov/ (Lots more APIs coming soon)