Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012
Cloud services power the apps that are becoming the backbone of modern society. The workload of cloud APIs is typically driven by external customers and can fluctuate dramatically minute by minute. Rapid spikes in load can result in request failures as load increases beyond backend capacity and the size of web worker pools. This talk explores the use of asynchronous frameworks like Python Twisted and gevent to implement services that can dynamically keep socket connections open and increase request latency in order to avoid request failures. We explore how that architectural approach helps Twilio provide high-availability Voice and SMS APIs.

Published in: Technology, Education

Transcript of "Asynchronous Architectures for Implementing Scalable Cloud Services - Evan Cooke - Gluecon 2012"

  1. Asynchronous Architectures for Implementing Scalable Cloud Services: Designing for Graceful Degradation. Evan Cooke, Co-Founder & CTO, Twilio (Cloud Communications)
  2. Cloud services power the apps that are the backbone of modern society: how we work, play, and communicate.
  3. Cloud Workloads Can Be Unpredictable
  4. SMS API Usage: 6x spike in 5 mins
  5. Danger! Load higher than instantaneous throughput. [Chart labels: Load, Request Latency, Time, FAIL]
  6. Don’t Fail Requests
  7. [Diagram labels: Incoming Requests, Load Balancer, Worker Pool, AAA, Throttling, App Server, W (worker)]
  8. Worker Pools, e.g., Apache/Nginx. [Chart labels: Failed Requests, Time, 10%, 70%, 100%+]
  9. Problem Summary
     • Cloud services often use worker pools to handle incoming requests
     • When load goes beyond the size of the worker pool, requests fail
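The failure mode in the problem summary can be illustrated with a toy model. This is a minimal sketch, not anything shown in the talk: `FixedWorkerPool` and `simulate_spike` are hypothetical names, and the model assumes a pool with no request queue, so any request that arrives while every worker is busy fails immediately.

```python
class FixedWorkerPool:
    """Toy model of a fixed-size worker pool with no request queue (assumption)."""

    def __init__(self, size):
        self.size = size
        self.busy = 0

    def try_accept(self):
        """Return True if a free worker took the request, False if it failed."""
        if self.busy >= self.size:
            return False
        self.busy += 1
        return True

    def finish(self):
        """Mark one in-flight request as complete, freeing its worker."""
        if self.busy > 0:
            self.busy -= 1


def simulate_spike(pool_size, concurrent_requests):
    """Offer `concurrent_requests` at the same instant; return (served, failed)."""
    pool = FixedWorkerPool(pool_size)
    served = sum(1 for _ in range(concurrent_requests) if pool.try_accept())
    return served, concurrent_requests - served
```

With a pool of 10 workers and a burst of 60 simultaneous requests, this model serves 10 and fails 50, which is the behaviour the rest of the talk tries to avoid.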
  10. What next? A few observations based on work implementing and scaling the Twilio API over the past 4 years...
     • Twilio Voice/SMS Cloud APIs
     • 100,000 Twilio Developers
     • 100+ employees
  11. Observation 1: For many APIs, taking more time to service a request is better than failing that request.
     Implication: in many cases, it is better to service a request with some delay rather than failing it.
  12. Observation 2: Matching the amount of available resources precisely to the size of incoming request worker pools is challenging.
     Implication: under load, it may be possible to delay or drop only those requests that truly impact resources.
  13. What are we going to do?
     Suggestion: if request concurrency were very cheap, we could implement delay and finer-grained resource controls much more easily...
  14. Event-driven programming and the Reactor Pattern
  15. Event-driven programming and the Reactor Pattern. Worker code (pseudocode) with the relative time of each step:
        req = 'GET /';                  1
        req.append('\r\n\r\n');         1
        socket.write(req);              10000x
        resp = socket.read();           10000000x
        print(resp);                    10
  16. Event-driven programming and the Reactor Pattern. Same code and timings as slide 15. Callout: Huge IO latency blocks worker
  17. Event-driven programming and the Reactor Pattern
        req = 'GET /';
        req.append('\r\n\r\n');
        socket.write(req, fn() {
          socket.read(fn(resp) {
            print(resp);
          });
        });
     Callout: Make IO operations async and “callback” when done
  18. Event-driven programming and the Reactor Pattern. Same code as slide 17, followed by:
        reactor.run_forever();
     Callout: Central dispatch to coordinate event callbacks
  19. Event-driven programming and the Reactor Pattern. Same code as slide 18, with the relative time of each step:
        req = 'GET /';                  1
        req.append('\r\n\r\n');         1
        socket.write(req, fn() {        10
          socket.read(fn(resp) {        10
            print(resp);                10
          });
        });
        reactor.run_forever();
     Result: we don’t block the worker
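The central-dispatch idea from the reactor slides can be sketched with Python's standard `selectors` module. This is an illustrative stand-in, not the API of any framework named in the talk: the `Reactor` class, its `on_readable` method, and `demo` are hypothetical names, and the sketch handles only one-shot read events.

```python
import selectors
import socket


class Reactor:
    """Tiny single-threaded reactor: wait for readiness, then run callbacks."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()

    def on_readable(self, sock, callback):
        """Call `callback(sock)` once, the next time `sock` has data to read."""
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def run(self):
        """Dispatch events until no sockets remain registered."""
        while self.selector.get_map():
            for key, _events in self.selector.select():
                # One-shot semantics: unregister before invoking the callback.
                self.selector.unregister(key.fileobj)
                key.data(key.fileobj)


def demo():
    """Send a request over a local socket pair and collect it via a callback."""
    received = []
    client, server = socket.socketpair()
    reactor = Reactor()
    reactor.on_readable(server, lambda s: received.append(s.recv(1024)))
    client.sendall(b"GET /\r\n\r\n")
    reactor.run()
    client.close()
    server.close()
    return received
```

The worker never blocks inside `recv`: it only calls it after the selector reports that data is ready, so a single thread could interleave many such sockets.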
  20. (Some) Reactor Pattern Frameworks
     • c/libevent
     • c/libev
     • java/nio/netty
     • js/node.js
     • ruby/eventmachine
     • python/twisted
     • python/gevent
  21. The Callback Mess (Python Twisted, pseudocode)
        req = 'GET /'
        req += '\r\n\r\n'
        def r(resp):
            print(resp)
        def w():
            socket.read().addCallback(r)
        socket.write().addCallback(w)
  22. The Callback Mess (Python Twisted, pseudocode)
        req = 'GET /'
        req += '\r\n\r\n'
        yield socket.write()
        resp = yield socket.read()
        print(resp)
     Callout: Use deferred generators and inline callbacks
  23. The Callback Mess (Python Twisted). Same code as slide 22. Callout: Easy sequential programming with mostly implicit async IO
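The mechanism behind inline callbacks can be sketched with plain generators. This is not Twisted's implementation or API: `run_inline`, `fake_write`, `fake_read`, and `fetch` are hypothetical names, and the fake IO operations invoke their callbacks immediately instead of waiting on a real socket.

```python
def run_inline(gen_func, *args):
    """Drive a generator that yields callback-style operations.

    Each yielded operation is a callable taking one argument, a callback.
    When the operation calls that callback with a result, the generator
    resumes with the result as the value of its `yield` expression.
    """
    gen = gen_func(*args)
    result = {}

    def step(value=None):
        try:
            operation = gen.send(value)
        except StopIteration as stop:
            result["value"] = stop.value
            return
        operation(step)

    step()
    return result.get("value")


def fake_write(data):
    """Pretend to write `data`; report the byte count through the callback."""
    def operation(callback):
        callback(len(data))
    return operation


def fake_read():
    """Pretend to read a response; deliver it through the callback."""
    def operation(callback):
        callback("HTTP/1.0 200 OK")
    return operation


def fetch():
    """Reads sequentially, but every IO step goes through a callback."""
    req = "GET /" + "\r\n\r\n"
    yield fake_write(req)
    resp = yield fake_read()
    return resp
```

Calling `run_inline(fetch)` runs the generator to completion and returns the fake response, showing how sequential-looking code can sit on top of callbacks.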
  24. Enter gevent
     “gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.”
     Natively async (pseudocode):
        socket.write()
        resp = socket.read()
        print(resp)
  25. Enter gevent: Simple Echo Server
        from gevent.server import StreamServer

        def echo(socket, address):
            print('New connection from %s:%s' % address)
            socket.sendall('Welcome to the echo server!\r\n')
            # This line is missing from the slide; without it fileobj is undefined.
            fileobj = socket.makefile()
            line = fileobj.readline()
            fileobj.write(line)
            fileobj.flush()
            print("echoed %r" % line)

        if __name__ == '__main__':
            server = StreamServer(('0.0.0.0', 6000), echo)
            server.serve_forever()
     Callouts: Easy sequential model. Fully async.
  26. Async Services with Ginkgo
     Ginkgo is a simple framework for composing async gevent services with common configuration, logging, daemonizing, etc.
     https://github.com/progrium/ginkgo
     Let’s look at a simple example that implements a TCP and HTTP server...
  27. Async Services with Ginkgo
        import gevent
        from gevent.pywsgi import WSGIServer
        from gevent.server import StreamServer
        from ginkgo.core import Service

        def handle_http(env, start_response):
            start_response('200 OK', [('Content-Type', 'text/html')])
            print('new http request!')
            return ["hello world"]

        def handle_tcp(socket, address):
            print('new tcp connection!')
            while True:
                socket.send('hello\n')
                gevent.sleep(1)

        app = Service()
        app.add_service(StreamServer(('127.0.0.1', 1234), handle_tcp))
        app.add_service(WSGIServer(('127.0.0.1', 8080), handle_http))
        app.serve_forever()
  28. Async Services with Ginkgo. Same code as slide 27. Callout on the gevent imports: Import WSGI/TCP Servers
  29. Async Services with Ginkgo. Same code as slide 27. Callout on handle_http: HTTP Handler
  30. Async Services with Ginkgo. Same code as slide 27. Callout on handle_tcp: TCP Handler
  31. Async Services with Ginkgo. Same code as slide 27. Callout on the Service() and add_service() lines: Service Composition
  32. [Diagram labels: Incoming Requests, Load Balancer, Async Server ... Async Server]
     Using our async reactor-based approach, let’s redesign our serving infrastructure
  33. [Diagram labels: Incoming Requests, Load Balancer, AAA ... AAA, Async Server ... Async Server]
     Step 1: define an authentication and authorization layer that will identify the user and the resource being requested
  34. [Diagram labels: Incoming Requests, Load Balancer, AAA ... AAA, Throttling, Async Server ... Async Server, Concurrency Manager]
     Step 2: add a throttling layer and concurrency manager
  35. Concurrency Admission Control
     • Goal: limit concurrency by delaying or selectively failing requests
     • Common metrics
        - By Account
        - By Resource Type
        - By Availability of Dependent Resources
     • What we’ve found useful
        - By (Account, Resource Type)
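Admission control keyed by (Account, Resource Type) can be sketched as a table of in-flight counters. This is a minimal illustration under stated assumptions, not Twilio's concurrency manager: `AdmissionController`, `try_admit`, `release`, and the single shared `limit_per_key` are all hypothetical.

```python
from collections import defaultdict


class AdmissionController:
    """Track in-flight requests per (account, resource_type) key."""

    def __init__(self, limit_per_key):
        # Assumption: every key shares the same concurrency limit.
        self.limit = limit_per_key
        self.in_flight = defaultdict(int)

    def try_admit(self, account, resource_type):
        """Admit the request if its key is under the limit; return the decision."""
        key = (account, resource_type)
        if self.in_flight[key] >= self.limit:
            return False
        self.in_flight[key] += 1
        return True

    def release(self, account, resource_type):
        """Record that one admitted request for this key has finished."""
        key = (account, resource_type)
        if self.in_flight[key] > 0:
            self.in_flight[key] -= 1
```

Because the key combines both fields, one account saturating its SMS limit does not affect its own voice requests or another account's SMS requests.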
  36. Delay: delay responses without failing requests. [Chart labels: Load, Latency]
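The delay strategy can be sketched with a semaphore that makes excess requests wait instead of failing. The talk uses gevent; this sketch swaps in the standard library's `asyncio` so that it runs without extra dependencies. `handle`, `serve_burst`, and the timing values are hypothetical.

```python
import asyncio


async def handle(semaphore, work_seconds, completed):
    """Wait for a free slot, then do the (simulated) backend work."""
    async with semaphore:
        await asyncio.sleep(work_seconds)
        completed.append(True)


async def serve_burst(num_requests, max_concurrent, work_seconds=0.01):
    """Serve a burst larger than the concurrency limit; return how many completed."""
    semaphore = asyncio.Semaphore(max_concurrent)
    completed = []
    await asyncio.gather(
        *(handle(semaphore, work_seconds, completed) for _ in range(num_requests))
    )
    return len(completed)
```

Running `asyncio.run(serve_burst(20, 5))` completes all 20 requests even though only 5 touch the backend at a time; the cost of the burst shows up as added latency for the waiting requests, not as failures.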
  37. Deny: deny requests based on resource usage. [Chart labels: Load; Latency /x, Fail; Latency /*]
  38. [Diagram labels: Incoming Requests, Load Balancer, AAA ... AAA, Throttling, App Server, Concurrency Manager, Throttling, Dependent Services]
     Step 3: allow backend resources to throttle requests
  39. Summary: Async frameworks like gevent allow you to easily decouple a request from access to constrained resources. [Chart labels: Service-wide Failure, Request Latency, Time]
  40. Don’t Fail Requests. Decrease Performance.
  41. twilio. Evan Cooke, @emcooke. CONTENTS CONFIDENTIAL & COPYRIGHT © TWILIO INC. 2012