Concurrent Python at Beeswax - Ron Rothman - NYC Python Meetup 2020

How Beeswax scaled Python to tens of thousands of QPS.

Published in: Software

  1. Python at Scale: Concurrency at Beeswax
     Ron Rothman
  2. [Diagram: several Advertisers*, Beeswax, and an Ad Exchange, connected by flows labeled "ad request", "bid requests", "bids", and "ad"]
     ❕ 200 ms
     ❕ 2 million per second
     ❕ non-infinite $ budget
     ❕ 99.99% uptime
     ❕ ~optimal bids
  3. Event Collection
     [Diagram: Ad Exchange, Bidder, Event Collector, Event Log Stream; regions labeled "Beeswax" and "Internet"]
  5. Event Collection
     [Diagram: Ad Exchange, Bidder, Event Collector, Event Log Stream; steps 1 and 2 below are marked on the diagram]
     ● Load: O(10K/sec)
     ● Response time: < 250 ms (p99)
     ● Availability: ♾️ nines
     1. Durably record the event
     2. Update counters (database)
  6. Request handler (each step marked as work or wait):

         def handle_request(req):
             '''Process an incoming event.'''
             unmarshall(req)            # processing (work)
             validate(req)              # processing (work)
             # some business logic      # processing (work)
             record_event(req)          # network i/o (wait)
             update_db_counters(req)    # network i/o (wait)
             return_response()

  7. 10 requests/sec (100 ms per request)
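
The arithmetic behind slide 7 can be sketched in a few lines. This is an illustration, not something from the talk: the function names are invented for the example, the 100 ms figure is slide 7's, and applying that same latency to the O(10K/sec) load on slide 5 is an assumption made only to show the scale of the gap.

```python
def sequential_throughput(latency_ms: float) -> float:
    """Requests per second for one worker that handles requests one at a time."""
    return 1000.0 / latency_ms


def required_in_flight(target_qps: float, latency_ms: float) -> float:
    """Little's law: average concurrency = arrival rate x time spent per request."""
    return target_qps * latency_ms / 1000.0


print(sequential_throughput(100))        # 10.0 requests/sec, as on slide 7
print(required_in_flight(10_000, 100))   # 1000.0 requests in flight at once
```

Under those assumptions, about 1,000 requests have to be in progress at any moment, which is the gap that the layers of concurrency on slide 9 are meant to close.
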
  9. Layers of Concurrency
     ● Many machines: EC2; Autoscale groups
     ● Many processes: Preforking web servers
     ● Many threads: Greenlets; asyncio
     ● Containers: ECS tasks
     ● Serverless: Lambdas
  10. [Diagram: event notifications (HTTP requests) arrive at an Elastic Load Balancer, which fronts an autoscale group of EC2 instances]
  11. [Diagram: one EC2 instance; event notifications (from LB) are handed to its web server processes]
  12. [Diagram: one web server process; requests are handled by greenlets/threads]
  13. Threads or Greenlets?
  14. Threads vs. Greenlets

      | Threads                    | Greenlets            |
      |----------------------------|----------------------|
      | Preemptive                 | Cooperative          |
      | Requires extensive locking | Requires no locking  |
      | Lightweight                | Very lightweight     |
      | Leverage multiple cores*   | Single core          |

      *Sadly, untrue for CPython
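
The "cooperative" and "requires no locking" rows on slide 14 can be illustrated with a small sketch. It uses `asyncio` from the standard library as a stand-in for greenlets (slide 9 lists both in the same layer). The `worker` and `main` names are invented for the example. The point it tries to show is that control changes hands only at explicit yield points, so a read-modify-write that contains no yield cannot be interleaved with another task.

```python
import asyncio


async def worker(state: dict, n_increments: int) -> None:
    for _ in range(n_increments):
        current = state["counter"]       # read
        state["counter"] = current + 1   # write: no yield between read and write
        await asyncio.sleep(0)           # explicit yield point; other tasks run here


async def main() -> int:
    state = {"counter": 0}
    # Ten cooperative tasks share one counter without any lock.
    await asyncio.gather(*(worker(state, 1000) for _ in range(10)))
    return state["counter"]


total = asyncio.run(main())
print(total)  # 10000
```

The guarantee holds only while no yield point sits between the read and the write; moving the `await` between those two lines would reintroduce lost updates.
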
  15. Gevent
      ● Create & manipulate greenlets
      ● Allows you to do non-blocking i/o
  17. Gevent
      ● Create & manipulate greenlets
      ● Allows you to do non-blocking i/o
      ● Makes (i/o) libraries that you use non-blocking!
        ○ "monkey patching"
  18. Request handler (network i/o now yields instead of waiting):

          def handle_request(req):
              '''Process an incoming event.'''
              unmarshall(req)            # processing (work)
              validate(req)              # processing (work)
              # some business logic      # processing (work)
              record_event(req)          # network i/o (yields)
              update_db_counters(req)    # network i/o (yields)
              return_response()

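
A rough way to see what turning waits into yields buys is to simulate the two network calls and run many handlers at once. The sketch below is not gevent: it uses `asyncio` so that it runs with the standard library alone, and the 50 ms sleeps are made-up stand-ins for real network latency. The function names follow slide 18; `serve` is invented for the example.

```python
import asyncio
import time


async def record_event(req) -> None:
    await asyncio.sleep(0.05)    # simulated network i/o; yields while waiting


async def update_db_counters(req) -> None:
    await asyncio.sleep(0.05)    # simulated network i/o; yields while waiting


async def handle_request(req) -> str:
    await record_event(req)
    await update_db_counters(req)
    return "ok"


async def serve(n_requests: int):
    """Handle n_requests concurrently; return the results and the elapsed seconds."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handle_request(i) for i in range(n_requests)))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(serve(20))
print(len(results), "requests in", round(elapsed, 2), "seconds")
```

Handled one at a time, 20 requests would need about 2 seconds of waiting; when the waits overlap, the whole batch finishes in roughly the time of a single request.
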
  19. Happily Ever After?
  20. Reality

          def handle_request(req):
              '''Process an incoming event.'''
              unmarshall(req)
              decrypt(req)
              validate(req)
              check_whether_duplicate(req)            # BLOCKING i/o
              # some business logic
              # more business logic
              # even more business logic
              record_event(req)                       # nonblocking i/o
              update_db_counters(req)                 # BLOCKING i/o
              update_other_db_counters(req)           # BLOCKING i/o
              update_other_other_db_counters(req)     # BLOCKING i/o
              return_response()

  24. Which DB Libraries Can Be Monkey Patched?
      ● Redis client: ✅ pure python
      ● DynamoDB client: ✅ pure python
      ● Aerospike client: ⛔ wrapped C code
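
Why a client that cannot be patched matters (slide 24) can be shown with a toy experiment. The sketch again uses `asyncio` in place of gevent. Here `time.sleep` plays the role of a call into wrapped C code that never yields, and `asyncio.sleep` plays the role of a patched, cooperative call. All names and durations are invented for the example.

```python
import asyncio
import time


async def heartbeat(gaps: list, ticks: int = 20, interval: float = 0.01) -> None:
    """Record how long each nominal 10 ms tick really took."""
    last = time.perf_counter()
    for _ in range(ticks):
        await asyncio.sleep(interval)
        now = time.perf_counter()
        gaps.append(now - last)
        last = now


async def blocking_call() -> None:
    time.sleep(0.2)            # never yields: every other task is stalled


async def cooperative_call() -> None:
    await asyncio.sleep(0.2)   # yields: other tasks keep running


async def max_gap(call) -> float:
    gaps: list = []
    await asyncio.gather(heartbeat(gaps), call())
    return max(gaps)


blocked = asyncio.run(max_gap(blocking_call))
cooperative = asyncio.run(max_gap(cooperative_call))
print("worst stall with blocking call:", round(blocked, 3))
print("worst stall with cooperative call:", round(cooperative, 3))
```

With the blocking call, the heartbeat's worst gap grows to about the full 200 ms. In a server, the same thing happens to every other request sharing that process.
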
  25. Too Much Processing? Yield Often.

          def handle_request(req):
              '''Process an incoming event.'''
              unmarshall(req)
              decrypt(req)
              greenlet_yield()
              validate(req)
              check_whether_duplicate(req)    # BLOCKING i/o
              # some business logic
              # more business logic
              greenlet_yield()
              # even more business logic
              record_event(req)               # nonblocking i/o
              ...

  26. Blocking C Code? Batch & Timeouts.

          def handle_request(req):
              '''Process an incoming event.'''
              unmarshall(req)
              decrypt(req)
              validate(req)
              # BLOCKING i/o
              check_whether_duplicate(req, timeout_ms=5, max_tries=3)
              # business logic
              record_event(req)                 # nonblocking i/o
              update_counters_in_memory(req)    # occasional i/o
              return_response()

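
Slide 26 names two techniques without showing them: buffering counter updates in memory so that i/o happens only occasionally, and bounding a blocking call with a short timeout plus a few retries. The sketch below is one possible shape for both, not the talk's code. `CounterBuffer`, `flush_fn`, `max_pending`, and `call_with_retries` are hypothetical names, and it assumes the underlying client enforces its own timeout and signals it by raising `TimeoutError`.

```python
from collections import Counter
from typing import Callable, Dict


class CounterBuffer:
    """Accumulate increments in memory; write them out one batch at a time."""

    def __init__(self, flush_fn: Callable[[Dict[str, int]], None], max_pending: int = 100):
        self._flush_fn = flush_fn        # hypothetical batch writer (the only i/o)
        self._max_pending = max_pending
        self._pending: Counter = Counter()
        self._updates = 0

    def incr(self, key: str, amount: int = 1) -> None:
        self._pending[key] += amount     # pure in-memory work, no i/o
        self._updates += 1
        if self._updates >= self._max_pending:
            self.flush()                 # the "occasional i/o"

    def flush(self) -> None:
        if not self._pending:
            return
        batch = dict(self._pending)
        self._pending.clear()
        self._updates = 0
        self._flush_fn(batch)            # if this raises, the batch is lost


def call_with_retries(fn: Callable[[], object], max_tries: int = 3):
    """Call fn up to max_tries times, retrying only when it times out."""
    for attempt in range(1, max_tries + 1):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_tries:
                raise


writes = []
buf = CounterBuffer(writes.append, max_pending=3)
for key in ["a", "b", "a", "a"]:
    buf.incr(key)
buf.flush()
print(writes)  # [{'a': 2, 'b': 1}, {'a': 1}]
```

Four increments become two writes here. The trade-off is durability: anything still in the buffer is lost if the process dies before the next flush.
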
  27. ● Blocking I/O calls waste CPU
      ● You're not as I/O bound as you ~~think~~ hope
      ● C extensions play by different rules
  28. ● Blocking I/O calls waste CPU
        ■ Gevent + monkey patch
      ● You're not as I/O bound as you ~~think~~ hope
        ■ Buffer & batch
      ● C extensions play by different rules
        ■ Short timeouts w/retries
  29. Thank You 🙏
      ron {at} beeswax.com
  30. References
      ● Gevent
      ● Falcon
      ● Bottle
      ● The Sharp Corners of Gevent
      ● Beeswax
