Pushing Python: Building a High Throughput, Low Latency System

Published in: Technology

Pushing Python: Building a High Throughput, Low Latency System

  1. 1. Kevin Ballard, SFpython.org, 2014-03-12
  2. 2. Introductions: kevin@tellapart.com
  3. 3. Taba •  Distributed event aggregation service
        import taba
        ...
        taba.RecordValue('winning_bid_price', wincpm)
        ...
        $ taba-cli aggregate winning_bid_price
        {"name": "winning_bid_price", "10m": {"count": 14709, "total": 5836.4}, "percentiles": [0.07 0.16 0.32 0.84 1.33 8.03]}
  4. 4. Taba: +10,000,000 events/sec, +50,000 metrics, +1,000 clients, +100 processors
  5. 5. Lesson #1: GET THE DATA MODEL RIGHT
  6. 6. Data Model
  7. 7. Data Model
        Event: ('bid_cpm', 'Counter', time(), 0.233)
        State:
        Aggregate: {"10m": 43.9, "1h": 592.22}
  8. 8. Data Model
  9. 9. Data Model
  10. 10. Data Model
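
The Event, State, and Aggregate stages named on the Data Model slides can be sketched roughly as follows. This is only an illustration, not Taba's implementation: the class `CounterState`, its `fold` and `aggregate` methods, and the window table are hypothetical names, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque


class CounterState:
    """Hypothetical state for a 'Counter' metric: recent (timestamp, value) pairs."""

    # Assumed window labels, mirroring the "10m" and "1h" keys on the slide.
    WINDOWS = {"10m": 600, "1h": 3600}

    def __init__(self):
        self.samples = deque()

    def fold(self, event):
        """Fold one event tuple (name, type, timestamp, value) into the state."""
        _name, _type, timestamp, value = event
        self.samples.append((timestamp, value))

    def aggregate(self, now):
        """Project the state into one total per window."""
        # Drop samples older than the longest window (assumes time-ordered input).
        horizon = now - max(self.WINDOWS.values())
        while self.samples and self.samples[0][0] < horizon:
            self.samples.popleft()
        return {
            label: sum(v for t, v in self.samples if t >= now - seconds)
            for label, seconds in self.WINDOWS.items()
        }
```

Keeping the three stages separate means many events can be folded into a small state, and the aggregate is computed from the state only when someone asks for it.
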
  11. 11. Lesson #2: STATE IS HARD
  12. 12. Centralizing State
  13. 13. Lesson #3: GENERATORS + GREENLETS = AWESOME
  14. 14. Asynchronous Iterator •  JIT (just-in-time) processing •  Automatically switches greenlets on I/O
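
The just-in-time half of the Asynchronous Iterator slide can be illustrated with plain standard-library generators. The sketch below is an assumption about the general pattern, not code from Taba: the function names and the `name,value` line format are hypothetical. The greenlet half is not shown; with a greenlet library such as gevent and cooperative sockets, a blocking read inside the source generator would be the point where control passes to another greenlet.

```python
def read_events(lines):
    """Yield parsed events one at a time instead of building a full list."""
    for line in lines:
        name, value = line.split(",")
        yield name, float(value)


def only_metric(events, wanted):
    """Pass through only the values that belong to one metric."""
    for name, value in events:
        if name == wanted:
            yield value


def running_total(values):
    """Yield the cumulative sum after each incoming value."""
    total = 0.0
    for value in values:
        total += value
        yield total


def demo():
    pulled = []  # records which input lines have actually been consumed

    def source():
        for line in ["bid_cpm,0.25", "win_cpm,1.0", "bid_cpm,0.5"]:
            pulled.append(line)
            yield line

    pipeline = running_total(only_metric(read_events(source()), "bid_cpm"))
    first = next(pipeline)
    # Only one line has been read so far: each stage pulls work on demand.
    return first, list(pulled), list(pipeline)
```

Because each stage pulls from the one before it, no stage holds more than one item at a time, which keeps the working set small regardless of input size.
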
  15. 15. Lesson #4: CPYTHON SUFFERS FROM MEMORY FRAGMENTATION
  16. 16. Fragmentation •  Fragmentation is when a process’s heap is inefficiently used. •  The GC may report a low memory footprint, but the OS reports a much larger RSS.
  17. 17. Fragmentation
  18. 18. Fragmentation
  19. 19. Fragmentation
  20. 20. Fragmentation
  21. 21. Fragmentation
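
The gap between a low reported footprint and a large RSS can be shown with a toy model. The sketch below is not CPython's allocator; `ToyHeap` and `SLOTS_PER_PAGE` are hypothetical, and the only rule it borrows is that a block of memory can go back to the OS only once everything in it is free.

```python
SLOTS_PER_PAGE = 64  # hypothetical page capacity, chosen for illustration


class ToyHeap:
    """Toy allocator: a page can be released only when every slot in it is free."""

    def __init__(self):
        self.pages = []  # each page is a set of live object ids

    def alloc(self, obj_id):
        # Fill the last page first; open a new page when it is full.
        if not self.pages or len(self.pages[-1]) == SLOTS_PER_PAGE:
            self.pages.append(set())
        self.pages[-1].add(obj_id)

    def free(self, obj_id):
        for page in self.pages:
            page.discard(obj_id)
        # Only completely empty pages can be released.
        self.pages = [page for page in self.pages if page]

    def live_objects(self):
        return sum(len(page) for page in self.pages)

    def held_slots(self):
        return len(self.pages) * SLOTS_PER_PAGE


def fragmentation_demo():
    heap = ToyHeap()
    for obj_id in range(6400):
        heap.alloc(obj_id)
    # Free everything except one survivor per page.
    for obj_id in range(6400):
        if obj_id % SLOTS_PER_PAGE != 0:
            heap.free(obj_id)
    return heap.live_objects(), heap.held_slots()
```

In this model 100 surviving objects pin 6,400 slots, so the live-object count looks tiny while the memory held from the OS has not shrunk at all.
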
  22. 22. Hybrid Memory Management •  Use Cython to allocate page-sized blocks of pointers into the incoming chunk •  Hand off the whole thing to the CPython memory manager •  The whole thing gets deallocated at once
  23. 23. Hybrid Memory Management
  24. 24. Hybrid Memory Management
  25. 25. Hybrid Memory Management
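
The Cython code behind the Hybrid Memory Management slides is not included in the transcript. As a rough pure-Python analogue of the same idea, the sketch below keeps one incoming chunk intact and stores flat arrays of offsets into it, rather than creating one small object per record. The class `ChunkIndex` and the newline-delimited record format are assumptions made for illustration, and no claim is made here about how much fragmentation this avoids in practice.

```python
from array import array


class ChunkIndex:
    """Index newline-delimited records inside one chunk without copying them."""

    def __init__(self, chunk):
        self.view = memoryview(chunk)
        # Two flat arrays of offsets stand in for the "block of pointers".
        self.starts = array("Q")
        self.ends = array("Q")
        start = 0
        while start < len(chunk):
            end = chunk.find(b"\n", start)
            if end == -1:
                end = len(chunk)
            self.starts.append(start)
            self.ends.append(end)
            start = end + 1

    def __len__(self):
        return len(self.starts)

    def record(self, i):
        """Return a zero-copy view of record i."""
        return self.view[self.starts[i]:self.ends[i]]
```

Dropping the last reference to a `ChunkIndex` releases the chunk and both offset arrays together, which mirrors the "whole thing gets deallocated at once" point on the slide.
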
  26. 26. Ratcheting •  Ratcheting is a pathological case of fragmentation, caused by the fact that the heap must be contiguous*. •  CPython cannot compact memory, mostly because C extensions hold direct pointers to objects.
  30. 30. Ratcheting •  Avoid persistent objects •  Sockets are common offenders •  Anything that has to be persistent should be created at application startup, before processing data •  Avoid letting the heap grow in the first place
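
Ratcheting, and the advice to create persistent objects at startup, can be illustrated with a toy model of a contiguous heap. This is a deliberately simplified sketch, not a description of any real allocator: blocks are stacked in order, and the heap can shrink only down to the highest block that is still live. The function name and the block counts are hypothetical.

```python
def heap_top_after_burst(persistent_first, burst=1000):
    """Return the heap size (in blocks) left after a burst of temporary blocks."""
    heap = []  # heap[i] is True while block i is live

    def alloc():
        heap.append(True)
        return len(heap) - 1

    if persistent_first:
        alloc()  # e.g. a socket opened at startup, at the bottom of the heap
    temporary = [alloc() for _ in range(burst)]
    if not persistent_first:
        alloc()  # e.g. a socket opened during processing; it lands on top

    # The burst is over: release every temporary block.
    for block in temporary:
        heap[block] = False

    # A contiguous heap can only be trimmed from the top.
    while heap and not heap[-1]:
        heap.pop()
    return len(heap)
```

With the persistent block allocated first, the heap shrinks back to 1 block after the burst. With it allocated last, the heap stays at 1,001 blocks even though only one of them is in use.
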
  31. 31. fin. github.com/tellapart/taba | kevin@tellapart.com | @misterkgb | We’re Hiring! tellapart.com/careers
