
Pushing Python: Building a High Throughput, Low Latency System

Transcript of "Pushing Python: Building a High Throughput, Low Latency System"

  1. Kevin Ballard
     SFpython.org
     2014-03-12
  2. Introductions
     kevin@tellapart.com
  3. Taba
     •  Distributed event aggregation service
     import taba
     ...
     taba.RecordValue('winning_bid_price', wincpm)
     ...
     $ taba-cli aggregate winning_bid_price
     {"name": "winning_bid_price", "10m": {"count": 14709, "total": 5836.4}, "percentiles": [0.07 0.16 0.32 0.84 1.33 8.03]}
  4. Taba
     +10,000,000 events/sec
     +50,000 metrics
     +1,000 clients
     +100 processors
  5. GET THE DATA MODEL RIGHT (Lesson #1)
  6. Data Model
  7. Data Model
     Event: ('bid_cpm', 'Counter', time(), 0.233)
     State:
     Aggregate: {"10m": 43.9, "1h": 592.22}
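The Event, State, Aggregate flow on slide 7 can be sketched in a few lines. This is an illustration only, not Taba's implementation: the `CounterState` class, its `fold` and `aggregate` methods, and the window lengths are assumptions made for the example, and events are assumed to arrive in timestamp order.

```python
import time
from collections import deque

# Hypothetical window lengths in seconds, mirroring the "10m" and "1h" keys on the slide.
WINDOWS = {"10m": 600, "1h": 3600}


class CounterState:
    """Illustrative State for a 'Counter' metric: keeps (timestamp, value) pairs."""

    def __init__(self):
        self.samples = deque()

    def fold(self, event):
        """Fold one Event tuple (name, type, timestamp, value) into the State."""
        _name, _type, timestamp, value = event
        self.samples.append((timestamp, value))

    def aggregate(self, now):
        """Project the State into an Aggregate: one total per window."""
        longest = max(WINDOWS.values())
        # Drop samples that no window can see any more (assumes time-ordered input).
        while self.samples and self.samples[0][0] <= now - longest:
            self.samples.popleft()
        return {
            label: sum(v for t, v in self.samples if t > now - span)
            for label, span in WINDOWS.items()
        }


now = time.time()
state = CounterState()
state.fold(("bid_cpm", "Counter", now - 1800, 0.5))   # 30 minutes old
state.fold(("bid_cpm", "Counter", now - 60, 0.233))   # 1 minute old
result = state.aggregate(now)
```

Only the newer event falls inside the 10-minute window, while both count toward the 1-hour total. Keeping State separate from the Aggregate is what lets the same stored data answer several window queries.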
  8. Data Model
  9. Data Model
  10. Data Model
  11. STATE IS HARD (Lesson #2)
  12. Centralizing State
  13. GENERATORS + GREENLETS = AWESOME (Lesson #3)
  14. Asynchronous Iterator
      •  JIT processing
      •  Automatically switches through I/O
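The "JIT processing" half of slide 14 can be illustrated with plain generators: each stage pulls one item at a time, so nothing is read before it is needed. This is a sketch under assumptions, not Taba's code. The function names and the comma-separated line format are made up for the example, and no greenlet library is used here. In a greenlet-based setup, the blocking read inside `read_events` is where the switch to another task would happen.

```python
def read_events(lines):
    """Yield parsed (name, value) events one at a time instead of building a list."""
    for line in lines:
        name, raw_value = line.split(",")
        yield name, float(raw_value)


def running_totals(events):
    """Consume events lazily and yield the running total per metric name."""
    totals = {}
    for name, value in events:
        totals[name] = totals.get(name, 0.0) + value
        yield name, totals[name]


# `source` stands in for a socket or file; any iterator of lines works.
source = iter(["bid_cpm,0.25", "bid_cpm,0.5", "win_cpm,1.0"])
pipeline = running_totals(read_events(source))

first = next(pipeline)     # pulls exactly one line through both stages
remaining = list(source)   # the other two lines have not been read yet
```

After the single `next` call, `remaining` still holds the two unread lines, which shows that the pipeline did no work ahead of demand.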
  15. CPYTHON SUFFERS FROM MEMORY FRAGMENTATION (Lesson #4)
  16. Fragmentation
      •  Fragmentation is when a process’s heap is inefficiently used.
      •  The GC may report a low memory footprint, but the OS reports a much larger RSS.
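One way to observe the gap described on slide 16 is to compare what Python itself has allocated with the resident set size the OS reports. The sketch below is not from the talk. It assumes a modern Python 3 (`tracemalloc` did not exist in the Python 2 era of this deck), and the `rss_bytes` helper is a made-up name that only works on Linux, returning `None` elsewhere. The numbers it prints vary by platform and allocator.

```python
import os
import tracemalloc


def rss_bytes():
    """Resident set size as the OS sees it (Linux /proc only; None elsewhere)."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
    except (OSError, ValueError, IndexError):
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


tracemalloc.start()
data = [str(i) * 10 for i in range(200_000)]   # many small objects
del data[::2]                                  # free every other one, leaving holes
traced_now, traced_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
rss_now = rss_bytes()

print("traced by Python:", traced_now, "peak:", traced_peak)
print("RSS reported by OS:", rss_now)
```

Freeing every other object is a deliberate way to leave holes between live objects. Whether the freed space is returned to the OS depends on the allocator, which is the effect the slide describes.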
  17. Fragmentation
  18. Fragmentation
  19. Fragmentation
  20. Fragmentation
  21. Fragmentation
  22. Hybrid Memory Management
      •  Use Cython to allocate page-sized blocks of pointers into incoming chunk
      •  Hand-off the whole thing to the CPython memory manager
      •  Whole thing gets deallocated at once
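The idea on slide 22, keeping one large chunk alive and pointing into it rather than creating many small objects, has a rough pure-Python analogue. The sketch below is not the Cython technique from the talk and not Taba's code: `ChunkView`, its newline-delimited record format, and the use of `array` for the offset index are all assumptions chosen to show the shape of the approach.

```python
from array import array


class ChunkView:
    """Illustrative: one incoming chunk plus a compact offset index into it."""

    def __init__(self, chunk, delimiter=b"\n"):
        self.chunk = chunk                 # single large allocation
        self.offsets = array("Q", [0])     # record start positions, stored unboxed
        self._delimiter_len = len(delimiter)
        position = chunk.find(delimiter)
        while position != -1:
            self.offsets.append(position + self._delimiter_len)
            position = chunk.find(delimiter, position + self._delimiter_len)
        # Bytes after the last delimiter (a partial record) are ignored.

    def __len__(self):
        return len(self.offsets) - 1

    def record(self, index):
        """Return a zero-copy view of one record (non-negative index only)."""
        start = self.offsets[index]
        end = self.offsets[index + 1] - self._delimiter_len
        return memoryview(self.chunk)[start:end]


chunk = b"a,1\nbb,2\nccc,3\n"
view = ChunkView(chunk)
records = [bytes(view.record(i)) for i in range(len(view))]
```

The chunk and its index are two allocations whose lifetimes end together when `view` is dropped, which mirrors the "whole thing gets deallocated at once" bullet.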
  23. Hybrid Memory Management
  24. Hybrid Memory Management
  25. Hybrid Memory Management
  26. Ratcheting
      •  Ratcheting is a pathological case of Fragmentation, caused by the fact that the heap must be contiguous*:
      •  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).
  27. Ratcheting
      •  Ratcheting is a pathological case of Fragmentation, caused by the fact that the heap must be contiguous*:
      •  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).
  28. Ratcheting
      •  Ratcheting is a pathological case of Fragmentation, caused by the fact that the heap must be contiguous*:
      •  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).
  29. Ratcheting
      •  Ratcheting is a pathological case of Fragmentation, caused by the fact that the heap must be contiguous*:
      •  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).
  30. Ratcheting
      •  Avoid persistent objects
      •  Sockets are common offenders
      •  Anything that has to be persistent should be created at application startup, before processing data
      •  Avoid letting the heap grow in the first place
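The advice on slide 30 is mostly about ordering: build everything long-lived first, then process data using only short-lived temporaries. The sketch below shows that structure and nothing more. The `Worker` class is hypothetical, `socket.socketpair()` stands in for a real long-lived connection, and whether this layout actually prevents ratcheting in a given process depends on the allocator.

```python
import socket


class Worker:
    """Illustrative worker whose persistent objects are all built in __init__."""

    def __init__(self, buffer_size=4096):
        # Long-lived objects are created here, before any data is processed.
        self.sender, self.receiver = socket.socketpair()
        self.buffer = bytearray(buffer_size)   # reused for every read

    def process(self, payload):
        """Handle one small payload using only short-lived temporaries."""
        self.sender.sendall(payload)
        received = self.receiver.recv_into(self.buffer)
        return bytes(self.buffer[:received]).upper()

    def close(self):
        self.sender.close()
        self.receiver.close()


worker = Worker()                                        # startup phase
outputs = [worker.process(p) for p in (b"bid", b"win")]  # processing phase
worker.close()
```

Reusing one preallocated `bytearray` for every read also follows the last bullet, since the processing loop does not need the heap to grow.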
  31. fin.
      github.com/tellapart/taba
      kevin@tellapart.com | @misterkgb
      We’re Hiring! tellapart.com/careers