
Everything You Always Wanted to Know About Memory in Python - But Were Afraid to Ask (extended)

Have you ever wondered what happens to all the precious RAM after running your 'simple' CPython code? Prepare yourself for a short introduction to CPython memory management! This presentation will try to answer some memory related questions you always wondered about. It will also discuss basic memory profiling tools and techniques.


  1. Sections: Basic stuff; Notes on memory model; Memory profiling tools; Notes on malloc() in CPython; Summary. Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask (extended). Piotr Przymus, Nicolaus Copernicus University. PyConPL 2014, Szczyrk.
  2. About Me: Piotr Przymus, PhD student / Research Assistant at Nicolaus Copernicus University. Interests: databases, GPGPU computing, data mining, high-performance computing. 8 years of Python experience. Some of my Python projects: parts of the trading platform at turbineam.com (back testing, trading algorithms); mussels bio-monitoring analysis and data mining software; a simulator of a heterogeneous processing environment for evaluating database query scheduling algorithms.
  3. Basic stuff
  4. Size of objects. Table: Size of different types in bytes.

        Type                              32 bit             64 bit             Variable part
        int (py-2.7)                      12                 24
        long (py-2.7) / int (py-3.3)      14                 30                 + 2 · number of digits
        float                             16                 24
        complex                           24                 32
        str (py-2.7) / bytes (py-3.3)     24                 40                 + 2 · length
        unicode (py-2.7) / str (py-3.3)   28                 52                 + (2 or 4) · length
        tuple                             24 + 4 · length    64 + 8 · length
  5. Size of objects: sys.getsizeof(obj). From the documentation (available since Python 2.6): Return the size of an object in bytes. The object can be any type. All built-in objects will return correct results. This may not be true for third-party extensions, as it is implementation specific. Calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
  6. Size of containers: sys.getsizeof and containers. Note that getsizeof returns the size of the container object and not the size of the data associated with this container.

        import sys
        a = ['Foo' * 100, 'Bar' * 100, 'SpamSpamSpam' * 100]
        b = [1, 2, 3]
        print sys.getsizeof(a), sys.getsizeof(b)
        # 96 96

     Listing 1: getsizeof and containers (Python 2)
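Because getsizeof stops at the container itself, a rough total has to follow the references by hand. The following is a minimal Python 3 sketch under that assumption; deep_getsizeof is a hypothetical helper name (not a standard library function) and it only descends into built-in containers.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Approximate size of obj plus everything it references (built-in containers only)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:              # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size

a = ["Foo" * 100, "Bar" * 100, "SpamSpamSpam" * 100]
b = [1, 2, 3]
shallow_a, shallow_b = sys.getsizeof(a), sys.getsizeof(b)
deep_a, deep_b = deep_getsizeof(a), deep_getsizeof(b)
print(shallow_a, shallow_b)   # size of the list objects only
print(deep_a, deep_b)         # list objects plus their elements
```

The shallow numbers stay close to each other, while the deep number for a also includes the three long strings.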
  7. Objects interning: fun example.

        a = [i % 257 for i in xrange(2**20)]

     Listing 2: List of interned integers

        b = [1024 + i % 257 for i in xrange(2**20)]

     Listing 3: List of integers

     Any allocation difference between Listing 2 and Listing 3? Results measured using psutil: Listing 2: (resident=15.1M, virtual=2.3G); Listing 3: (resident=39.5M, virtual=2.4G).
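The same comparison can be approximated without psutil. This is a hedged Python 3 sketch that uses the standard library's tracemalloc module instead of process-level counters, so the absolute numbers will not match the slide; allocated_bytes is a hypothetical helper and N is reduced to keep the run short.

```python
import tracemalloc

def allocated_bytes(build):
    """Bytes still allocated by the object that build() returns."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    obj = build()                        # keep the result alive while measuring
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del obj
    return after - before

N = 2 ** 16  # the slide uses 2**20
shared = allocated_bytes(lambda: [i % 257 for i in range(N)])
fresh = allocated_bytes(lambda: [1024 + i % 257 for i in range(N)])
print(shared, fresh)
```

On CPython the first list only stores pointers to preallocated small integers, while the second also pays for a separate int object per element.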
  9. Objects interning: explained.
     Objects and variables, general rule: objects are allocated on assignment (e.g. a = "spam", b = 3.2). Variables just point to objects (i.e. they do not hold the memory).
     Interning of objects: this is an exception to the general rule. It is Python implementation specific (examples from CPython). Frequently used objects are preallocated and shared instead of paying for a costly new allocation, mainly as a performance optimization.

        >>> a = 0
        >>> b = 0
        >>> a is b, a == b
        (True, True)

     Listing 6: Interning of objects

        >>> a = 1024
        >>> b = 1024
        >>> a is b, a == b
        (False, True)

     Listing 7: Objects allocation (at the interactive prompt, one statement per line; constants compiled together in one statement or module may be shared)
  10. Objects interning: behind the scenes.
      Warning: this is Python implementation dependent and may change in the future. It is not documented for those reasons; for reference, consult the source code.
      CPython 2.7 - 3.4 keeps single instances for:
      - int: values in the range [-5, 257)
      - str / unicode: the empty string and all length-1 strings
      - unicode / str: the empty string and all length-1 strings in Latin-1
      - tuple: the empty tuple
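The preallocated integers can be probed from Python itself. This is a small sketch assuming CPython 3; same_object is a hypothetical helper, and the values are parsed at run time so that the compiler cannot merge equal constants. The exact upper bound of the cache is an implementation detail, so the example only contrasts values inside the documented range with a clearly large one.

```python
def same_object(text):
    """Parse the same number twice at run time; True if both results are one object."""
    return int(text) is int(text)

for text in ("-5", "0", "256", "1000000"):
    print(text, same_object(text))

empty_tuple_shared = tuple() is tuple()
print(empty_tuple_shared)
```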
  11. String interning: example.

         a, b = 'strin', 'string'
         a + 'g' is b                     # returns False
         intern(a + 'g') is intern(b)     # returns True
         a = ['spam %d' % (i % 257)
              for i in xrange(2**20)]
         # memory usage (resident=57.6M, virtual=2.4G)
         a = [intern('spam %d' % (i % 257))
              for i in xrange(2**20)]
         # memory usage (resident=14.9M, virtual=2.3G)

      Listing 8: String interning
  12. String interning: explained.
      Definition: string interning is a method of storing only one copy of each distinct string value, which must be immutable.
      intern (py-2.x) / sys.intern (py-3.x), from the CPython documentation: Enter string in the table of "interned" strings. Return the interned string (string or string copy). Useful to gain a little performance on dictionary lookup (key comparisons after hashing can be done by a pointer compare instead of a string compare). Names used in programs are automatically interned. Dictionaries used to hold module, class or instance attributes have interned keys.
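On Python 3 the built-in intern() has moved to sys.intern(). A minimal sketch of the pointer-compare effect described above, with both strings built at run time so that they start out as separate objects:

```python
import sys

prefix = "strin"
a = prefix + "g"               # built at run time, a new str object
b = "".join(["str", "ing"])    # equal value, another new str object
print(a == b, a is b)

ia, ib = sys.intern(a), sys.intern(b)
print(ia is ib)                # both names now refer to the single interned copy
```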
  13. String interning: warning.

         m.print_meminfo()
         x = []
         for i in xrange(2**16):
             x.append(a * i)

         del x
         m.print_meminfo()

      Listing 9: String interning, without intern (m and a are defined outside the code shown on the slide)
      Memory start: (resident=7.8M, virtual=48.6M)
      Memory end: (resident=8.0M, virtual=48.7M)
      Time: (real 0m1.976s, user 0m0.584s, sys 0m1.384s)

         m.print_meminfo()
         x = []
         for i in xrange(2**16):
             x.append(intern(a * i))

         del x
         m.print_meminfo()

      Listing 10: String interning, with intern
      Memory start: (resident=7.8M, virtual=48.6M)
      Memory end: (resident=10.8M, virtual=51.5M)
      Time: (real 0m6.494s, user 0m5.232s, sys 0m1.236s)

      In this measurement, interning many distinct strings took longer and left more memory in use after del x.
  14. Notes on memory model
  15. Mutable containers: memory allocation strategy. Plan for growth and shrinkage:
      - Slightly overallocate the memory needed by the container.
      - Leave room for growth.
      - Shrink when the overallocation threshold is reached.
      - Reduce the number of expensive function calls: realloc(), memcpy().
      - Use an optimal layout.
      Applies to lists, sets and dictionaries.
  16. List allocation: example. [Figure: List growth example]
  17. List allocation strategy.
      - Represented as a fixed-length array of pointers.
      - Overallocation for list growth (by append). List capacity grows as: 4, 8, 16, 25, 35, 46, ...
      - For large lists, less than 12.5% overallocation.
      - Note that for lists of 1, 2 or 5 elements more space is wasted (75%, 50%, 37.5%).
      - Due to the memory actions involved, operations at the end of the list are cheap (rare realloc), while operations in the middle or at the beginning require a memory copy or shift!
      - List allocation size: 32 bits: 32 + (4 * length); 64 bits: 72 + (8 * length).
      - Shrinking happens only when the list size drops below 1/2 of the allocated space.
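The growth sequence above can be reproduced with a few lines of Python. This sketch transcribes the over-allocation rule from list_resize() in CPython 2.7 (an assumption about that specific version; newer releases use a different formula). next_allocation and capacities are hypothetical names for illustration.

```python
def next_allocation(newsize):
    """Over-allocation rule of CPython 2.7's list_resize(), transcribed to Python."""
    return (newsize >> 3) + (3 if newsize < 9 else 6) + newsize

def capacities(appends):
    """Capacities a list passes through while items are appended one by one."""
    allocated, seen = 0, []
    for size in range(1, appends + 1):
        if size > allocated:             # out of room: reallocate with headroom
            allocated = next_allocation(size)
            seen.append(allocated)
    return seen

growth = capacities(50)
print(growth)
```

The wasted-space figures follow directly: a 1-element list occupies 1 of 4 slots (75% unused), a 5-element list 5 of 8 (37.5% unused).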
  18. List allocation strategy: example.

         import sys
         a = []
         for i in xrange(9):
             a.append(i)
             print sys.getsizeof(a)
         # 104
         # 104
         # 104
         # 104
         # 136
         # 136
         # 136
         # 136
         # 200

      Listing 11: Using getsizeof to check list overallocation
  19. Overallocation of dictionaries/sets.
      - Represented as fixed-length hash tables.
      - Overallocation for dicts/sets is triggered when 2/3 of the capacity is reached.
      - Growth: if the number of elements is <= 50000, quadruple the capacity; else double the capacity.

         // dict growth strategy
         (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used;
         // set growth strategy
         so->used > 50000 ? so->used * 2 : so->used * 4;

      Dict/set growth/shrink code:

         for (newsize = PyDict_MINSIZE;
              newsize <= minused && newsize > 0;
              newsize <<= 1);

      - Shrinkage happens if the dictionary/set fill (real and dummy elements) is much larger than the used count (real elements), i.e. a lot of keys have been deleted.
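To see what these rules produce, they can be simulated in Python. This is a sketch of the CPython 2.7 dict behaviour as quoted above, assuming a minimum table size of 8 and insertions of distinct keys only (no deletions, so fill equals used); newer CPython versions use a different layout and growth factor. The function names are hypothetical.

```python
MINSIZE = 8  # PyDict_MINSIZE in CPython 2.7

def resized_capacity(used):
    """Smallest power of two above the growth target (4x used, 2x above 50000)."""
    minused = (2 if used > 50000 else 4) * used
    newsize = MINSIZE
    while newsize <= minused:
        newsize <<= 1
    return newsize

def resize_points(inserts):
    """(entry count, new capacity) at each resize while inserting distinct keys."""
    capacity, points = MINSIZE, []
    for used in range(1, inserts + 1):
        if used * 3 >= capacity * 2:     # table is at least 2/3 full
            capacity = resized_capacity(used)
            points.append((used, capacity))
    return points

points = resize_points(100)
print(points)
```

Under these assumptions the table jumps from 8 to 32 slots at the 6th key, then to 128 and 512, which is why small dictionaries are cheap to grow but sparse in memory.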
  20. Various data representation.

         # Fields: field1, field2, field3, ..., field8
         # Data: foo 1, foo 2, foo 3, ..., foo 8
         class OldStyleClass:  # only py-2.x
             ...
         class NewStyleClass(object):  # default for py-3.x
             ...
         class NewStyleClassSlots(object):
             __slots__ = ('field1', 'field2', ...)
             ...
         import collections as c
         NamedTuple = c.namedtuple('nt', ['field1', ...,])

         TupleData = ('value1', 'value2', ....)
         ListaData = ['value1', 'value2', ....]
         DictData = {'field1': 'value1', 'field2': 'value2', ....}

      Listing 12: Various data representation
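Two of these representations can be compared directly on Python 3. The sketch below is not the benchmark behind the slides: it measures with the standard library's tracemalloc, uses 10000 objects instead of 100000, and bytes_for is a hypothetical helper.

```python
import tracemalloc

FIELDS = tuple("field%d" % n for n in range(1, 9))   # field1 ... field8

class NewStyleClass:
    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)

class NewStyleClassSlots:
    __slots__ = FIELDS                   # no per-instance __dict__
    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)

def bytes_for(cls, count=10000):
    """Bytes still allocated after building `count` instances of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    objects = [cls(*["foo %d" % (j + i % 8) for j in range(8)])
               for i in range(count)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before

plain_bytes = bytes_for(NewStyleClass)
slots_bytes = bytes_for(NewStyleClassSlots)
print(plain_bytes, slots_bytes)
```

Both variants allocate the same strings, so the difference between the two numbers comes from the per-instance attribute storage.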
  21. Various data representation: allocated memory. [Figure: Allocated memory (axis 0 MB to 150 MB) after creating 100000 objects with 8 fields each, for NewStyleClassWithSlots, ListaData, TupleData, NamedTuple, DictData, NewStyleClass and OldStyleClass, under Python 2.x and Python 3.x]
  22. Various data representation: allocated memory. [Figure: Allocated memory (axis 0 MB to 350 MB) after creating 100000 objects with 8 fields each, for tuple_fields, OldStyleClass, NewStyleClassSlots, NewStyleClass, namedtuples_fields, list_fields and dict_fields, under Python 2.7, Stackless Python 2.7, PyPy and Jython]
  23. Notes on garbage collector, reference count and cycles.
      Python garbage collector:
      - Uses reference counting.
      - Offers cycle detection.
      - Objects are garbage-collected when the count goes to 0.
      - Reference increment, e.g.: object creation, additional aliases, passing to a function.
      - Reference decrement, e.g.: local reference goes out of scope, alias is destroyed, alias is reassigned.
      Warning, from the documentation: Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable! Python does not collect such cycles automatically. It is not possible for Python to guess a safe order in which to run the __del__() methods.
  24. Collectable garbage: recipe.

         class CollectableGarbage:
             pass

         a = CollectableGarbage()
         b = CollectableGarbage()
         a.x = b
         b.x = a

         del a
         del b
         import gc
         print gc.collect()  # 4
         print gc.garbage
         # []

      Listing 13: Garbage in Python
  25. Uncollectable garbage: recipe.

         class Garbage:
             def __del__(self): pass

         a = Garbage()
         b = Garbage()
         a.x = b
         b.x = a

         del a
         del b
         import gc
         print gc.collect()  # 4
         print gc.garbage
         # [<__main__.Garbage instance at 0x1071490e0>, <__main__.Garbage instance at 0x107149128>]

      Listing 14: Garbage in Python
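The uncollectable case applies to Python 2 and to Python 3 before 3.4. Since Python 3.4 (PEP 442), cycles whose objects define __del__ are collected as well. A short sketch for checking this on the interpreter at hand, assuming CPython 3.4 or newer:

```python
import gc

class Garbage:
    def __del__(self):
        pass

gc.collect()                 # start from a clean state
a = Garbage()
b = Garbage()
a.x = b
b.x = a                      # reference cycle between a and b
del a, b

collected = gc.collect()     # number of unreachable objects found
leftover = [o for o in gc.garbage if isinstance(o, Garbage)]
print(collected, leftover)
```

If leftover is empty, the cycle was reclaimed even though both objects have a finalizer.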
  26. Notes on GC in other Python versions.
      Jython: uses the JVM's built-in garbage collection, so there is no need to copy CPython's reference-counting implementation.
      PyPy: supports pluggable garbage collectors, so various GCs are available. The default is incminimark, which does "major collections incrementally (i.e. one major collection is split along some number of minor collections, rather than being done all at once after a specific minor collection)".
  27. Memory profiling tools
  28. Tools:
      - time
      - psutil
      - memory_profiler
      - objgraph
      - Meliae (could be combined with runsnakerun)
      - Heapy
      - Valgrind and Massif (and Massif Visualizer)
  29. Tools: time, simple but useful. Use "/usr/bin/time -v" rather than plain "time", which is usually a shell built-in with different output. Reported values include: average total (data+stack+text) memory use of the process, in kilobytes; maximum resident set size of the process during its lifetime, in kilobytes. See the manual for more.
  30. Tools: time, simple but useful.

         Command being timed: python universe-new.py
         User time (seconds): 0.38
         System time (seconds): 1.61
         Percent of CPU this job got: 26%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.46
         Average shared text size (kbytes): 0
         Average unshared data size (kbytes): 0
         Average stack size (kbytes): 0
         Average total size (kbytes): 0
         Maximum resident set size (kbytes): 22900
         Average resident set size (kbytes): 0
         Major (requiring I/O) page faults: 64
         Minor (reclaiming a frame) page faults: 6370
         Voluntary context switches: 3398
         Involuntary context switches: 123
         Swaps: 0
         File system inputs: 25656
         File system outputs: 0
         Socket messages sent: 0
         Socket messages received: 0
         Signals delivered: 0
         Page size (bytes): 4096
         Exit status: 0

      Listing 15: Results
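The "Maximum resident set size" figure is also available from inside a running program through the standard library's resource module. This is a Unix-only sketch (the module does not exist on Windows); peak_rss_kib is a hypothetical helper, and the unit conversion reflects that ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.

```python
import resource
import sys

def peak_rss_kib():
    """Peak resident set size of this process, in KiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak // 1024 if sys.platform == "darwin" else peak

before = peak_rss_kib()
data = [str(i) for i in range(200000)]   # allocate something measurable
after = peak_rss_kib()
print(before, after)
```

Because this is a high-water mark, the value never decreases, even after data is deleted.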
  31. Tools: psutil. psutil is a cross-platform process and system utilities module for Python.

         import psutil
         import os
         ...
         p = psutil.Process(os.getpid())
         pinfo = p.as_dict()
         ...
         print pinfo['memory_percent'],
         print pinfo['memory_info'].rss, pinfo['memory_info'].vms

      Listing 16: psutil example
  32. Tools: memory_profiler. memory_profiler is a module for monitoring the memory usage of a Python program. Recommended dependency: psutil. It may work as:
      - Line-by-line profiler.
      - Memory usage monitoring (memory in time).
      - Debugger trigger: setting debugger breakpoints.
  33. memory_profiler: line-by-line profiler.
      Preparation: to track particular functions, use the @profile decorator.
      Running:

         python -m memory_profiler

      Listing 17: Results

         Line #    Mem usage    Increment   Line Contents
         ================================================
             45    9.512 MiB    0.000 MiB   @profile
             46                             def create_lot_of_stuff(times=10000, cl=OldStyleClass):
             47    9.516 MiB    0.004 MiB       ret = []
             48    9.516 MiB    0.000 MiB       t = "foo %d"
             49  156.449 MiB  146.934 MiB       for i in xrange(times):
             50  156.445 MiB   -0.004 MiB           l = [t % (j + i % 8) for j in xrange(8)]
             51  156.449 MiB    0.004 MiB           c = cl(*l)
             52  156.449 MiB    0.000 MiB           ret.append(c)
             53  156.449 MiB    0.000 MiB       return ret
  34. memory_profiler: memory usage monitoring.
      Preparation: to track particular functions, use the @profile decorator.
      Running and plotting:

         mprof run --python python uniwerse.py -f 100 100 -s 100 100 10
         mprof plot

      [Figure: Results]
  35. memory_profiler: debugger trigger.

         eror@eror-laptop:~$ python -m memory_profiler --pdb-mmem=10 uniwerse.py -s 100 100 10
         Current memory 20.80 MiB exceeded the maximumof 10.00 MiB
         Stepping into the debugger
         > /home/eror/uniwerse.py(52)connect()
         -> self.adj.append(n)
         (Pdb)

      Listing 18: Debugger trigger: setting debugger breakpoints.
  36. Tools: objgraph. objgraph draws Python object reference graphs with graphviz.

         import objgraph
         x = []
         y = [x, [x], dict(x=x)]
         objgraph.show_refs([y], filename='sample-graph.png')
         objgraph.show_backrefs([x], filename='sample-backref-graph.png')

      Listing 19: Tutorial example
      [Figure: Reference graph] [Figure: Back reference graph]
  37. Tools: Heapy/Meliae.
      Heapy: the heap analysis toolset. It can be used to find information about the objects in the heap and display the information in various ways. Part of "Guppy-PE - A Python Programming Environment".
      Meliae: Python Memory Usage Analyzer. "This project is similar to heapy (in the 'guppy' project), in its attempt to understand how memory has been allocated." GUI support through runsnakerun.
  38. Tools: Heapy.

         from guppy import hpy
         hp = hpy()
         h1 = hp.heap()
         l = [range(i) for i in xrange(2**10)]
         h2 = hp.heap()
         print h2 - h1

      Listing 20: Heapy example

         Partition of a set of 294937 objects. Total size = 11538088 bytes.
          Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
              0 293899 100  7053576  61    7053576  61 int
              1   1025   0  4481544  39   11535120 100 list
              2      6   0     1680   0   11536800 100 dict (no owner)
              3      2   0      560   0   11537360 100 dict of guppy.etc.Glue.Owner
              4      1   0      456   0   11537816 100 types.FrameType
              5      2   0      144   0   11537960 100 guppy.etc.Glue.Owner
              6      2   0      128   0   11538088 100 str

      Listing 21: Results
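A similar before/after comparison can be sketched with the standard library's tracemalloc module (Python 3.4+). This is a different tool from Heapy and not its API: it groups allocations by the source line that made them rather than by object type, and the list is smaller than in the listing above to keep the run short.

```python
import tracemalloc

tracemalloc.start()
snapshot1 = tracemalloc.take_snapshot()
l = [list(range(i)) for i in range(2 ** 8)]
snapshot2 = tracemalloc.take_snapshot()
tracemalloc.stop()

# Differences between the two snapshots, grouped by allocating line
stats = snapshot2.compare_to(snapshot1, "lineno")
for stat in stats[:3]:
    print(stat)
total_growth = sum(stat.size_diff for stat in stats)
print(total_growth)
```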
  39. Meliae and runsnakerun.

         from meliae import scanner
         scanner.dump_all_objects('representation_meliae.dump')
         # In shell: runsnakemem representation_meliae.dump

      Listing 22: Meliae example
      [Figure: Meliae and runsnakerun]
  40. Valgrind and Massif.
      Valgrind: a programming tool for memory debugging, leak detection, and profiling. Rather low level.
      Massif: a heap profiler. Measures how much heap memory programs use.

         valgrind --trace-children=yes --tool=massif python src.py
         ms_print massif.out.*

      Listing 23: Valgrind and Massif

         Number of snapshots: 50
         Detailed snapshots: [2, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 26,
         --------------------------------------------------------------------------------
           n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
         --------------------------------------------------------------------------------
           0              0                0                0             0            0
           1    100,929,329        2,811,592        2,786,746        24,846            0
           2    183,767,328        4,799,320        4,754,218        45,102            0
  41. Valgrind and Massif. [Figure: ms_print heap usage graph; vertical axis in MB with a peak of 75.66, horizontal axis from 0 to 3.211 Gi]
  42. Massif Visualizer. "Massif Visualizer is a tool that - who'd guess that - visualizes massif data." [Figure: Massif Visualizer]
  43. Other useful tools.
      Web application memory leaks:
      - dowser: CherryPy application that displays sparklines of Python object counts.
      - dozer: WSGI middleware version of the CherryPy memory leak debugger (works with any WSGI application).
      Build Python in debug mode (./configure --with-pydebug ...):
      - Maintains a list of all active objects.
      - Upon exit (or after every statement in interactive mode), prints all existing references.
      - Tracks total allocation.
      valgrind (examples on earlier slides):
      - CPython can cooperate with valgrind (for >= py-2.7, py-3.2).
      - Use the special build option "--with-valgrind" for more.
      gdb-heap (gdb extension):
      - low level, still experimental
      - can be attached to running processes
      - may be used with a core file
  44. Notes on malloc() in CPython
  45. Notes on malloc allocation: malloc memory allocation in Linux. GLIBC malloc uses both brk and mmap for memory allocation:
      - brk()/sbrk() syscalls, which increase or decrease a contiguous region of memory allocated to the process;
      - mmap()/munmap() syscalls, which manage an arbitrary amount of memory and map it into the virtual address space of the process.
      The allocation strategy may be partially controlled. [Figure: brk example]
  46. Notes on malloc() in CPython. Warning about the example: current CPython implementations are not affected. The following example did not affect every OS: there are examples of vulnerable Linux configurations, whereas Mac OS X was not affected. The problem has probably been effectively eliminated (it won't affect modern systems).
  47. Notes on malloc() in CPython.

         import gc
         if __name__ == '__main__':
             meminfo.print_meminfo()
             l = []
             for i in xrange(1, 100):
                 ll = [{} for j in xrange(1000000 / i)]
                 ll = ll[::2]
                 l.extend(ll)

             meminfo.print_meminfo()
             del l
             del ll
             gc.collect()
             meminfo.print_meminfo()

      Listing 24: Evil example
  48. Notes on malloc() in CPython.

         0.4% (resident=7.4M, virtual=46.5M)
         36.9% (resident=739.7M, virtual=779.4M)
         35.9% (resident=720.0M, virtual=759.2M)

      Listing 25: Affected system

         0.4% (resident=7.6M, virtual=53.9M)
         38.3% (resident=765.9M, virtual=813.6M)
         1.1% (resident=22.9M, virtual=70.1M)

      Listing 26: Not affected system
  49. malloc() alternatives: libjemalloc and libtcmalloc.
      Pros: in some cases, using a different malloc() implementation "may" help to return memory from CPython back to the system.
      Cons: but equally it may work against you.

         $ LD_PRELOAD=/usr/lib/libjemalloc.so.1 python int_float_alloc.py
         $ LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.4 python int_float_alloc.py

      Listing 27: Changing memory allocator
  50. Caution: notes on malloc() alternatives. Things to keep in mind: malloc() alternatives use different memory allocation strategies, which may drastically change the memory consumption of your program. When considering a malloc replacement:
      - Check memory usage at various checkpoints.
      - Check the minimum and maximum memory consumption between control points!
      - Compare performance (as this may also change).
  51. malloc() alternatives: libjemalloc and libtcmalloc.

         Step      malloc            jemalloc           tcmalloc
                   res     virt      res     virt       res     virt
         step 1    7.4M    46.5M     8.0M    56.9M      9.4M    56.1M
         step 2    40.0M   79.1M     41.6M   88.9M      42.5M   89.3M
         step 3    16.2M   55.3M     8.2M    88.9M      42.5M   89.3M
         step 4    40.0M   84.3M     41.5M   100.9M     51.5M   98.4M
         step 5    8.2M    47.3M     8.5M    100.9M     51.5M   98.4M
  52. Summary
  53. Summary:
      - Try to better understand the underlying memory model.
      - Pay attention to hot spots.
      - Use profiling tools.
      - "Seek and destroy": find the root cause of the memory leak and fix it ;)
      Quick and sometimes dirty solutions:
      - Delegate memory-intensive work to another process.
      - Regularly restart the process.
      - Go for low-hanging fruit (e.g. __slots__, different allocators).
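Delegating memory-intensive work to another process can be as simple as running it in a short-lived child interpreter. A minimal Python 3.7+ sketch using only the standard library; CHILD_CODE and run_in_child are hypothetical names, and the workload is a stand-in for a real memory-hungry step.

```python
import subprocess
import sys

CHILD_CODE = """
data = [str(i) * 10 for i in range(200000)]   # memory-hungry step
print(sum(len(s) for s in data))               # only the small result leaves the child
"""

def run_in_child(code):
    """Run code in a fresh interpreter and return what it printed."""
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

total = int(run_in_child(CHILD_CODE))
print(total)
```

Whatever the child allocated goes back to the operating system when it exits, regardless of how the allocator in the parent behaves.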
