Have you ever wondered what happens to all the precious RAM after running your 'simple' CPython code? Prepare yourself for a short introduction to CPython memory management! This presentation will try to answer some memory related questions you always wondered about. It will also discuss basic memory profiling tools and techniques.
Everything You Always Wanted to Know About Memory in Python - But Were Afraid to Ask (extended)
1. Basic stuff Notes on memory model Memory profiling tools Notes on malloc() in CPython Summary
Everything You Always Wanted to Know About
Memory in Python
But Were Afraid to Ask
(extended)
Piotr Przymus
Nicolaus Copernicus University
PyConPL 2014,
Szczyrk
P. Przymus 1/53
About Me
Piotr Przymus
PhD student / Research Assistant at Nicolaus Copernicus University.
Interests: databases, GPGPU computing, data mining, high-performance computing.
8 years of Python experience.
Some of my Python projects:
Worked on parts of a trading platform at turbineam.com (back testing, trading algorithms).
Mussels bio-monitoring analysis and data mining software.
Simulator of a heterogeneous processing environment for evaluation of database query scheduling algorithms.
Basic stuff
Size of objects
Table: Size of different types in bytes
Type | 32 bit | 64 bit
int (py-2.7) | 12 | 24
long (py-2.7) / int (py-3.3) | 14 + 2 · number of digits | 30 + 2 · number of digits
float | 16 | 24
complex | 24 | 32
str (py-2.7) / bytes (py-3.3) | 24 + 2 · length | 40 + 2 · length
unicode (py-2.7) / str (py-3.3) | 28 + (2 or 4) · length | 52 + (2 or 4) · length
tuple | 24 + 4 · length | 64 + 8 · length
Size of objects
sys.getsizeof(obj)
From documentation
Since Python 2.6
Return the size of an object in bytes. The object can be any type.
All built-in objects will return correct results.
May not be true for third-party extensions as it is implementation
specific.
Calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
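The relationship between getsizeof and __sizeof__ can be checked directly. The following is a minimal sketch, written for Python 3 on CPython (an assumption, since the slides mostly target py-2.7); the helper name gc_overhead is hypothetical and used only for illustration.

```python
import sys

def gc_overhead(obj):
    """Bytes that sys.getsizeof adds on top of the object's own __sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()

# An int is not tracked by the garbage collector, so nothing is added.
print(gc_overhead(42))
# A list is tracked by the garbage collector, so a header may be added.
print(gc_overhead([1, 2, 3]))
```

The exact overhead depends on the interpreter build, so the sketch prints it rather than assuming a value.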
Size of containers
sys.getsizeof and containers
Note that getsizeof returns the size of the container object itself, not the size of the data referenced by the container.
import sys
a = ['Foo' * 100, 'Bar' * 100, 'SpamSpamSpam' * 100]
b = [1, 2, 3]
print sys.getsizeof(a), sys.getsizeof(b)
# 96 96
Listing 1: getsizeof and containers
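To estimate the memory held by a container together with its contents, the elements have to be visited explicitly. Below is a rough sketch for Python 3; the function total_size is a hypothetical helper, it only descends into built-in containers, and it is not how any of the profiling tools discussed later work internally.

```python
import sys

def total_size(obj, seen=None):
    """Approximate size of obj plus everything reachable through built-in containers."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size

a = ['Foo' * 100, 'Bar' * 100, 'SpamSpamSpam' * 100]
print(sys.getsizeof(a), total_size(a))   # container only vs. container plus strings
```

The seen set matters: without it, an object referenced twice (or a reference cycle) would be counted more than once or recursed into forever.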
Objects interning – fun example
a = [i % 257 for i in xrange(2**20)]
Listing 2: List of interned integers
b = [1024 + i % 257 for i in xrange(2**20)]
Listing 3: List of integers
Is there any allocation difference between Listing 2 and Listing 3?
Results measured using psutil:
Listing 2 – (resident=15.1M, virtual=2.3G)
Listing 3 – (resident=39.5M, virtual=2.4G)
Objects interning – explained
Objects and variables – general rule
Objects are allocated on assignment (e.g. a = "spam", b = 3.2).
Variables just point to objects (i.e. they do not hold the memory).
Interning of Objects
This is an exception to the general rule.
Python implementation specific (examples from CPython).
"Often" used objects are preallocated and shared instead of performing a costly new allocation.
Mainly a performance optimization.
a = 0; b = 0
a is b, a == b
# (True, True)
Listing 6: Interning of Objects
a = 1024; b = 1024
a is b, a == b
# (False, True)
Listing 7: Objects allocation
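Listings 6 and 7 show an interactive session; when the same statements are compiled together as a script, the compiler may share equal constants and hide the effect. The sketch below (Python 3, CPython assumed) avoids that by creating the integers at run time. The function name same_object is hypothetical.

```python
def same_object(text):
    """Build the same integer twice at run time; report (identity, equality)."""
    first = int(text)     # created at run time, so no compile-time constant sharing
    second = int(text)
    return first is second, first == second

print(same_object("0"))      # small integer, served from the preallocated cache
print(same_object("1024"))   # larger integer, normally two separate objects in CPython
```

Because this is an implementation detail, code should compare numbers with == and never rely on is.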
Objects interning – behind the scenes
Warning
This is Python implementation dependent.
This may change in the future.
This is not documented because of the above reasons.
For reference consult the source code.
CPython 2.7 - 3.4
Single instances for:
int – in range [−5, 257)
str (py-2.7) / bytes (py-3.x) – empty string and all length=1 strings
unicode (py-2.7) / str (py-3.x) – empty string and all length=1 strings for Latin-1
tuple – empty tuple
String interning – example
a, b = 'strin', 'string'
a + 'g' is b  # returns False
intern(a + 'g') is intern(b)  # returns True
a = ['spam %d' % (i % 257)
     for i in xrange(2**20)]
# memory usage (resident=57.6M, virtual=2.4G)
a = [intern('spam %d' % (i % 257))
     for i in xrange(2**20)]
# memory usage (resident=14.9M, virtual=2.3G)
Listing 8: String interning
String interning – explained
String interning definition
String interning is a method of storing only one copy of each distinct string
value, which must be immutable.
intern (py-2.x) / sys.intern (py-3.x)
From CPython documentation:
Enter string in the table of “interned” strings.
Return the interned string (string or string copy).
Useful to gain a little performance on dictionary lookup (key
comparisons after hashing can be done by a pointer compare instead of
a string compare).
Names used in programs are automatically interned
Dictionaries used to hold module, class or instance attributes have
interned keys.
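In Python 3 the built-in intern moved to sys.intern. A minimal sketch of the behaviour described above, assuming Python 3; the variable names are illustrative only.

```python
import sys

prefix = "".join(["str", "in"])          # built at run time
plain = prefix + "g"                     # a fresh string with the value "string"
interned_a = sys.intern(prefix + "g")    # entered into the interned-strings table
interned_b = sys.intern("string")

print(plain == interned_b)               # equal values
print(interned_a is interned_b)          # one shared object after interning
```

Both interned names refer to the single copy stored in the table, which is what makes pointer comparison of dictionary keys possible.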
String interning – warning
Without interning:
m.print_meminfo()
x = []
for i in xrange(2**16):
    x.append('a' * i)

del x
m.print_meminfo()
Listing 9: String interning
Memory start: (resident=7.8M, virtual=48.6M)
Memory end: (resident=8.0M, virtual=48.7M)
Time: (real 0m1.976s, user 0m0.584s, sys 0m1.384s)
With interning:
m.print_meminfo()
x = []
for i in xrange(2**16):
    x.append(intern('a' * i))

del x
m.print_meminfo()
Listing 10: String interning
Memory start: (resident=7.8M, virtual=48.6M)
Memory end: (resident=10.8M, virtual=51.5M)
Time: (real 0m6.494s, user 0m5.232s, sys 0m1.236s)
Notes on memory model
Mutable Containers Memory Allocation Strategy
Plan for growth and shrinkage
Slightly overallocate memory needed by the container.
Leave room for growth.
Shrink when the overallocation threshold is reached.
Reduce the number of expensive function calls:
realloc()
memcpy()
Use optimal layout.
Lists, Sets, Dictionaries
List allocation – example
Figure: List growth example
List allocation strategy
Represented as fixed-length array of pointers.
Overallocation for list growth (by append).
List size growth: 4, 8, 16, 25, 35, 46, ...
For large lists less than 12.5% overallocation.
Note that for lists of 1, 2 or 5 elements more space is wasted (75%, 50%, 37.5%).
Due to the memory actions involved, operations:
at the end of the list are cheap (rare realloc),
in the middle or at the beginning require a memory copy or shift!
List allocation size:
32 bits – 32 + (4 * length)
64 bits – 72 + (8 * length)
Shrinking only when list size < 1/2 of allocated space.
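The growth sequence above can be reproduced from the over-allocation formula used by list_resize() in CPython 2.7, new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize. The sketch below is a simulation in Python, not the interpreter's code; the function name append_growth is hypothetical, and later CPython versions use a slightly different formula.

```python
def append_growth(n_appends):
    """Capacities a list passes through during n_appends appends
    (simulation of the CPython 2.7 over-allocation rule)."""
    allocated = 0
    capacities = []
    for newsize in range(1, n_appends + 1):
        if newsize > allocated:  # array is full, a realloc is needed
            allocated = (newsize >> 3) + (3 if newsize < 9 else 6) + newsize
            capacities.append(allocated)
    return capacities

print(append_growth(46))   # [4, 8, 16, 25, 35, 46]
```

Only six reallocations are needed for 46 appends, which is why appending at the end of a list is cheap on average.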
List allocation strategy - example
import sys
a = []
for i in xrange(9):
    a.append(i)
    print sys.getsizeof(a)
# 104
# 104
# 104
# 104
# 136
# 136
# 136
# 136
# 200
Listing 11: Using getsizeof to check list overallocation
Overallocation of dictionaries/sets
Represented as fixed-length hash tables.
Overallocation for dict/sets – triggered when 2/3 of capacity is reached:
if number of elements <= 50000: quadruple the capacity
else: double the capacity
// dict growth strategy
(mp->ma_used > 50000 ? 2 : 4) * mp->ma_used;
// set growth strategy
(so->used > 50000 ? so->used*2 : so->used*4);
Dict/Set growth/shrink code
for (newsize = PyDict_MINSIZE;
     newsize <= minused && newsize > 0;
     newsize <<= 1);
Shrinkage happens if the dictionary/set fill (real and dummy elements) is much larger than the number of used elements (real elements), i.e. a lot of keys have been deleted.
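The resize rule can be sketched in Python to see which table sizes it produces. This is a simulation under the assumption of the CPython 2.7 constants (minimum table size 8, threshold 50000); the names needs_resize and grown_table_size are hypothetical.

```python
PYDICT_MINSIZE = 8  # initial table size in CPython 2.7

def needs_resize(fill, table_size):
    """True once at least 2/3 of the table slots are filled."""
    return fill * 3 >= table_size * 2

def grown_table_size(used):
    """Table size chosen after a resize: the smallest power of two
    (starting from PYDICT_MINSIZE) strictly greater than the requested minimum."""
    minused = (2 if used > 50000 else 4) * used
    newsize = PYDICT_MINSIZE
    while newsize <= minused:
        newsize <<= 1
    return newsize

print(needs_resize(5, 8), needs_resize(6, 8))   # False True
print(grown_table_size(6))                      # 32
```

So a fresh 8-slot dictionary jumps straight to 32 slots when the sixth key is inserted, rather than to 16.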
Various data representation
# Fields: field1, field2, field3, ..., field8
# Data: foo 1, foo 2, foo 3, ..., foo 8
class OldStyleClass:  # only py-2.x
    ...
class NewStyleClass(object):  # default for py-3.x
    ...
class NewStyleClassSlots(object):
    __slots__ = ('field1', 'field2', ...)
    ...
import collections as c
NamedTuple = c.namedtuple('nt', ['field1', ...])

TupleData = ('value1', 'value2', ...)
ListaData = ['value1', 'value2', ...]
DictData = {'field1': 'value1', ...}
Listing 12: Various data representation
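Why the __slots__ variant is smaller can be seen on a single instance: a regular instance carries a per-object __dict__, a slotted one does not. A minimal sketch for Python 3, using two fields instead of eight; the class names Plain and Slotted are illustrative and the measured numbers vary between interpreter versions.

```python
import sys

class Plain(object):
    def __init__(self, field1, field2):
        self.field1 = field1
        self.field2 = field2

class Slotted(object):
    __slots__ = ('field1', 'field2')   # fixed attribute layout, no per-instance dict

    def __init__(self, field1, field2):
        self.field1 = field1
        self.field2 = field2

plain = Plain('foo 1', 'foo 2')
slotted = Slotted('foo 1', 'foo 2')

# A plain instance pays for the object and for its attribute dictionary.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
print(plain_size, slotted_size)
print(hasattr(slotted, '__dict__'))    # False
```

The field values themselves are shared in both cases, so only the per-instance bookkeeping differs.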
Various data representation – allocated memory
Figure: Allocated memory after creating 100000 objects with 8 fields each, Python 2.x vs Python 3.x (bar chart; x-axis 0 MB to 150 MB; bars for NewStyleClassWithSlots, ListaData, TupleData, NamedTuple, DictData, NewStyleClass, OldStyleClass)
Various data representation – allocated memory
Figure: Allocated memory after creating 100000 objects with 8 fields each – Python 2.7, Stackless Python 2.7, PyPy, Jython (bar chart; x-axis 0 MB to 350 MB; bars for tuple_fields, OldStyleClass, NewStyleClassSlots, NewStyleClass, namedtuples_fields, list_fields, dict_fields; legend: slpython2.7, python, pypy, jython)
Notes on garbage collector, reference count and cycles
Python garbage collector
Uses reference counting.
Offers cycle detection.
Objects are garbage-collected when their reference count drops to 0.
Reference increment, e.g.: object creation, additional aliases, being passed to a function.
Reference decrement, e.g.: a local reference goes out of scope, an alias is destroyed, an alias is reassigned.
Warning – from documentation
Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable! (This applies to Python 2.x and to Python 3 before 3.4; PEP 442 lifted the restriction in 3.4.)
Python does not collect such cycles automatically.
It is not possible for Python to guess a safe order in which to run the __del__() methods.
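Reference increments and decrements can be observed with sys.getrefcount. A minimal sketch for CPython 3; the absolute numbers include a temporary reference held by the call itself, so only the differences are meaningful.

```python
import sys

data = []
baseline = sys.getrefcount(data)      # includes the temporary reference of the call

alias = data                          # additional alias: count goes up by one
after_alias = sys.getrefcount(data)

del alias                             # alias destroyed: count goes back down
after_del = sys.getrefcount(data)

print(after_alias - baseline, after_del - baseline)
```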
Collectable garbage – recipe
class CollectableGarbage:
    pass

a = CollectableGarbage()
b = CollectableGarbage()
a.x = b
b.x = a

del a
del b
import gc
print gc.collect()  # 4
print gc.garbage
# []
Listing 13: Garbage in Python
Uncollectable garbage – recipe
class Garbage:
    def __del__(self): pass

a = Garbage()
b = Garbage()
a.x = b
b.x = a

del a
del b
import gc
print gc.collect()  # 4
print gc.garbage
# [<__main__.Garbage instance at 0x1071490e0>, <__main__.Garbage instance at 0x107149128>]
Listing 14: Garbage in Python
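On CPython 3.4 and later the same recipe no longer leaves uncollectable garbage, because finalizers in cycles are handled by the collector (PEP 442). The sketch below is a Python 3 variant of Listing 14; it uses a weak reference, an addition not present in the original listing, to check that the objects are really gone.

```python
import gc
import weakref

class Garbage(object):
    def __del__(self):
        pass

a = Garbage()
b = Garbage()
a.x = b
b.x = a
probe = weakref.ref(a)   # lets us check later whether the object still exists

del a
del b
gc.collect()
print(probe() is None)   # the cycle was collected despite __del__
```

Run under Python 2.7, the equivalent code would keep both instances alive in gc.garbage, as shown in Listing 14.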
Notes on GC in other Python versions
Jython
Uses the JVM's built-in garbage collection – so there is no need to copy CPython's reference-counting implementation.
PyPy
Supports pluggable garbage collectors – so various GCs are available.
The default is incminimark, which does "major collections incrementally (i.e. one major collection is split along some number of minor collections, rather than being done all at once after a specific minor collection)".
Memory profiling tools
Tools
time
psutil
memory profiler
objgraph
Meliae (could be combined with runsnakerun)
Heapy
Valgrind and Massif (and Massif Visualizer)
Tools – time, simple but useful
time
Simple but useful
Use "/usr/bin/time -v" and not "time", as the latter is usually a shell built-in that behaves differently.
Reports, among others:
Average total (data+stack+text) memory use of the process, in kilobytes.
Maximum resident set size of the process during its lifetime, in kilobytes.
See the manual for more.
Tools – time, simple but useful
Command being timed: python universe-new.py
User time (seconds): 0.38
System time (seconds): 1.61
Percent of CPU this job got: 26%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.46
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 22900
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 64
Minor (reclaiming a frame) page faults: 6370
Voluntary context switches: 3398
Involuntary context switches: 123
Swaps: 0
File system inputs: 25656
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Listing 15: Results
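The "Maximum resident set size" figure can also be read from inside a running Python process with the standard resource module (Unix only). A minimal sketch for Python 3; the helper name peak_rss_kb is hypothetical, and the unit handling assumes Linux reports kilobytes while macOS reports bytes.

```python
import resource
import sys

def peak_rss_kb():
    """Peak resident set size of the current process, in kilobytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        peak //= 1024            # macOS reports bytes, Linux reports kilobytes
    return peak

before = peak_rss_kb()
data = list(range(10**6))        # allocate something noticeable
after = peak_rss_kb()
print(before, after)
```

Since this is a peak value, it never decreases during the lifetime of the process, even after the data is freed.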
Tools – psutil
psutil – A cross-platform process and system utilities module for Python.
import psutil
import os
...
p = psutil.Process(os.getpid())
pinfo = p.as_dict()
...
print pinfo['memory_percent'],
print pinfo['memory_info'].rss, pinfo['memory_info'].vms
Listing 16: psutil usage
Tools – memory profiler
memory profiler – a module for monitoring memory usage of a Python program.
Recommended dependency: psutil.
May work as:
Line-by-line profiler.
Memory usage monitoring (memory in time).
Debugger trigger – setting debugger breakpoints.
memory profiler – Line-by-line profiler
Preparation
To track particular functions use profile decorator.
Running
python -m memory_profiler
Line #    Mem usage    Increment   Line Contents
================================================
    45    9.512 MiB    0.000 MiB   @profile
    46                             def create_lot_of_stuff(times=10000, cl=OldStyleClass):
    47    9.516 MiB    0.004 MiB       ret = []
    48    9.516 MiB    0.000 MiB       t = 'foo %d'
    49  156.449 MiB  146.934 MiB       for i in xrange(times):
    50  156.445 MiB   -0.004 MiB           l = [t % (j + i % 8) for j in xrange(8)]
    51  156.449 MiB    0.004 MiB           c = cl(*l)
    52  156.449 MiB    0.000 MiB           ret.append(c)
    53  156.449 MiB    0.000 MiB       return ret
Listing 17: Results
memory profiler – memory usage monitoring
Preparation
To track particular functions use profile decorator.
Running and plotting
mprof run --python python uniwerse.py -f 100 100 -s 100 100 10
mprof plot
Figure: Results
memory profiler – Debugger trigger
eror@eror-laptop:~$ python -m memory_profiler --pdb-mmem=10 uniwerse.py -s 100 100 10
Current memory 20.80 MiB exceeded the maximum of 10.00 MiB
Stepping into the debugger
> /home/eror/uniwerse.py(52)connect()
-> self.adj.append(n)
(Pdb)
Listing 18: Debugger trigger – setting debugger breakpoints.
Tools – objgraph
objgraph – draws Python object reference graphs with graphviz.
import objgraph
x = []
y = [x, [x], dict(x=x)]
objgraph.show_refs([y], filename='sample-graph.png')
objgraph.show_backrefs([x], filename='sample-backref-graph.png')
Listing 19: Tutorial example
Figure: Reference graph Figure: Back reference graph
Tools – Heapy/Meliae
Heapy
The heap analysis toolset. It can be used to find information about the objects in the heap and display the information in various ways.
Part of "Guppy-PE – A Python Programming Environment".
Meliae
Python Memory Usage Analyzer
"This project is similar to heapy (in the 'guppy' project), in its attempt to understand how memory has been allocated."
runsnakerun GUI support.
Tools – Heapy
from guppy import hpy
hp = hpy()
h1 = hp.heap()
l = [range(i) for i in xrange(2**10)]
h2 = hp.heap()
print h2 - h1
Listing 20: Heapy example
Partition of a set of 294937 objects. Total size = 11538088 bytes.
 Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
     0 293899 100  7053576  61    7053576  61 int
     1   1025   0  4481544  39   11535120 100 list
     2      6   0     1680   0   11536800 100 dict (no owner)
     3      2   0      560   0   11537360 100 dict of guppy.etc.Glue.Owner
     4      1   0      456   0   11537816 100 types.FrameType
     5      2   0      144   0   11537960 100 guppy.etc.Glue.Owner
     6      2   0      128   0   11538088 100 str
Listing 21: Results
Meliae and runsnakerun
from meliae import scanner
scanner.dump_all_objects('representation_meliae.dump')
# In shell: runsnakemem representation_meliae.dump
Listing 22: Meliae example
Figure: Meliae and runsnakerun
Valgrind and Massif
Valgrind – a programming tool for memory debugging, leak detection,
and profiling. Rather low level.
Massif – a heap profiler. Measures how much heap memory programs
use.
valgrind --trace-children=yes --tool=massif python src.py
ms_print massif.out.*
Listing 23: Valgrind and Massif
Number of snapshots: 50
Detailed snapshots: [2, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 26, ...
--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1    100,929,329        2,811,592        2,786,746        24,846            0
  2    183,767,328        4,799,320        4,754,218        45,102            0
Massif Visualizer
"Massif Visualizer is a tool that - who'd guess that - visualizes massif data."
Figure: Massif Visualizer
Other useful tools
Web applications memory leaks
dowser – CherryPy application that displays sparklines of Python object counts.
dozer – WSGI middleware version of the CherryPy memory leak debugger (works with any WSGI application).
Build Python in debug mode (./configure --with-pydebug ...).
Maintains a list of all active objects.
Upon exit (or after every statement in interactive mode), prints all existing references.
Tracks total allocation.
valgrind (examples on earlier slides)
CPython can cooperate with valgrind (for >= py-2.7, py-3.2).
Use the special build option "--with-valgrind" for more.
gdb-heap (gdb extension)
low level, still experimental
can be attached to running processes
may be used with a core file
Notes on malloc() in CPython
Notes on malloc allocation
malloc memory allocation in Linux
glibc malloc uses both brk and mmap for memory allocation:
the brk()/sbrk() syscalls, which increase or decrease a contiguous region of memory allocated to the process,
the mmap()/munmap() syscalls, which manage an arbitrary amount of memory and map it into the virtual address space of the process.
The allocation strategy may be partially controlled.
Figure: brk example
Notes on malloc() in CPython
Current CPython implementations are not affected
Example warning
The following example:
Did not affect all operating systems, e.g.
there are examples of vulnerable Linux configurations,
on the other hand Mac OS X was not affected.
Has probably been effectively eliminated (it won't affect modern systems).
Notes on malloc() in CPython
import gc
import meminfo  # the talk's helper module (not shown on the slides), assumed importable
if __name__ == '__main__':
    meminfo.print_meminfo()
    l = []
    for i in xrange(1, 100):
        ll = [{} for j in xrange(1000000 / i)]
        ll = ll[::2]  # keep every second dict, leaving holes in the heap
        l.extend(ll)

    meminfo.print_meminfo()
    del l
    del ll
    gc.collect()
    meminfo.print_meminfo()
Listing 24: Evil example
Notes on malloc() in CPython
0.4% (resident=7.4M, virtual=46.5M)
36.9% (resident=739.7M, virtual=779.4M)
35.9% (resident=720.0M, virtual=759.2M)
Listing 25: Affected system
0.4% (resident=7.6M, virtual=53.9M)
38.3% (resident=765.9M, virtual=813.6M)
1.1% (resident=22.9M, virtual=70.1M)
Listing 26: Not affected system
malloc() alternatives – libjemalloc and libtcmalloc
Pros:
In some cases using a different malloc() implementation "may" help to return memory from CPython back to the system.
Cons:
But equally it may work against you.
$ LD_PRELOAD=/usr/lib/libjemalloc.so.1 python int_float_alloc.py
$ LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.4 python int_float_alloc.py
Listing 27: Changing memory allocator
Caution: Notes on malloc() alternatives
Things to keep in mind
malloc() alternatives will use different memory allocation strategies, which may drastically change the memory consumption of your program.
When considering malloc replacement:
Check memory usage at various checkpoints.
Check the minimum and maximum memory consumption between
control points!
Compare performance (as this may also change).
malloc() alternatives – libjemalloc and libtcmalloc
Step | malloc res | malloc virt | jemalloc res | jemalloc virt | tcmalloc res | tcmalloc virt
step 1 | 7.4M | 46.5M | 8.0M | 56.9M | 9.4M | 56.1M
step 2 | 40.0M | 79.1M | 41.6M | 88.9M | 42.5M | 89.3M
step 3 | 16.2M | 55.3M | 8.2M | 88.9M | 42.5M | 89.3M
step 4 | 40.0M | 84.3M | 41.5M | 100.9M | 51.5M | 98.4M
step 5 | 8.2M | 47.3M | 8.5M | 100.9M | 51.5M | 98.4M
Summary
Summary
Summary:
Try to better understand the underlying memory model.
Pay attention to hot spots.
Use profiling tools.
"Seek and destroy" – find the root cause of the memory leak and fix it ;)
Quick and sometimes dirty solutions:
Delegate memory-intensive work to another process.
Regularly restart the process.
Go for low-hanging fruit (e.g. __slots__, different allocators).
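Delegating memory-intensive work to another process can be as simple as running it in a short-lived child interpreter: whatever the child allocates is returned to the operating system when it exits, regardless of allocator behaviour. The sketch below assumes Python 3; the workload and the names CHILD_CODE and total_digits_in_child are made up for illustration.

```python
import subprocess
import sys

# Code executed by the child interpreter; the big list exists only there.
CHILD_CODE = """
import sys
n = int(sys.argv[1])
data = [str(i) for i in range(n)]   # the memory-hungry part
print(sum(len(s) for s in data))
"""

def total_digits_in_child(n):
    """Run the memory-hungry computation in a separate process and return its result."""
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE, str(n)])
    return int(out.decode())

print(total_digits_in_child(1000))   # 2890
```

Only the small result crosses the process boundary, so the parent's resident size stays flat no matter how much the child used.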
References
Wesley J. Chun, Principal CyberWeb Consulting, ”Python 103...
MMMM: Understanding Python’s Memory Model, Mutability, Methods”
David Malcolm, Red Hat, ”Dude – Where’s My RAM?” A deep dive into
how Python uses memory.
Evan Jones, Improving Python’s Memory Allocator
Alexander Slesarev, Memory reclaiming in Python
Marcus Nilsson, Python memory management and TCMalloc,
http://pushingtheweb.com/2010/06/python-and-tcmalloc/
Source code of Python
Tools documentation