Introduction Basic stuff Notes on memory model Memory profiling tools Summary References
Everything You Always Wanted to Know About
Memory in Python
But Were Afraid to Ask
Piotr Przymus
Nicolaus Copernicus University
EuroPython 2014,
Berlin
P. Przymus 1/31
About Me
Piotr Przymus
PhD student / Research Assistant at Nicolaus Copernicus University.
Interests: databases, GPGPU computing, data mining.
8 years of Python experience.
Some of my Python projects:
Parts of trading platform in turbineam.com (back testing, trading
algorithms)
Mussels bio-monitoring analysis and data mining software.
Simulator of heterogeneous processing environment for evaluation of
database query scheduling algorithms.
Size of objects
Table: Size of different types in bytes
Type                              Python 32 bit               Python 64 bit
int (py-2.7)                      12                          24
long (py-2.7) / int (py-3.3)      14 + 2 · number of digits   30 + 2 · number of digits
float                             16                          24
complex                           24                          32
str (py-2.7) / bytes (py-3.3)     24 + 2 · length             40 + 2 · length
unicode (py-2.7) / str (py-3.3)   28 + (2 or 4) · length      52 + (2 or 4) · length
tuple                             24 + (4 · length)           64 + (8 · length)
DIY – check size of objects
sys.getsizeof(obj)
From documentation
Since Python 2.6
Return the size of an object in bytes. The object can be any type.
All built-in objects will return correct results.
May not be true for third-party extensions as it is implementation
specific.
Calls the object's __sizeof__ method and adds an additional garbage
collector overhead if the object is managed by the garbage collector.
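A minimal sketch of what sys.getsizeof(obj) does and does not count. The helper name shallow_and_deep_size is hypothetical and only handles flat lists or tuples; exact byte counts depend on the interpreter build.

```python
import sys


def shallow_and_deep_size(container):
    """Return (shallow, deep) sizes in bytes for a flat list or tuple.

    'shallow' is what sys.getsizeof reports for the container itself.
    'deep' adds each distinct contained object once; nested containers
    are not followed.
    """
    shallow = sys.getsizeof(container)
    unique_items = {id(item): item for item in container}
    deep = shallow + sum(sys.getsizeof(item) for item in unique_items.values())
    return shallow, deep


payload = "x" * 10000          # one large string
holder = [payload, payload]    # two references to the same object

shallow, deep = shallow_and_deep_size(holder)
print("list only:      ", shallow, "bytes")
print("list + contents:", deep, "bytes")
```

The list itself stays small because it only stores pointers; the string is counted once even though it is referenced twice.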
Objects interning – fun example
a = [i % 257 for i in xrange(2**20)]
Listing 1: List of interned integers
b = [1024 + i % 257 for i in xrange(2**20)]
Listing 2: List of integers
Any allocation difference between Listing 1 and Listing 2?
Results measured using psutil:
Listing 1 – (resident=15.1M, virtual=2.3G)
Listing 2 – (resident=39.5M, virtual=2.4G)
Objects interning – explained
Objects and variables – general rule
Objects are allocated on assignment
Variables just point to objects (i.e. they do not hold the memory)
Interning of Objects
This is an exception to the general rule.
Python implementation specific (examples from CPython).
"Often" used objects are preallocated and shared instead of paying for a costly
new allocation.
Mainly a performance optimization.
>>> a = 0
>>> b = 0
>>> a is b, a == b
(True, True)
Listing 5: Interning of Objects
>>> a = 1024
>>> b = 1024
>>> a is b, a == b
(False, True)
Listing 6: Objects allocation
Objects interning – behind the scenes
Warning
This is Python implementation dependent.
This may change in the future.
This is not documented because of the above reasons.
For reference consult the source code.
CPython 2.7 - 3.4
Single instances for:
int – in range [−5, 257)
str / unicode – empty string and all length=1 strings
unicode / str – empty string and all length=1 strings for Latin-1
tuple – empty tuple
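A small sketch that makes the integer cache visible by counting distinct object identities, written for Python 3 (range instead of xrange). The helper distinct_objects and the constant OFFSET are illustrative names, and OFFSET is assumed to lie far outside any small-integer cache. This is CPython behaviour, not a language guarantee.

```python
def distinct_objects(values):
    """Count how many different objects a list refers to."""
    return len({id(v) for v in values})


N = 2 ** 16
OFFSET = 10 ** 6  # assumption: well outside the cached integer range

cached = [i % 257 for i in range(N)]             # values 0..256
uncached = [OFFSET + i % 257 for i in range(N)]  # same pattern, large values

print("small values:", distinct_objects(cached), "objects")
print("large values:", distinct_objects(uncached), "objects")
```

On CPython the first list points at only 257 shared objects, while the second holds one freshly allocated int per element, which is where the extra resident memory in the earlier listings comes from.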
String interning – example
>>> a, b = "strin", "string"
>>> a + 'g' is b  # returns False
>>> intern(a + 'g') is intern(b)  # returns True
>>> a = ["spam %d" % (i % 257)
...      for i in xrange(2**20)]
>>> # memory usage (resident=57.6M, virtual=2.4G)
>>> a = [intern("spam %d" % (i % 257))
...      for i in xrange(2**20)]
>>> # memory usage (resident=14.9M, virtual=2.3G)
Listing 7: String interning
String interning – explained
String interning definition
String interning is a method of storing only one copy of each distinct string
value, which must be immutable.
intern (py-2.x) / sys.intern (py-3.x)
From CPython documentation:
Enter string in the table of “interned” strings.
Return the interned string (string or string copy).
Useful to gain a little performance on dictionary lookup (key
comparisons after hashing can be done by a pointer compare instead of
a string compare).
Names used in programs are automatically interned
Dictionaries used to hold module, class or instance attributes have
interned keys.
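A Python 3 sketch of the same idea using sys.intern. The strings are built at run time so that the compiler cannot merge them as constants; variable names are illustrative only.

```python
import sys

prefix = "".join(["str", "in"])   # built at run time: "strin"
whole = "".join(["str", "ing"])   # built at run time: "string"

joined = prefix + "g"
same_value = joined == whole
same_object_after_intern = sys.intern(joined) is sys.intern(whole)

# Many equal strings collapse to one object each once interned.
labels = [sys.intern("spam %d" % (i % 257)) for i in range(2 ** 12)]
distinct = len({id(s) for s in labels})

print(same_value, same_object_after_intern, distinct)
```

Equal strings compare equal either way; interning additionally makes them the same object, so the 4096-element list refers to only 257 string objects.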
Mutable Containers Memory Allocation Strategy
Plan for growth and shrinkage
Slightly overallocate memory needed by container.
Leave room for growth.
Shrink when overallocation threshold is reached.
Reduce number of expensive function calls:
realloc()
memcpy()
Use optimal layout.
Lists, Sets, Dictionaries
List allocation – example
Figure: List growth example
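The growth pattern can be observed directly with sys.getsizeof. This is a sketch, not the code behind the original figure; observe_list_growth is a hypothetical helper and the byte values printed depend on the interpreter version and platform.

```python
import sys


def observe_list_growth(n_appends):
    """Append n items and record (length, size in bytes) at each reallocation."""
    items = []
    last_size = sys.getsizeof(items)
    steps = [(0, last_size)]
    for _ in range(n_appends):
        items.append(None)
        size = sys.getsizeof(items)
        if size != last_size:      # the list was resized by this append
            steps.append((len(items), size))
            last_size = size
    return steps


for length, size in observe_list_growth(64):
    print("len=%2d  size=%4d bytes" % (length, size))
```

Only a handful of the 64 appends change the reported size; the rest reuse space that was overallocated earlier.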
List allocation strategy
Represented as fixed-length array of pointers.
Overallocation for list growth (by append)
List size growth: 4, 8, 16, 25, 35, 46, . . .
For large lists the overallocation is about 12.5% or less.
Due to the memory actions involved, operations:
at end of list are cheap (rare realloc),
in the middle or beginning require memory copy or shift!
Note that for lists of 1, 2 or 5 elements, space is wasted.
List allocation size:
32 bits – 32 + (4 * length)
64 bits – 72 + (8 * length)
Shrinking only when list size < 1/2 of allocated space.
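The growth sequence above can be reproduced with a few lines of arithmetic. The formula is an assumption modelled on list_resize() in CPython 2.7 to 3.8; later releases use a different rule, and the function names here are hypothetical.

```python
def grown_capacity(newsize):
    """Capacity chosen when a list must grow to hold `newsize` items.

    Assumed rule (CPython 2.7-3.8): newsize + newsize/8 + (3 or 6).
    """
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)


def capacities_while_appending(n_appends):
    """Simulate appends one by one and collect every new capacity."""
    capacity, history = 0, []
    for length in range(1, n_appends + 1):
        if length > capacity:          # no spare slot left: reallocate
            capacity = grown_capacity(length)
            history.append(capacity)
    return history


print(capacities_while_appending(46))
```

For large lists the spare room is dominated by the newsize/8 term, which is where the 12.5% figure comes from.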
Overallocation of dictionaries/sets
Represented as fixed-length hash tables.
Overallocation for dict/sets – when 2/3 of capacity is reached.
if number of elements < 50000: quadruple the capacity
else: double the capacity
// dict growth strategy
(mp->ma_used > 50000 ? 2 : 4) * mp->ma_used;
// set growth strategy
so->used > 50000 ? so->used * 2 : so->used * 4;
Dict/Set growth/shrink code
for (newsize = PyDict_MINSIZE;
     newsize <= minused && newsize > 0;
     newsize <<= 1);
Shrinkage if dictionary/set fill (real and dummy elements) is much larger
than used elements (real elements), i.e. a lot of keys have been deleted.
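A rough simulation of the rules above for a dict that only receives insertions (no deletions, so fill equals used). It models the CPython 2.7 snippets shown on this slide; the starting size of 8 for PyDict_MINSIZE is an assumption, and the function names are hypothetical.

```python
PYDICT_MINSIZE = 8  # assumption: initial table size in CPython 2.7


def next_table_size(minused):
    """Smallest power-of-two table size strictly greater than minused."""
    newsize = PYDICT_MINSIZE
    while newsize <= minused:
        newsize <<= 1
    return newsize


def simulate_dict_growth(n_inserts):
    """Return the sequence of table sizes seen while inserting new keys."""
    size, used = PYDICT_MINSIZE, 0
    history = [size]
    for _ in range(n_inserts):
        used += 1
        if used * 3 >= size * 2:               # 2/3 of capacity reached
            factor = 2 if used > 50000 else 4  # double or quadruple
            size = next_table_size(factor * used)
            history.append(size)
    return history


print(simulate_dict_growth(100))
```

Because the new size is the next power of two above 4 times the used count, small dicts jump by a factor of four at each resize.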
Various data representation
# Fields: field1, field2, field3, ..., field8
# Data: "foo 1", "foo 2", "foo 3", ..., "foo 8"
class OldStyleClass:  # only py-2.x
    ...
class NewStyleClass(object):  # default for py-3.x
    ...
class NewStyleClassSlots(object):
    __slots__ = ('field1', 'field2', ...)
    ...
import collections as c
NamedTuple = c.namedtuple('nt', ['field1', ...])

TupleData = ('value1', 'value2', ...)
ListaData = ['value1', 'value2', ...]
DictData = {'field1': 'value1', ...}
Listing 8: Various data representation
Various data representation – allocated memory
Figure: Allocated memory after creating 100000 objects with 8 fields each (bar chart, 0 MB to 150 MB, Python 2.x vs. Python 3.x; bars: OldStyleClass, NewStyleClass, DictData, NamedTuple, TupleData, ListaData, NewStyleClassWithSlots)
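A sketch of how two of the representations above could be compared on Python 3. It uses the standard-library tracemalloc module, which is an assumption of this example and not the measurement method behind the figure; PlainRecord, SlotsRecord and allocated_bytes are hypothetical names, and the numbers vary between interpreter versions.

```python
import tracemalloc

FIELDS = ["field%d" % i for i in range(1, 9)]
VALUES = ["foo %d" % i for i in range(1, 9)]


class PlainRecord(object):
    """Regular class: attributes may live in a per-instance __dict__."""
    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


class SlotsRecord(object):
    """Same fields, stored in fixed slots instead of a __dict__."""
    __slots__ = tuple(FIELDS)

    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


def allocated_bytes(factory, count=10000):
    """Bytes traced while `count` objects are created and kept alive."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    objects = [factory(*VALUES) for _ in range(count)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


plain_bytes = allocated_bytes(PlainRecord)
slots_bytes = allocated_bytes(SlotsRecord)
print("plain class:", plain_bytes, "bytes")
print("with slots: ", slots_bytes, "bytes")
```

The field values are shared between all objects, so the difference reflects per-instance overhead only.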
Notes on garbage collector, reference count and cycles
Python garbage collector
Uses reference counting.
Offers cycle detection.
Objects are garbage-collected when the count goes to 0.
Reference increment, e.g.: object creation, additional aliases, passed to
function
Reference decrement, e.g.: local reference goes out of scope, alias is
destroyed, alias is reassigned
Warning – from documentation
Objects that have __del__() methods and are part of a reference cycle cause
the entire reference cycle to be uncollectable!
Python doesn't collect such cycles automatically.
It is not possible for Python to guess a safe order in which to run the
__del__() methods.
(This applies to CPython before 3.4; since PEP 442 such cycles can be collected.)
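A minimal sketch showing why cycle detection is needed on top of reference counting. The class Node and the helper make_cycle are illustrative; the classes have no __del__ method, so the warning above does not apply to them. Automatic collection is switched off so the effect of an explicit gc.collect() is visible.

```python
import gc
import weakref


class Node(object):
    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that reference each other; return a weak reference."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)   # does not keep `a` alive


gc.disable()                 # rely on reference counting only
try:
    ref = make_cycle()
    alive_before = ref() is not None   # the counts never reach 0
    gc.collect()                       # run the cycle detector by hand
    alive_after = ref() is not None
finally:
    gc.enable()

print("alive before collect:", alive_before)
print("alive after collect: ", alive_after)
```

After make_cycle() returns, no name refers to either object, yet each keeps the other's count above zero until the cycle detector runs.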
Tools
psutil
memory_profiler
objgraph
Meliae (could be combined with runsnakerun)
Heapy
Tools – psutil
psutil – A cross-platform process and system utilities module for Python.
import psutil
import os
...
p = psutil.Process(os.getpid())
pinfo = p.as_dict()
...
print pinfo['memory_percent'],
print pinfo['memory_info'].rss, pinfo['memory_info'].vms
Listing 9: psutil example
Tools – memory profiler
memory_profiler – a module for monitoring memory usage of a Python
program.
Recommended dependency: psutil.
May work as:
Line-by-line profiler.
Memory usage monitoring (memory in time).
Debugger trigger – setting debugger breakpoints.
memory_profiler – Line-by-line profiler
Preparation
To track particular functions use the @profile decorator.
Running
python -m memory_profiler
Line #    Mem usage    Increment   Line Contents
================================================
    45    9.512 MiB    0.000 MiB   @profile
    46                             def create_lot_of_stuff(times=10000, cl=OldStyleClass):
    47    9.516 MiB    0.004 MiB       ret = []
    48    9.516 MiB    0.000 MiB       t = "foo %d"
    49  156.449 MiB  146.934 MiB       for i in xrange(times):
    50  156.445 MiB   -0.004 MiB           l = [t % (j + i%8) for j in xrange(8)]
    51  156.449 MiB    0.004 MiB           c = cl(*l)
    52  156.449 MiB    0.000 MiB           ret.append(c)
    53  156.449 MiB    0.000 MiB       return ret
Listing 10: Results
memory_profiler – memory usage monitoring
Preparation
To track particular functions use the @profile decorator.
Running and plotting
mprof run --python python uniwerse.py -f 100 100 -s 100 100 10
mprof plot
Figure: Results
memory_profiler – Debugger trigger
eror@eror-laptop:~$ python -m memory_profiler --pdb-mmem=10 uniwerse.py -s 100 100 10
Current memory 20.80 MiB exceeded the maximum of 10.00 MiB
Stepping into the debugger
> /home/eror/uniwerse.py(52)connect()
-> self.adj.append(n)
(Pdb)
Listing 11: Debugger trigger – setting debugger breakpoints.
Tools – objgraph
objgraph – draws Python object reference graphs with graphviz.
import objgraph
x = []
y = [x, [x], dict(x=x)]
objgraph.show_refs([y], filename='sample-graph.png')
objgraph.show_backrefs([x], filename='sample-backref-graph.png')
Listing 12: Tutorial example
Figure: Reference graph Figure: Back reference graph
Tools – Heapy/Meliae
Heapy
The heap analysis toolset. It can be used to find information about the
objects in the heap and display the information in various ways.
part of "Guppy-PE – A Python Programming Environment"
Meliae
Python Memory Usage Analyzer
"This project is similar to heapy (in the 'guppy' project), in its attempt
to understand how memory has been allocated."
runsnakerun GUI support.
Tools – Heapy
from guppy import hpy
hp = hpy()
h1 = hp.heap()
l = [range(i) for i in xrange(2**10)]
h2 = hp.heap()
print h2 - h1
Listing 13: Heapy example
Partition of a set of 294937 objects. Total size = 11538088 bytes.
 Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
     0 293899 100  7053576  61    7053576  61 int
     1   1025   0  4481544  39   11535120 100 list
     2      6   0     1680   0   11536800 100 dict (no owner)
     3      2   0      560   0   11537360 100 dict of guppy.etc.Glue.Owner
     4      1   0      456   0   11537816 100 types.FrameType
     5      2   0      144   0   11537960 100 guppy.etc.Glue.Owner
     6      2   0      128   0   11538088 100 str
Listing 14: Results
Meliae and runsnakerun
from meliae import scanner
scanner.dump_all_objects("representation_meliae.dump")
# In shell: runsnakemem representation_meliae.dump
Listing 15: Meliae example
Figure: Meliae and runsnakerun
malloc() alternatives – libjemalloc and libtcmalloc
Pros:
In some cases using a different malloc() implementation "may" help to
return memory from CPython back to the system.
Cons:
But equally it may work against you.
$ LD_PRELOAD="/usr/lib/libjemalloc.so.1" python int_float_alloc.py
$ LD_PRELOAD="/usr/lib/libtcmalloc_minimal.so.4" python int_float_alloc.py
Listing 16: Changing memory allocator
malloc() alternatives – libjemalloc and libtcmalloc
Step      malloc             jemalloc            tcmalloc
          res      virt      res      virt       res      virt
step 1    7.4M     46.5M     8.0M     56.9M      9.4M     56.1M
step 2    40.0M    79.1M     41.6M    88.9M      42.5M    89.3M
step 3    16.2M    55.3M     8.2M     88.9M      42.5M    89.3M
step 4    40.0M    84.3M     41.5M    100.9M     51.5M    98.4M
step 5    8.2M     47.3M     8.5M     100.9M     51.5M    98.4M
Other useful tools
Build Python in debug mode (./configure --with-pydebug ...).
Maintains list of all active objects.
Upon exit (or every statement in interactive mode), print all existing
references.
Track total allocation.
valgrind – a programming tool for memory debugging, leak detection,
and profiling. Rather low level.
CPython can cooperate with valgrind (for >= py-2.7, py-3.2)
gdb-heap (gdb extension)
low level, still experimental
can be attached to running processes
may be used with core file
Web applications memory leaks
dowser – cherrypy application that displays sparklines of python object
counts.
dozer – wsgi middleware version of the cherrypy memory leak debugger
(any wsgi application).
Summary
Summary:
Try to better understand the underlying memory model.
Pay attention to hot spots.
Use profiling tools.
"Seek and destroy" – find the root cause of the memory leak and fix it ;)
Quick and sometimes dirty solutions:
Delegate memory-intensive work to another process.
Regularly restart the process.
Go for low-hanging fruit (e.g. __slots__, different allocators).

FIDO Alliance
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 

Recently uploaded (20)

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 

Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask

  • 1. Introduction Basic stuff Notes on memory model Memory profiling tools Summary References Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask Piotr Przymus Nicolaus Copernicus University Europython 2014, Berlin P. Przymus 1/31
  • 2. About Me
      Piotr Przymus, PhD student / Research Assistant at Nicolaus Copernicus University.
      Interests: databases, GPGPU computing, data mining.
      8 years of Python experience.
      Some of my Python projects:
        • Parts of a trading platform at turbineam.com (backtesting, trading algorithms).
        • Mussels bio-monitoring analysis and data mining software.
        • Simulator of a heterogeneous processing environment for evaluating database query scheduling algorithms.
  • 3. Size of objects
      Table: Size of different types in bytes
      Type                                32 bit   64 bit   Variable part
      int (py-2.7)                        12       24
      long (py-2.7) / int (py-3.3)        14       30       +2 · number of digits
      float                               16       24
      complex                             24       32
      str (py-2.7) / bytes (py-3.3)       24       40       +2 · length
      unicode (py-2.7) / str (py-3.3)     28       52       +(2 or 4) · length
      tuple                               24       64       +(4 · length) on 32 bit, +(8 · length) on 64 bit
  • 4. DIY – check size of objects
      sys.getsizeof(obj), from the documentation (available since Python 2.6):
        • Returns the size of an object in bytes. The object can be any type.
        • All built-in objects return correct results.
        • This may not hold for third-party extensions, as it is implementation specific.
        • Calls the object's __sizeof__ method and adds garbage collector overhead if the object is managed by the garbage collector.
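A minimal sketch of checking sizes yourself with sys.getsizeof, assuming Python 3 on CPython. The helper name shallow_sizes is hypothetical and not part of the slides. Exact numbers differ between versions and platforms, so none are quoted here.

```python
import sys


def shallow_sizes(objs):
    """Map each object's type name to its shallow size in bytes."""
    return {type(o).__name__: sys.getsizeof(o) for o in objs}


sizes = shallow_sizes([0, 1.0, 1j, b"", "", (), []])
for name, size in sorted(sizes.items()):
    print(f"{name:8s} {size} bytes")

# getsizeof is shallow: the outer list only accounts for one pointer,
# not for the 1000-element list it refers to.
inner = list(range(1000))
outer = [inner]
print(sys.getsizeof(outer), sys.getsizeof(inner))
```

The last line is a reminder that a container's reported size excludes the objects it references, so the sizes of nested structures have to be summed explicitly.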
  • 5. Objects interning – fun example
      Listing 1: List of interned integers
          a = [i % 257 for i in xrange(2**20)]
      Listing 2: List of integers
          b = [1024 + i % 257 for i in xrange(2**20)]
      Is there any allocation difference between Listing 1 and Listing 2?
      Results measured using psutil:
        • Listing 1: resident=15.1M, virtual=2.3G
        • Listing 2: resident=39.5M, virtual=2.4G
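The difference can also be seen without a process-level tool by counting distinct objects. This is a rough sketch assuming Python 3 on CPython (range instead of xrange, a smaller list to keep it quick). The footprint helper is hypothetical and only sums shallow sizes, so it is an estimate rather than the resident-memory figure from the slide.

```python
import sys


def footprint(values):
    """Estimate bytes used by a list plus each distinct object it holds."""
    distinct = {id(v): v for v in values}
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in distinct.values())


a = [i % 257 for i in range(2**16)]           # values 0..256, shared by CPython
b = [1024 + i % 257 for i in range(2**16)]    # same value count, fresh objects

print(len({id(v) for v in a}), footprint(a))
print(len({id(v) for v in b}), footprint(b))
```

Both lists hold the same number of pointers. The second one additionally pays for one integer object per element.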
  • 7. Objects interning – explained
      Objects and variables, general rule:
        • Objects are allocated on assignment.
        • Variables just point to objects (i.e. they do not hold the memory).
      Interning of objects:
        • This is an exception to the general rule.
        • It is Python implementation specific (examples are from CPython).
        • "Often" used objects are preallocated and shared instead of paying for a costly new allocation.
        • Mainly a performance optimization.
      Listing 5: Interning of objects
          >>> a = 0
          >>> b = 0
          >>> a is b, a == b
          (True, True)
      Listing 6: Objects allocation
          >>> a = 1024
          >>> b = 1024
          >>> a is b, a == b
          (False, True)
  • 8. Objects interning – behind the scenes
      Warning:
        • This is Python implementation dependent.
        • This may change in the future.
        • For the above reasons it is not documented. For reference, consult the source code.
      CPython 2.7 - 3.4 keeps single instances for:
        • int: values in the range [−5, 257)
        • str (py-2.x) / bytes (py-3.x): the empty string and all strings of length 1
        • unicode (py-2.x) / str (py-3.x): the empty string and all strings of length 1 in Latin-1
        • tuple: the empty tuple
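The integer range above can be probed empirically. This is a sketch assuming CPython (Python 3); the is_shared helper is hypothetical. Because the behaviour is an undocumented implementation detail, the result may differ on other interpreters or future versions.

```python
def is_shared(n):
    """True if two independently built ints with value n are the same object."""
    # int(str(n)) builds the value at run time, which avoids the compiler
    # reusing one constant for both operands.
    return int(str(n)) is int(str(n))


shared = [n for n in range(-10, 300) if is_shared(n)]
print(shared[:1], shared[-1:])  # lowest and highest shared value found
```

Relying on `is` for value comparison is never safe in real code; the probe only illustrates where CPython happens to share objects.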
  • 9. String interning – example
      Listing 7: String interning
          >>> a, b = "strin", "string"
          >>> a + 'g' is b  # returns False
          >>> intern(a + 'g') is intern(b)  # returns True
          >>> a = ["spam %d" % (i % 257)
          ...      for i in xrange(2**20)]
          >>> # memory usage (resident=57.6M, virtual=2.4G)
          >>> a = [intern("spam %d" % (i % 257))
          ...      for i in xrange(2**20)]
          >>> # memory usage (resident=14.9M, virtual=2.3G)
  • 10. String interning – explained
      Definition: string interning is a method of storing only one copy of each distinct string value, which must be immutable.
      intern (py-2.x) / sys.intern (py-3.x), from the CPython documentation:
        • Enters the string in the table of "interned" strings.
        • Returns the interned string (the string itself or a copy).
        • Useful to gain a little performance on dictionary lookup (key comparisons after hashing can be done by a pointer compare instead of a string compare).
        • Names used in programs are automatically interned.
        • Dictionaries used to hold module, class or instance attributes have interned keys.
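A Python 3 version of the interning example, as a sketch assuming CPython: sys.intern replaces the py-2.x built-in intern, and range replaces xrange. Counting distinct object identities stands in for the resident-memory measurements on the slide.

```python
import sys

prefix = "strin"
built = prefix + "g"      # constructed at run time
literal = "string"

print(built == literal)                          # same value
print(sys.intern(built) is sys.intern(literal))  # one shared object once interned

# Without interning every formatted string is its own object;
# with interning there is one object per distinct value.
raw = ["spam %d" % (i % 257) for i in range(2**12)]
interned = [sys.intern(s) for s in raw]

distinct_raw = len({id(s) for s in raw})
distinct_interned = len({id(s) for s in interned})
print(distinct_raw, distinct_interned)
```

Interning pays off when many equal strings are kept alive at once, as in the list above. For short-lived strings the lookup in the intern table is pure overhead.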
  • 11. Mutable Containers Memory Allocation Strategy
      Plan for growth and shrinkage:
        • Slightly overallocate the memory needed by the container.
        • Leave room for growth.
        • Shrink when the overallocation threshold is reached.
      Reduce the number of expensive function calls:
        • realloc()
        • memcpy()
      Use an optimal layout.
      Applies to lists, sets and dictionaries.
  • 12. List allocation – example
      Figure: List growth example
  • 13. List allocation strategy
      • Represented as a fixed-length array of pointers.
      • Overallocation for list growth (by append):
        • List size growth: 4, 8, 16, 25, 35, 46, ...
        • For large lists the overallocation is less than 12.5%.
      • Due to the memory actions involved, operations:
        • at the end of the list are cheap (rare realloc),
        • in the middle or at the beginning require a memory copy or shift!
      • Note that for lists of 1, 2 or 5 elements, space is wasted.
      • List allocation size:
        • 32 bits: 32 + (4 * length)
        • 64 bits: 72 + (8 * length)
      • Shrinking happens only when the list size < 1/2 of the allocated space.
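Overallocation can be observed by watching sys.getsizeof while appending. This is a sketch assuming CPython; growth_points is a hypothetical helper, and the exact sizes and growth pattern depend on the interpreter version, so they are printed rather than stated.

```python
import sys


def growth_points(n):
    """Return (length, size in bytes) each time an append changes the list size."""
    items = []
    last = sys.getsizeof(items)
    points = []
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last:          # the underlying array was reallocated
            points.append((len(items), size))
            last = size
    return points


points = growth_points(1000)
print(len(points), "reallocations for 1000 appends")
print(points[:6])
```

The number of reallocations stays far below the number of appends, which is why appending at the end of a list is cheap on average.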
  • 14. Overallocation of dictionaries/sets
      • Represented as fixed-length hash tables.
      • Overallocation for dicts/sets is triggered when 2/3 of the capacity is reached:
          if number of elements < 50000: quadruple the capacity
          else: double the capacity
      Growth strategy:
          // dict growth strategy
          (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used;
          // set growth strategy
          so->used > 50000 ? so->used * 2 : so->used * 4;
      Dict/Set growth/shrink code:
          for (newsize = PyDict_MINSIZE;
               newsize <= minused && newsize > 0;
               newsize <<= 1);
      • Shrinkage happens if the dictionary/set fill (real and dummy elements) is much larger than the used elements (real elements), i.e. a lot of keys have been deleted.
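The two C fragments above can be modelled in a few lines of Python. This is only an illustrative sketch of the arithmetic quoted on the slide, not CPython code: the function names are hypothetical, PYDICT_MINSIZE = 8 is the value of PyDict_MINSIZE in CPython 2.7, and newer CPython releases use different growth rules.

```python
PYDICT_MINSIZE = 8  # assumption: PyDict_MINSIZE as defined in CPython 2.7


def resize_target(used):
    """Requested minimum size: 4x the used slots, or 2x above 50000 entries."""
    return (2 if used > 50000 else 4) * used


def new_table_size(minused):
    """Smallest power-of-two table size strictly greater than minused."""
    newsize = PYDICT_MINSIZE
    while newsize <= minused:
        newsize <<= 1
    return newsize


# An 8-slot table is resized once 6 slots are filled (2/3 of the capacity).
print(new_table_size(resize_target(6)))
# A large dict only doubles its target.
print(new_table_size(resize_target(60000)))
```

Because table sizes are powers of two, a request of 24 slots lands on 32 and a request of 120000 lands on 131072.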
  • 15. Various data representation
      Listing 8: Various data representation
          # Fields: field1, field2, field3, ..., field8
          # Data: "foo 1", "foo 2", "foo 3", ..., "foo 8"
          class OldStyleClass:  # only py-2.x
              ...
          class NewStyleClass(object):  # default for py-3.x
              ...
          class NewStyleClassSlots(object):
              __slots__ = ('field1', 'field2', ...)
              ...
          import collections as c
          NamedTuple = c.namedtuple('nt', ['field1', ...])

          TupleData = ('value1', 'value2', ...)
          ListaData = ['value1', 'value2', ...]
          DictData = {'field1': 'value1', ...}
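A runnable sketch of three of these representations, assuming Python 3 on CPython. The class names Plain, Slotted and Record are hypothetical stand-ins for the elided classes in Listing 8, and the shallow sizes are only a rough per-instance indicator, not the whole-process measurement behind the figure.

```python
import sys
from collections import namedtuple

FIELDS = ["field%d" % i for i in range(1, 9)]
VALUES = ["foo %d" % i for i in range(1, 9)]


class Plain:
    """Regular class: every instance carries its own __dict__."""

    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


class Slotted:
    """Same fields stored in fixed slots, so no per-instance __dict__."""

    __slots__ = tuple(FIELDS)

    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


Record = namedtuple("Record", FIELDS)

plain, slotted, record = Plain(*VALUES), Slotted(*VALUES), Record(*VALUES)

# A plain instance also pays for its attribute dictionary.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
print(plain_size, slotted_size, sys.getsizeof(record))
```

The trade-off with `__slots__` is flexibility: attributes outside the declared slots cannot be added to an instance.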
  • 16. Various data representation – allocated memory
      Figure: Allocated memory after creating 100000 objects with 8 fields each, for Python 2.x and Python 3.x (axis from 0 MB to 150 MB; bars: OldStyleClass, NewStyleClass, DictData, NamedTuple, TupleData, ListaData, NewStyleClassWithSlots).
  • 17. Notes on garbage collector, reference count and cycles
      Python garbage collector:
        • Uses reference counting.
        • Offers cycle detection.
        • Objects are garbage-collected when the count goes to 0.
        • Reference increment, e.g.: object creation, additional aliases, passing to a function.
        • Reference decrement, e.g.: local reference goes out of scope, alias is destroyed, alias is reassigned.
      Warning, from the documentation:
        • Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable!
        • Python doesn't collect such cycles automatically.
        • It is not possible for Python to guess a safe order in which to run the __del__() methods.
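The interplay between reference counting and the cycle detector can be shown with a weak reference. This is a sketch assuming CPython (Python 3); Node and make_cycle are hypothetical names. Note that the __del__ warning above describes older interpreters: since Python 3.4 (PEP 442), cycles containing objects with __del__ can be collected.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def make_cycle():
    """Build two objects that reference each other; return a weak probe."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


gc.disable()                           # rule out an automatic collection
probe = make_cycle()
alive_before = probe() is not None     # refcounts never reached zero
gc.collect()                           # the cycle detector finds the pair
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

Reference counting alone leaves the pair alive after make_cycle returns, because each object still holds a reference to the other. Only the explicit collection frees them.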
  • 18. Tools
      • psutil
      • memory_profiler
      • objgraph
      • Meliae (could be combined with runsnakerun)
      • Heapy
  • 19. Tools – psutil
      psutil: a cross-platform process and system utilities module for Python.
      Listing 9: psutil example
          import psutil
          import os
          ...
          p = psutil.Process(os.getpid())
          pinfo = p.as_dict()
          ...
          print pinfo['memory_percent'],
          print pinfo['memory_info'].rss, pinfo['memory_info'].vms
  • 20. Tools – memory profiler
      memory_profiler: a module for monitoring the memory usage of a Python program.
      Recommended dependency: psutil.
      May work as:
        • Line-by-line profiler.
        • Memory usage monitor (memory over time).
        • Debugger trigger (setting debugger breakpoints).
  • 21. memory profiler – Line-by-line profiler
      Preparation: to track particular functions, use the profile decorator.
      Running:
          python -m memory_profiler
      Listing 10: Results
          Line #    Mem usage    Increment   Line Contents
          ================================================
              45    9.512 MiB    0.000 MiB   @profile
              46                             def create_lot_of_stuff(times=10000, cl=OldStyleClass):
              47    9.516 MiB    0.004 MiB       ret = []
              48    9.516 MiB    0.000 MiB       t = "foo %d"
              49  156.449 MiB  146.934 MiB       for i in xrange(times):
              50  156.445 MiB   -0.004 MiB           l = [t % (j + i%8) for j in xrange(8)]
              51  156.449 MiB    0.004 MiB           c = cl(*l)
              52  156.449 MiB    0.000 MiB           ret.append(c)
              53  156.449 MiB    0.000 MiB       return ret
  • 22. memory profiler – memory usage monitoring
      Preparation: to track particular functions, use the profile decorator.
      Running and plotting:
          mprof run --python python uniwerse.py -f 100 100 -s 100 100 10
          mprof plot
      Figure: Results
  • 23. memory profiler – Debugger trigger
      Listing 11: Debugger trigger (setting debugger breakpoints)
          eror@eror-laptop:~$ python -m memory_profiler --pdb-mmem=10 uniwerse.py -s 100 100 10
          Current memory 20.80 MiB exceeded the maximum of 10.00 MiB
          Stepping into the debugger
          > /home/eror/uniwerse.py(52)connect()
          -> self.adj.append(n)
          (Pdb)
  • 24. Tools – objgraph
      objgraph: draws Python object reference graphs with graphviz.
      Listing 12: Tutorial example
          import objgraph
          x = []
          y = [x, [x], dict(x=x)]
          objgraph.show_refs([y], filename='sample-graph.png')
          objgraph.show_backrefs([x], filename='sample-backref-graph.png')
      Figure: Reference graph
      Figure: Back reference graph
  • 25. Tools – Heapy/Meliae
      Heapy:
        • The heap analysis toolset. It can be used to find information about the objects in the heap and display the information in various ways.
        • Part of "Guppy-PE – A Python Programming Environment".
      Meliae:
        • Python Memory Usage Analyzer.
        • "This project is similar to heapy (in the 'guppy' project), in its attempt to understand how memory has been allocated."
        • runsnakerun GUI support.
  • 26. Tools – Heapy
      Listing 13: Heapy example
          from guppy import hpy
          hp = hpy()
          h1 = hp.heap()
          l = [range(i) for i in xrange(2**10)]
          h2 = hp.heap()
          print h2 - h1
      Listing 14: Results
          Partition of a set of 294937 objects. Total size = 11538088 bytes.
           Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
               0 293899 100  7053576  61    7053576  61 int
               1   1025   0  4481544  39   11535120 100 list
               2      6   0     1680   0   11536800 100 dict (no owner)
               3      2   0      560   0   11537360 100 dict of guppy.etc.Glue.Owner
               4      1   0      456   0   11537816 100 types.FrameType
               5      2   0      144   0   11537960 100 guppy.etc.Glue.Owner
               6      2   0      128   0   11538088 100 str
  • 27. Meliae and runsnakerun
      Listing 15: Meliae example
          from meliae import scanner
          scanner.dump_all_objects("representation_meliae.dump")
          # In shell: runsnakemem representation_meliae.dump
      Figure: Meliae and runsnakerun
  • 28. malloc() alternatives – libjemalloc and libtcmalloc
      Pros: in some cases using a different malloc() implementation "may" help to return memory from CPython back to the system.
      Cons: but equally it may work against you.
      Listing 16: Changing memory allocator
          $ LD_PRELOAD="/usr/lib/libjemalloc.so.1" python int_float_alloc.py
          $ LD_PRELOAD="/usr/lib/libtcmalloc_minimal.so.4" python int_float_alloc.py
  • 29. malloc() alternatives – libjemalloc and libtcmalloc
      Step      malloc            jemalloc          tcmalloc
                res     virt      res     virt      res     virt
      step 1    7.4M    46.5M     8.0M    56.9M     9.4M    56.1M
      step 2    40.0M   79.1M     41.6M   88.9M     42.5M   89.3M
      step 3    16.2M   55.3M     8.2M    88.9M     42.5M   89.3M
      step 4    40.0M   84.3M     41.5M   100.9M    51.5M   98.4M
      step 5    8.2M    47.3M     8.5M    100.9M    51.5M   98.4M
  • 30. Other useful tools
      Build Python in debug mode (./configure --with-pydebug ...):
        • Maintains a list of all active objects.
        • Upon exit (or after every statement in interactive mode), prints all existing references.
        • Tracks total allocation.
      valgrind: a programming tool for memory debugging, leak detection, and profiling.
        • Rather low level.
        • CPython can cooperate with valgrind (for >= py-2.7, py-3.2).
      gdb-heap (gdb extension):
        • Low level, still experimental.
        • Can be attached to running processes.
        • May be used with a core file.
      Web applications memory leaks:
        • dowser: a cherrypy application that displays sparklines of Python object counts.
        • dozer: a WSGI middleware version of the cherrypy memory leak debugger (works with any WSGI application).
  • 31. Summary
      Summary:
        • Try to better understand the underlying memory model.
        • Pay attention to hot spots.
        • Use profiling tools.
        • "Seek and destroy": find the root cause of the memory leak and fix it ;)
      Quick and sometimes dirty solutions:
        • Delegate memory-intensive work to another process.
        • Regularly restart the process.
        • Go for low-hanging fruit (e.g. __slots__, different allocators).
  • 32. References
      • Wesley J. Chun, Principal, CyberWeb Consulting, "Python 103... MMMM: Understanding Python's Memory Model, Mutability, Methods"
      • David Malcolm, Red Hat, "Dude – Where's My RAM?" A deep dive into how Python uses memory.
      • Evan Jones, Improving Python's Memory Allocator
      • Alexander Slesarev, Memory reclaiming in Python
      • Source code of Python
      • Tools documentation