Slides for presentation about Python GC (Garbage Collector) and memory management in Python (CPython version 2.7)

  • 1. Python GC
    Dmitry Alimov, Software Developer, Zodiac Interactive, 2014
  • 2. Garbage collection
    The garbage collector, or just collector, attempts to reclaim garbage: memory occupied by objects that are no longer in use by the program.
    Garbage collection was invented by John McCarthy around 1959 to solve problems in Lisp.
    Used in Lisp, Smalltalk, Python, Java, Ruby, Perl, C#, D, Haskell, Scheme, Objective-C, etc.
    Basic algorithms:
    - Reference counting
    - Mark-and-sweep
    - Mark-and-compact
    - Copying collector
    - Generational collector
  • 3. Memory in Python
  • 4. Memory Management
    Other languages have "variables"; Python has "names" or "identifiers". Everything is an object.
    >>> a = 1
    >>> a = 2
    >>> b = a
    Memory management involves a private heap containing all objects and data structures.
    C API memory functions: PyMem_Malloc(), PyMem_Realloc(), PyMem_Free(), PyMem_New(), PyMem_Resize(), PyMem_Del()
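The name-binding model on this slide can be illustrated with a short sketch. The list values are illustrative and do not come from the slides; the point is only that assignment binds a name to an object and never copies the object.

```python
a = [1, 2]              # the name 'a' is bound to a new list object
b = a                   # 'b' is bound to the same object; nothing is copied
same_object = b is a    # two names, one object

b.append(3)             # the list is mutated through the name 'b'
seen_through_a = list(a)

a = [0]                 # rebinding 'a' leaves the old list (still named 'b') untouched
print(same_object, seen_through_a, a, b)
```

Rebinding `a` changes only what the name refers to, which is why `b` still sees the three-element list.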
  • 5. sys.getsizeof and __sizeof__
    sys.getsizeof(object[, default])
    Return the size of an object in bytes. The object can be any type of object. getsizeof() calls the object’s __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
    >>> import sys
    >>> a = 123
    >>> sys.getsizeof(a)
    24  # 64-bit version
    __sizeof__()
    Return the size of an object in bytes (without GC overhead).
    >>> a.__sizeof__()
    24  # 64-bit version
    >>> sys.getsizeof(tuple((1, 2, 3)))
    72
    >>> tuple((1, 2, 3)).__sizeof__()
    48
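The difference between the two calls can be measured directly. The sketch below assumes CPython; `gc_overhead` is a hypothetical helper name, and the exact byte counts depend on the interpreter version and platform (the slide's 64-bit CPython 2.7 session shows 72 - 48 = 24 bytes for the tuple).

```python
import sys

def gc_overhead(obj):
    """Bytes that sys.getsizeof() adds on top of obj.__sizeof__() (hypothetical helper)."""
    return sys.getsizeof(obj) - obj.__sizeof__()

number = 123
triple = tuple((1, 2, 3))

int_overhead = gc_overhead(number)      # ints are not managed by the GC: no extra header
tuple_overhead = gc_overhead(triple)    # tuples are GC-managed containers: header is added
print(int_overhead, tuple_overhead)
```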
  • 6. id and ctypes.string_at
    id(object)
    Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. CPython implementation detail: This is the address of the object in memory.
    >>> a = 123
    >>> id(a)
    30522672L
    ctypes.string_at(address[, size])
    This function returns the string starting at the memory address given by address.
    >>> import ctypes, struct
    >>> ctypes.string_at(id(a), 24)
    '\x06\x00\x00\x00\x00\x00\x00\x00\xc0G)\x1e\x00\x00\x00\x00{\x00\x00\x00\x00\x00\x00\x00'
    >>> struct.unpack('QQQ', ctypes.string_at(id(a), 24))
    (6, 506021824, 123)
  • 7. sys.getrefcount(object)
    Return the reference count of the object. The count returned is generally one higher than you might expect, because it includes the (temporary) reference as an argument to getrefcount().
    >>> sys.getrefcount(a)
    8
    >>> ctypes.c_long.from_address(id(a))
    c_long(6)
    struct.unpack(fmt, string)
    Unpack the string (presumably packed by pack(fmt, ...)) according to the given format.
    Format Q: C type unsigned long long, Python type integer, standard size 8 bytes.
    >>> struct.unpack('QQQ', ctypes.string_at(id(a), 24))
    (6, 506021824, 123)
    >>> type(a)
    <type 'int'>
    >>> id(type(a))
    506021824L
    >>> a
    123
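The decoding step can be reproduced without reading interpreter memory. This sketch packs the three values shown on the slide and unpacks them again; the explicit `<` (little-endian, standard sizes) is an assumption added so that the result does not depend on the platform, whereas the slide uses native `'QQQ'`.

```python
import struct

# Values from the slide: reference count, address of the type object, integer value.
fields = (6, 506021824, 123)

raw = struct.pack('<QQQ', *fields)      # three unsigned 8-byte integers
decoded = struct.unpack('<QQQ', raw)    # back to a tuple of Python integers

print(len(raw), decoded)
```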
  • 8. >>> struct.unpack('QQQ', ctypes.string_at(id(a), 24))
    (6, 506021824, 123)
    The three values correspond to ob_refcnt, ob_type and ob_ival.
    C code:
        typedef struct {
            PyObject_HEAD
            long ob_ival;
        } PyIntObject;

        #define PyObject_HEAD \
            _PyObject_HEAD_EXTRA \
            Py_ssize_t ob_refcnt; \
            struct _typeobject *ob_type;

        /* Expands to nothing unless Py_TRACE_REFS is defined */
        #define _PyObject_HEAD_EXTRA \
            struct _object *_ob_next; \
            struct _object *_ob_prev;
  • 9. Garbage Collector in Python
  • 10. Reference Counting
    Reference counting is one of the earliest garbage collection algorithms. It was invented by George Collins in 1960.
    Py_INCREF/Py_DECREF
    When an object's reference count is decremented to 0, the object is deallocated immediately.
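Both effects can be observed from Python code. The sketch below assumes CPython's reference counting; the `Tracked` class and the `log` list are illustrative names that do not appear in the slides.

```python
import sys

class Tracked(object):
    """Records its own finalization so the moment of deallocation is observable."""
    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append('finalized')

log = []
obj = Tracked(log)

before = sys.getrefcount(obj)
holder = [obj]                  # one more reference: the list slot
after = sys.getrefcount(obj)
added = after - before          # the list contributes exactly one reference

del holder                      # the list's reference is dropped
del obj                         # last reference gone: the count reaches 0
finalized_immediately = (log == ['finalized'])
print(added, finalized_immediately)
```

No call to the `gc` module is needed here, because the object is not part of a reference cycle.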
  • 11. GC methods
    gc.get_referrers(*objs)
    Return the list of objects that directly refer to any of objs.
    gc.get_referents(*objs)
    Return a list of objects directly referred to by any of the arguments.
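A small sketch of both calls, assuming CPython; the names `inner` and `outer` are illustrative. The referrer list may contain additional holders (for example the namespace in which `inner` is defined), so the checks only test for membership.

```python
import gc

inner = ['payload']
outer = {'key': inner}          # 'outer' refers to 'inner'

# Who points at 'inner'?
referrers = gc.get_referrers(inner)
outer_is_referrer = any(r is outer for r in referrers)

# What does 'outer' point at? For a dict these are its keys and values.
referents = gc.get_referents(outer)
inner_is_referent = any(r is inner for r in referents)

print(outer_is_referrer, inner_is_referent)
```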
  • 12. Cyclic references
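A minimal sketch of a reference cycle, assuming CPython. Two objects that refer to each other never reach a reference count of 0, so only the cycle collector can reclaim them. The `Node` class and the weak-reference probe are illustrative choices, not taken from the slides.

```python
import gc
import weakref

class Node(object):
    pass

was_enabled = gc.isenabled()
gc.disable()                    # keep automatic collections out of the demonstration

a = Node()
b = Node()
a.other = b                     # a -> b
b.other = a                     # b -> a: a reference cycle

probe = weakref.ref(a)          # lets us observe the object without keeping it alive
del a, b                        # the names are gone, but the cycle keeps both counts above 0
alive_after_del = probe() is not None

found = gc.collect()            # the cycle collector detects and frees the pair
alive_after_collect = probe() is not None

if was_enabled:
    gc.enable()
print(alive_after_del, alive_after_collect, found)
```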
  • 13. Generational algorithm of GC
    To limit the cost of garbage collection, there are two strategies:
    - make each collection faster, e.g. by scanning fewer objects
    - do fewer collections
    3 generations with thresholds:
    - generation 0 (youngest): 700
    - generation 1 (middle): 10
    - generation 2 (oldest): 10
    >>> import gc
    >>> gc.get_threshold()
    (700, 10, 10)
    A full collection runs only if the ratio long_lived_pending / long_lived_total > 25% (Python 2.7+).
    Exception: objects with a __del__ method! Unreachable cycles that contain them are not freed but placed in gc.garbage.
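The generation counters can be inspected and a single generation collected on demand. This is a sketch assuming CPython; the threshold values shown on the slide come from CPython 2.7 and may differ in other versions, so the code does not rely on them.

```python
import gc

thresholds = gc.get_threshold()     # one threshold per generation
counts_before = gc.get_count()      # current counters, one per generation

gc.collect(0)                       # collect only the youngest generation
counts_after_young = gc.get_count()

gc.collect()                        # full collection of all generations
counts_after_full = gc.get_count()

print(thresholds, counts_before, counts_after_young, counts_after_full)
```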
  • 14. Py_TPFLAGS_HAVE_GC flag
    Object types which are “containers” for other objects:
    - The Py_TPFLAGS_HAVE_GC flag is set.
    - The type must provide an implementation of the tp_traverse handler.
    >>> Py_TPFLAGS_HAVE_GC = 1 << 14
    >>> bool(type(1).__flags__ & Py_TPFLAGS_HAVE_GC)
    False
    >>> bool(type([]).__flags__ & Py_TPFLAGS_HAVE_GC)
    True
    C API:
    TYPE* PyObject_GC_New(TYPE, PyTypeObject *type)
    TYPE* PyObject_GC_NewVar(TYPE, PyTypeObject *type, Py_ssize_t size)
    /* Adds op to the set of container objects tracked by GC */
    void PyObject_GC_Track(PyObject *op)
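The flag check can be wrapped in a helper and applied to several built-in types. This is a sketch assuming CPython, where `__flags__` is exposed on type objects; `type_supports_gc` is a hypothetical helper name. `gc.is_tracked()` (available since Python 2.7) answers the related per-object question.

```python
import gc

Py_TPFLAGS_HAVE_GC = 1 << 14        # bit value shown on the slide

def type_supports_gc(obj):
    """True if obj's type sets Py_TPFLAGS_HAVE_GC (hypothetical helper, CPython-specific)."""
    return bool(type(obj).__flags__ & Py_TPFLAGS_HAVE_GC)

samples = [1, 'text', [], {}, set()]
report = [(type(s).__name__, type_supports_gc(s)) for s in samples]

list_tracked = gc.is_tracked([])    # a list instance is tracked by the collector
int_tracked = gc.is_tracked(1)      # an int cannot hold references, so it is not

print(report, list_tracked, int_tracked)
```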
  • 15. Generation 0 (diagram: generation 0 as a linked list)
  • 16. Generation 0, Generation 1 (diagram)
  • 17. Weak References
    A weak reference is a reference that does not protect the referenced object from collection by a garbage collector, unlike a strong reference.
    >>> import weakref
    >>> class A(object): pass
    >>> a = A()
    >>> b = weakref.ref(a)
    >>> weakref.getweakrefcount(a)
    1
    >>> p = weakref.proxy(a)
    >>> b()
    <__main__.A object at 0x0000000001EE64A8>
    >>> del a
    >>> print(b())
    None
    >>> b
    <weakref at 0000000001E8C408; dead>
    >>> p
    <weakproxy at 0000000001EAC458 to NoneType at 00000001E297348>
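A weak reference can also notify the program when its target goes away. The sketch below assumes CPython, where the object is freed as soon as its last strong reference is deleted; `Resource` and `on_collect` are illustrative names.

```python
import weakref

class Resource(object):
    pass

collected = []

def on_collect(ref):
    # Called with the (now dead) weak reference object as its only argument
    collected.append(ref)

res = Resource()
watcher = weakref.ref(res, on_collect)
alive = watcher() is res            # the target is reachable while a strong reference exists

del res                             # last strong reference removed
dead = watcher() is None            # the weak reference now returns None
callback_ran = len(collected) == 1 and collected[0] is watcher

print(alive, dead, callback_ran)
```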
  • 18. Debug
    gc.DEBUG_*
    gc.set_debug(gc.DEBUG_LEAK)
    Tools:
    - Heapy (http://guppy-pe.sourceforge.net/)
    - Memory profiler (https://pypi.python.org/pypi/memory_profiler)
    - Python Object Graphs (http://mg.pov.lt/objgraph/)
    - gdb-heap (https://fedorahosted.org/gdb-heap/)
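A quiet way to look for leaked cycles is gc.DEBUG_SAVEALL, one of the flags combined in gc.DEBUG_LEAK, which keeps every unreachable object in gc.garbage instead of freeing it. The sketch below assumes CPython; the `Leaky` class is an illustrative name.

```python
import gc

class Leaky(object):
    pass

gc.collect()                        # start from a clean state
old_flags = gc.get_debug()
gc.set_debug(gc.DEBUG_SAVEALL)      # save unreachable objects in gc.garbage

node = Leaky()
node.me = node                      # self-referencing cycle
del node

gc.collect()
leaked = [o for o in gc.garbage if isinstance(o, Leaky)]

gc.set_debug(old_flags)             # restore the previous debug settings
del gc.garbage[:]                   # drop the saved objects
print(len(leaked))
```

Using gc.DEBUG_LEAK instead would additionally print information about the found objects to stderr.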
  • 19. Thank you
  • 20. References
    - http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
    - http://docs.python.org/2/library/gc.html
    - http://svn.python.org/view/python/trunk/Modules/gcmodule.c?revision=81029
    - http://patshaughnessy.net/2013/10/30/generational-gc-in-python-and-ruby
    - http://asvetlov.blogspot.ru/2008/11/blog-post.html
    - http://habrahabr.ru/post/193890/
    - http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
    - http://foobarnbaz.com/2012/07/08/understanding-python-variables/
    - http://habrahabr.ru/company/wargaming/blog/198140/
    - http://en.wikipedia.org/wiki/Weak_reference
  • 21. Q & A @delimitry