The document discusses garbage collection in Python. It introduces concepts like the heap, mutator, and collector. It then describes the reference counting algorithm used by CPython, which has challenges with cyclic references. It also discusses mark and sweep garbage collection used by PyPy, which can collect cyclic garbage but requires stopping program execution. The document aims to provide an overview of memory management techniques in Python.
3. Motivation
Managing memory manually is hard.
Who owns the memory?
Should I free these resources?
What happens with double frees?
Francisco Fernandez Castano (@fcofdezc) Python GC April 17, 2015 3 / 61
9. Basic concepts
Heap
A data structure in which objects may be allocated or deallocated in any
order.
Mutator
The part of a running program which executes application code.
Collector
The part of a running program responsible for garbage collection.
10. Garbage collection
Definition
Garbage collection is automatic memory management. While the
mutator runs, it routinely allocates memory from the heap. If more
memory than available is needed, the collector reclaims unused memory
and returns it to the heap.
11. CPython GC
The CPython implementation has garbage collection.
Its main GC algorithm is reference counting, backed by a cycle detector.
The cycle detector is a generational GC.
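A minimal sketch of how both mechanisms can be observed from Python code, assuming CPython (other implementations may report different values). It uses only the standard `sys` and `gc` modules; the names `payload` and `holders` are made up for the illustration.

```python
import gc
import sys

payload = object()
holders = []

# sys.getrefcount reports one extra reference for its own argument,
# so only the differences between calls are meaningful.
baseline = sys.getrefcount(payload)

holders.append(payload)            # a new reference stored on the heap
with_extra_reference = sys.getrefcount(payload)

holders.clear()                    # the extra reference is dropped again
after_release = sys.getrefcount(payload)

# The generational cycle detector exposes one threshold per generation.
thresholds = gc.get_threshold()

print(baseline, with_extra_reference, after_release, thresholds)
```

The reference count rises by one while the list holds the object and falls back once the list is cleared. The three thresholds control how often each generation of the cycle detector is scanned.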
46. Reference counting
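Pros: It is incremental: memory is freed as soon as an object's reference count drops to zero.
Cons: Reference cycles are never freed by counting alone, so a separate cycle detector is needed.
Cons: Every object carries a reference count, which adds size overhead.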
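The trade-off above can be sketched as follows, assuming CPython. The `Node` class and the helper `is_alive` are hypothetical names used only for this illustration. The cycle detector is switched off temporarily so that only reference counting is at work.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def is_alive(ref):
    # A weak reference returns None once its target has been freed.
    return ref() is not None


gc.disable()  # leave only reference counting active
try:
    # Acyclic case: the count drops to zero and the object is freed at once.
    lone = Node("lone")
    lone_ref = weakref.ref(lone)
    del lone
    lone_freed_immediately = not is_alive(lone_ref)

    # Cyclic case: each node keeps the other's count above zero.
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a
    a_ref = weakref.ref(a)
    del a, b
    cycle_survives_refcounting = is_alive(a_ref)

    # An explicit run of the cycle detector reclaims the pair.
    gc.collect()
    cycle_freed_by_detector = not is_alive(a_ref)
finally:
    gc.enable()

print(lone_freed_immediately, cycle_survives_refcounting, cycle_freed_by_detector)
```

On CPython the lone object disappears immediately, while the two-node cycle stays in memory until `gc.collect()` runs.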
48. PyPy GC
Agnostic GC
Different implementations over time
Nowadays it uses incminimark
49. Young objects
[elem * 2 for elem in elements]
balance = (a / b / c) * 4
'asdadsasd -xxx'.replace('x', 'y').replace('a', '
foo.bar()
51. PyPy GC
Minor and Major collection
Objects are moved only once: out of the nursery, when they survive a minor collection
Major collection is done incrementally (to avoid long stops)
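The idea of a minor collection can be sketched with a toy model. This is not PyPy's implementation: the class `ToyGenerationalHeap`, its fields, and the way liveness is passed in as a set are all assumptions made for the illustration. New objects start in a nursery, and survivors are moved once into an old area.

```python
class ToyGenerationalHeap:
    """Toy model: a nursery for young objects and an old area for survivors."""

    def __init__(self):
        self.nursery = []   # freshly allocated objects
        self.old = []       # objects that survived a minor collection
        self.moves = {}     # how many times each object was moved

    def allocate(self, name):
        self.nursery.append(name)
        self.moves[name] = 0
        return name

    def minor_collection(self, live):
        # Only the nursery is scanned; old objects are left where they are.
        for name in self.nursery:
            if name in live:
                self.old.append(name)
                self.moves[name] += 1
        # Everything still in the nursery is dead and is dropped in bulk.
        self.nursery = []


heap = ToyGenerationalHeap()
heap.allocate("temporary_1")
heap.allocate("kept")
heap.minor_collection(live={"kept"})

heap.allocate("temporary_2")
heap.minor_collection(live={"kept"})

print(heap.old, heap.moves["kept"])
```

The long-lived object is moved during the first minor collection and never again, while the short-lived temporaries are discarded without ever being copied.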
54. Mark and Sweep Algorithm
[Slides 54 to 57: figures illustrating the mark and sweep algorithm; images not available]
58. Mark and sweep
Pros: Can collect cycles.
Cons: A basic implementation stops the world (the mutator is paused while the collector runs)
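A minimal sketch of a basic mark and sweep pass over a toy heap, to make the two points above concrete. The `Obj` class, the `marked` flag, and the list-based heap are assumptions for the illustration and do not reflect how PyPy stores objects.

```python
class Obj:
    """Toy heap object holding references to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    # Mark phase: flag everything reachable from the roots.
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        stack.extend(obj.refs)


def sweep(heap):
    # Sweep phase: keep marked objects and reset their flag for the next run.
    survivors = []
    for obj in heap:
        if obj.marked:
            obj.marked = False
            survivors.append(obj)
    return survivors


def collect(heap, roots):
    # The mutator does not run between mark and sweep: this is the
    # stop-the-world pause of the basic algorithm.
    mark(roots)
    return sweep(heap)


a, b, c, d = (Obj(name) for name in "abcd")
a.refs.append(b)        # a -> b, reachable from the root
c.refs.append(d)        # c <-> d form a cycle that no root reaches
d.refs.append(c)

heap = [a, b, c, d]
live = collect(heap, roots=[a])
live_names = [obj.name for obj in live]
print(live_names)
```

The unreachable cycle between `c` and `d` is reclaimed because liveness is decided by reachability from the roots, not by counting references.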