Python internals and how they may affect your code
About me
I have never let my schooling interfere with my education.
― Mark Twain
If you want to know a bit more, check out my answers on
Stack Overflow or my GitHub repositories. ;-)
The Law of Leaky Abstractions (Joel Spolsky)
All non-trivial abstractions, to some degree, are leaky.
● TCP attempts to provide a complete abstraction of an underlying unreliable network
● Something as simple as iterating over a large two-dimensional array
● The SQL language is meant to abstract away the procedural steps that are
needed to query a database
● Network libraries like NFS and SMB let you treat files on remote
machines "as if" they were local.
● C++ string classes are supposed to let you pretend that strings are first-class data.
Wassily Kandinsky, On White 2, 1923
an abstract painting on white canvas
Objects are Python’s abstraction for data.
Every object has an identity, a type and a value.
Python data model at a glance
(Objects, values and types)
● Identity.
● The is operator and id() function.
● Object’s type.
● Mutable and immutable objects.
● An object’s mutability is determined by its type.
● Objects are never explicitly destroyed.
● Garbage collection.
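The points above can be sketched in a few lines. This is a minimal illustration, not part of the original slides; the variable names (first, alias, clone, point) are made up for the example.

```python
# Identity vs. value: two names can refer to one object,
# and two distinct objects can hold equal values.
first = [1, 2, 3]
alias = first          # same object, same identity
clone = list(first)    # new object with an equal value

print(alias is first, id(alias) == id(first))   # True True
print(clone is first, clone == first)           # False True

# Mutability is decided by the type: lists can change in place,
# so a change made through one name is visible through the other.
alias.append(4)
print(first)           # [1, 2, 3, 4]
print(clone)           # [1, 2, 3]

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 99
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True
print(type(point).__name__, tuple_is_immutable)  # tuple True
```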
Small integers are singletons
In CPython there is only one copy of each small integer.
The integers from -5 through 256 are all cached like this.
>>> 10 is 2*5
True
>>> 100 is 25*4
True
>>>
>>> 300 is 6*50
False
>>> 300 == 6*50
True
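The session above depends on the interpreter version: recent CPython releases emit a SyntaxWarning for `is` with a literal, and because the compiler folds and shares constants, `300 is 6*50` may print True there. The sketch below is an assumption-laden variant that builds the integers at run time with int(), so the compiler cannot share them. The cache range is a CPython implementation detail, not a language guarantee.

```python
# Build the integers at run time so the compiler cannot fold or share constants.
small_a = int("256")
small_b = int("256")
big_a = int("257")
big_b = int("257")

# On CPython, 256 falls inside the small-integer cache, so both names
# refer to the same object.
print(small_a is small_b)

# 257 is outside the cache; on CPython these are usually two distinct objects.
print(big_a is big_b)

# Value comparison is what you normally want, and it is always reliable.
print(big_a == big_b)
```

The practical rule: compare numbers with ==, and keep `is` for singletons such as None.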
Strings are interned
String interning: storing only one copy of each distinct
string value and reusing that single copy on demand.
Comparing pointers in C is much faster than comparing
the contents.
Dictionaries are everywhere (lots of comparisons: the global
namespace, object attributes stored in dictionaries, etc.)
Dictionaries used to hold module, class or instance
attributes have interned keys.
When to intern?
You can use the sys.intern() function (intern() in Python 2), which
enters a string in the table of interned strings, and then use the is
operator for identity testing instead of regular string comparison.
Use custom interning when you are dealing with:
● Large numbers of strings
● Lots of dictionary access or other equality tests
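A minimal sketch of custom interning with sys.intern() from Python 3. The strings are built at run time so that the compiler does not already share them; the names parts, a and b are illustrative only.

```python
import sys

parts = ["python", "_", "internals"]

# Two equal strings built at run time are normally separate objects.
raw_a = "".join(parts)
raw_b = "".join(parts)
print(raw_a == raw_b)   # True: same value

# sys.intern() returns the single canonical copy for that value,
# so interned strings with equal values are the same object.
a = sys.intern(raw_a)
b = sys.intern(raw_b)
print(a is b)           # True: same identity
```

Identity tests are only safe when every string involved has gone through sys.intern(); otherwise fall back to ==.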
Peephole optimizations
The peephole optimizer is part of the compiler
● Expressions are simplified.
● Some mutable objects are replaced with immutable ones.
Expressions are simplified
Python precalculates constant expressions at compile time:
● Numeric calculations
● Sequences
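One way to observe this, assuming CPython, is to compile a snippet and look at the constants stored in the resulting code object. The source string and the names seconds, greeting and point are made up for the example, and the exact contents of co_consts vary between versions.

```python
source = (
    "seconds = 24 * 60 * 60\n"
    "greeting = 'ab' + 'cd'\n"
    "point = (1, 2, 3)\n"
)
code = compile(source, "<example>", "exec")

# The folded results are stored as constants of the code object,
# so nothing has to be computed when the code runs.
print(code.co_consts)
print(86400 in code.co_consts)       # numeric calculation folded
print("abcd" in code.co_consts)      # string concatenation folded
print((1, 2, 3) in code.co_consts)   # tuple of constants built at compile time
```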
Mutables replaced by immutables
Membership tests against a literal:
>>> my_var in {'foo', 'bar', 'baz'}
>>>
>>> my_var in ['Royale', 'with', 'cheese']
● The set literal becomes a frozenset constant
● The list literal becomes a tuple constant
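The replacement can be checked on CPython by compiling the two membership tests and inspecting the constants of the resulting code objects. This is a sketch that relies on CPython's optimizer; other implementations may behave differently.

```python
set_code = compile("my_var in {'foo', 'bar', 'baz'}", "<example>", "eval")
list_code = compile("my_var in ['Royale', 'with', 'cheese']", "<example>", "eval")

# Collect the container constants the compiler stored in each code object.
set_consts = [c for c in set_code.co_consts if isinstance(c, frozenset)]
list_consts = [c for c in list_code.co_consts if isinstance(c, tuple)]

print(set_consts)    # one frozenset holding 'foo', 'bar', 'baz'
print(list_consts)   # [('Royale', 'with', 'cheese')]
```

Because the containers are constants, they are built once at compile time instead of on every evaluation of the test.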
Some internal optimization tips
● Iterating over a string vs. a container object (list, set, dict, etc.)
● Comparison is pretty much a nightmare (especially when mixing a float with an
integer type):
>>> import timeit
>>> timeit.timeit("562949953420000.7 < 562949953421000")
0.34722089767456055
>>>
>>> timeit.timeit("562949953420000.7 < 562949953422000")
0.113677978515625
● Smart optimizations (object.__length_hint__, range.__contains__,
etc.)
Comparison is pretty much a nightmare
Python performs a series of checks, returning the result as soon as one of the checks succeeds:
● It compares the signs of the two values.
● It compares the exponent of the float to the bit length of the integer
(whether the integer is "too big" to be a float).
● If all of these checks fail, it has to construct two new Python objects to compare.
When comparing a float 'v' to an integer/long 'w', the worst case is that:
● v and w have the same sign (both positive or both negative),
● the integer w has few enough bits that it can be held in the size_t type (typically 32 or 64
bits),
● the integer w has at least 49 bits,
● the exponent of the float v is the same as the number of bits in w.
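A small sketch of why these checks are needed at all: a float cannot represent every large integer, so converting the integer to a float and comparing would give wrong answers. Python compares the exact values instead. The names big and as_float are illustrative.

```python
big = 2**53 + 1            # not exactly representable as a 64-bit float
as_float = float(2**53)

# Converting the integer to float rounds it, so the two floats look equal...
print(float(big) == as_float)   # True

# ...but the mixed int/float comparison is exact and sees the difference.
print(big == as_float)          # False
print(big > as_float)           # True
```

The extra work in the worst case is the price of getting these answers right.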
Take a gander at Internal types
A few types used internally by the interpreter are exposed to the user.
● Code objects
● Frame objects
● Traceback objects
● Slice objects
● Static method objects
● Class method objects
Code objects
Code objects represent byte-compiled executable Python code, or bytecode.
● difference between a code object and a function object
● Special read-only attributes:
● co_name
● co_argcount
● co_cellvars
● co_code
● etc.
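A short sketch of the attributes listed above, and of the difference between a function object and its code object. The functions make_adder and add are hypothetical examples, not from the original slides.

```python
import types

def make_adder(n):
    def add(x, y=0):
        return x + y + n
    return add

add = make_adder(10)
code = add.__code__

# The code object holds the compiled bytecode and static metadata.
print(isinstance(code, types.CodeType))   # True
print(code.co_name)                       # add
print(code.co_argcount)                   # 2
print(type(code.co_code).__name__)        # bytes
print(make_adder.__code__.co_cellvars)    # ('n',): locals used by a nested function

# The function object adds the context: defaults, globals and closure.
print(add.__defaults__)                   # (0,)
print(add(1, 2))                          # 13
```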
Frame objects
Frame objects represent execution frames. They may occur in traceback objects.
● Special read-only attributes:
f_back
f_code
f_locals
Etc.
● Special writable attributes:
f_trace
frame.clear()
This method clears all references to local variables held by the frame.
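The read-only attributes can be explored with inspect.currentframe(), which returns the running frame on CPython (it may return None on implementations without frame support). The functions inner and outer are made up for this sketch.

```python
import inspect

def inner():
    secret = 42
    frame = inspect.currentframe()          # the frame executing inner()
    return (
        frame.f_code.co_name,               # code object being executed
        frame.f_back.f_code.co_name,        # the caller's frame
        frame.f_locals["secret"],           # local variables of this frame
    )

def outer():
    return inner()

result = outer()
print(result)   # ('inner', 'outer', 42)
```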
Traceback objects
Traceback objects represent a stack trace of an exception.
It is accessible as the third item of the tuple returned by sys.exc_info()
If the interpreter is interactive, it is also made available to the user as
sys.last_traceback
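A minimal sketch of reaching the traceback object inside an exception handler. The function fail is hypothetical; traceback.extract_tb() is used only to print the function names along the stack trace.

```python
import sys
import traceback

def fail():
    raise ValueError("boom")

try:
    fail()
except ValueError as exc:
    exc_type, exc_value, tb = sys.exc_info()
    # The same traceback object is also attached to the exception itself.
    same_object = tb is exc.__traceback__
    # Function names along the trace, outermost call first.
    names = [entry.name for entry in traceback.extract_tb(tb)]

print(same_object)   # True
print(names[-1])     # fail
```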
Slice objects
Slice objects are used to represent slices for __getitem__() methods.
● Special read-only attributes: start, stop, step
● Slice objects support one method: slice.indices(length)
>>> s = slice(0, 20, 2)
>>> a = list(range(10))
>>>
>>> a[s]
[0, 2, 4, 6, 8]
>>>
>>> s.indices(10)
(0, 10, 2)
>>>
>>> s.indices(30)
(0, 20, 2)
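A sketch of how a custom __getitem__() might receive a slice object and use slice.indices() to clip it to the container's length. The class Evens is a made-up example, not a standard type.

```python
class Evens:
    """A virtual sequence of the first n even numbers: 0, 2, 4, ..."""

    def __init__(self, n):
        self._n = n

    def __getitem__(self, key):
        if isinstance(key, slice):
            # indices() resolves None and negative values against the length.
            start, stop, step = key.indices(self._n)
            return [2 * i for i in range(start, stop, step)]
        if key < 0:
            key += self._n
        if not 0 <= key < self._n:
            raise IndexError("index out of range")
        return 2 * key

evens = Evens(10)
print(evens[3])       # 6
print(evens[1:8:3])   # [2, 8, 14]
print(evens[-2:])     # [16, 18]
```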
Static method objects
Static method objects provide a way of defeating the transformation of function objects to
method objects.
Static method objects are created by the built-in staticmethod() constructor.
class C:
    @staticmethod
    def func(arg1, arg2, ...): ...
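The "defeated transformation" can be seen by comparing what is stored in the class with what attribute access returns. The class and method names below are illustrative only.

```python
import types

class C:
    @staticmethod
    def add(a, b):
        return a + b

    def plain(self):
        return self

# The class dictionary stores the staticmethod wrapper object itself.
raw = C.__dict__["add"]
print(isinstance(raw, staticmethod))                  # True

# Attribute access unwraps it to the plain function: no instance is bound.
print(isinstance(C().add, types.FunctionType))        # True
print(raw.__func__ is C.add)                          # True

# A regular function becomes a bound method when retrieved from an instance.
print(isinstance(C().plain, types.MethodType))        # True

print(C.add(1, 2), C().add(1, 2))                     # 3 3
```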
Class method objects
A class method object, like a static method object, is a wrapper around another object that
alters the way in which that object is retrieved from classes and class instances.
Class method objects are created by the built-in classmethod() constructor.
class C:
    @classmethod
    def func(cls, arg1, arg2, ...): ...
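A small sketch of the altered retrieval: whether the method is reached through the class or an instance, it is bound to the class. Base, Child and create are hypothetical names for the example.

```python
class Base:
    @classmethod
    def create(cls):
        # cls is the class the method was retrieved from, not always Base.
        return cls()

class Child(Base):
    pass

# The class dictionary stores the classmethod wrapper object.
print(isinstance(Base.__dict__["create"], classmethod))   # True

# Retrieval binds the underlying function to the class.
print(Base.create.__self__ is Base)                       # True
print(Child.create.__self__ is Child)                     # True

# This makes class methods a natural fit for alternative constructors.
print(type(Child.create()).__name__)                      # Child
print(type(Child().create()).__name__)                    # Child
```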

Python internals and how they affect your code - kasra ahmadvand
