Profiling and optimization
- Gayatri Nittala
About:
Optimization - what, when & where?
General Optimization strategies
Python specific optimizations
Profiling
Golden Rule:

    "First make it work.
          Then make it right.
               Then make it fast!"
                           - Kent Beck
The process:

 1. Get it right.
 2. Test it's right.
 3. Profile if slow.
 4. Optimize.
 5. Repeat from 2.

 test suites
 source control
Make it right & pretty!

 Good Programming :-)

    General optimization strategies
    Python optimizations
Optimization:
Aims to improve the result, not to perfect it
Programming with performance tips in mind
When to start?
Need for optimization
  are you sure you need to do it at all?
  is your code really so bad?
         benchmarking
         fast enough vs. faster


Time for optimization
  is it worth the time to tune it?
  how much time is going to be spent running that code?
When to start?
Cost of optimization
   costly developer time
   addition of new features
   new bugs in algorithms
   speed vs. space


            Optimize only if necessary!
Where to start?
 Are you sure you're done coding?
 frosting a half-baked cake
 Premature optimization is the root of all evil!
                                           - Don Knuth
 Working, well-architected code is always a must
General strategies
  Algorithms - the big-O notation
  Architecture
  Choice of Data structures
  LRU techniques
  Loop-invariant code out of loops (see the sketch after this list)
  Nested loops
  try...except instead of if...else
  Multithreading for I/O bound code
  DBMS instead of flat files
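
A minimal sketch of the loop-invariant point; the names (records, tax_rate) are illustrative, not from the talk:

   # recomputing an invariant inside the loop wastes work
   def totals_slow(records, tax_rate):
       result = []
       for amount in records:
           factor = 1 + tax_rate / 100.0   # recomputed every iteration
           result.append(amount * factor)
       return result

   # hoist it: the factor does not depend on the loop variable
   def totals_fast(records, tax_rate):
       factor = 1 + tax_rate / 100.0       # computed once
       result = []
       for amount in records:
           result.append(amount * factor)
       return result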
General strategies
  Big – O – The Boss!

  performance of the algorithms
  a function of N - the input size to the algorithm
    O(1)    - constant time
    O(ln n) - logarithmic
    O(n)    - linear
    O(n²)   - quadratic
Common big-O’s
Order      Said to be     Examples
           "... time"
--------------------------------------------------
O(1)       constant       key in dict
                          dict[key] = value
                          list.append(item)
O(ln n)    logarithmic    binary search
O(n)       linear         item in sequence
                          str.join(list)
O(n ln n)                 list.sort()
O(n²)      quadratic      nested loops (with constant-time bodies)
Note the notation
  O(N²):
    def slow(it):
        result = []
        for item in it:
            result.insert(0, item)
        return result

  O(N):
    def fast(it):
        result = []
        for item in it:
            result.append(item)
        result.reverse()
        return result

    result = list(it)
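
(With insert(0, item), every insertion shifts all the items already in the list, so building an N-item result costs on the order of N²/2 element moves; append only touches the right end, and the single reverse() is one extra O(N) pass.)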
Big-O’s of Python Building blocks
   lists - vectors
   dictionaries - hash tables
   sets - hash tables
Big-O’s of Python Building blocks
  Let L be any list, T any string (plain or Unicode), D any dict, S any
   set, with (say) numbers as items (with O(1) hashing and comparison),
   and x any number:

  O(1) - len(L), len(T), len(D), len(S), L[i], T[i], D[i], del D[i],
          if x in D, if x in S, S.add(x), S.remove(x), additions or
          removals to/from the right end of L
Big-O’s of Python Building blocks
  O(N) - loops on L, T, D, S; general additions or removals to/from L
          (not at the right end); all methods on T; if x in L, if x in T;
          most methods on L; all shallow copies

  O(N log N) - L.sort() in general (but O(N) if L is already nearly
   sorted or reverse-sorted)
Right Data Structure
   lists, sets, dicts, tuples
   collections - deque, defaultdict, namedtuple
   Choose based on the functionality you need
     search for an element in a sequence
     append
     intersection
     remove from middle
     dictionary initialization
Right Data Structure
   # membership in a list: O(N) linear scan
   my_list = range(n)
   n in my_list

   # membership in a set: O(1) hash lookup
   my_set = set(range(n))
   n in my_set

   # remove a slice from the middle of a list
   my_list[start:end] = []

   # deque equivalent: rotate the slice to the right end, pop it off,
   # then rotate back
   my_deque.rotate(-end)
   for counter in range(end - start):
       my_deque.pop()
   my_deque.rotate(start)
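
A quick, hedged way to see the membership difference with timeit (numbers vary by machine; Python 2 assumed):

   import timeit

   # list membership: O(N) linear scan
   print(timeit.timeit('n in seq', setup='n = 9999; seq = range(10000)',
                       number=1000))

   # set membership: O(1) hash lookup
   print(timeit.timeit('n in seq', setup='n = 9999; seq = set(range(10000))',
                       number=1000))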
Right Data Structure
  s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

   from collections import defaultdict
   d = defaultdict(list)
   for k, v in s:
       d[k].append(v)
   d.items()
   [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

   d = {}
   for k, v in s:
       d.setdefault(k, []).append(v)
   d.items()
   [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
Python Performance Tips
   built-in modules
   string concatenation
   lookups and local variables
   dictionary initialization
   dictionary lookups
   import statements
   loops
Built-ins
  - Highly optimized
  - Sort a list of tuples by its n-th field

   def sortby(somelist, n):
       nlist = [(x[n], x) for x in somelist]
       nlist.sort()
       return [val for (key, val) in nlist]

   # better: let the built-in sort do the decoration via key=
   import operator
   n = 1
   somelist.sort(key=operator.itemgetter(n))
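
For example (illustrative data; sortby as defined above):

   import operator

   pairs = [('b', 2), ('a', 3), ('c', 1)]
   print(sortby(pairs, 1))                  # [('c', 1), ('b', 2), ('a', 3)]
   pairs.sort(key=operator.itemgetter(1))   # same order, sorted in C
   print(pairs)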
String Concatenation
   s = ""
     for substring in list:
         s += substring
   s = "".join(list)
   out = "<html>" + head + prologue + query + tail +
   "</html>"
   out = "<html>%s%s%s%s</html>" % (head,
   prologue, query, tail)
   out = "<html>%(head)s%(prologue)s%(query)s%
   (tail)s</html>" % locals()
Searching:
  using 'in'
    O(1) if the RHS is a set/dictionary
    O(N) if the RHS is a string/list/tuple

  using 'hasattr'
    cheap if the searched attribute exists
    slower if it does not (the miss pays for an internal try/except)
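
A hedged sketch of the hasattr point, reusing the __nonzero__ example from the timeit slides later in this talk (Python 2 assumed):

   import timeit

   # attribute exists: fast path
   print(timeit.timeit('hasattr(int, "__nonzero__")'))
   # attribute missing: pays for the internal try/except
   print(timeit.timeit('hasattr(str, "__nonzero__")'))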
Loops:
  list comprehensions
  map - a for loop moved to C, if the body of the loop is a function call

    newlist = []
    for word in oldlist:
        newlist.append(word.upper())

    newlist = [s.upper() for s in oldlist]

    newlist = map(str.upper, oldlist)
Lookups and Local variables:
  evaluating function references in loops
  accessing local variables vs. global variables

    upper = str.upper            # hoist the attribute lookup out of the loop
    newlist = []
    append = newlist.append      # bind the bound method to a local name
    for word in oldlist:
        append(upper(word))
Dictionaries
  Initialization -- try...except
  Lookups -- string.maketrans
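
A minimal sketch of both points, assuming Python 2 (string.maketrans moved in Python 3); wdict and the sample words are illustrative:

   import string

   # initialization with try...except: cheaper than an if-test
   # when most keys are already present
   wdict = {}
   for word in ['spam', 'eggs', 'spam']:
       try:
           wdict[word] += 1
       except KeyError:
           wdict[word] = 1

   # lookups via a translation table: one C-level pass over the
   # string instead of a Python-level lookup per character
   table = string.maketrans('abc', 'xyz')
   print('aabbcc'.translate(table))   # xxyyzz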



Regular expressions:
  REs are better than writing a loop
  built-in string functions are better than REs
  compiled REs are significantly faster

    re.search('^[A-Za-z]+$', source)

    x = re.compile('^[A-Za-z]+$').search
    x(source)
Imports
  avoid import *
  import only where required (inside functions)
  lazy imports



exec and eval
  better to avoid
  if unavoidable, compile once and evaluate the code object
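
A minimal sketch of compile-once, evaluate-many:

   # parse the expression once...
   code = compile('x * x + 1', '<expr>', 'eval')

   # ...then evaluate the code object many times
   for x in range(3):
       print(eval(code, {'x': x}))   # 1, 2, 5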
Summary on loop optimization (extracted from an essay by Guido)
  only optimize when there is a proven speed bottleneck
  small is beautiful
  use intrinsic operations
  avoid calling functions written in Python in your inner loop
  local variables are faster than globals
  try to use map(), filter() or reduce() to replace an explicit for loop
   (map with a built-in function, for loop with inline code)
  check your algorithms for quadratic behaviour
  and last but not least: collect data. Python's excellent profile module
   can quickly show the bottleneck in your code
Bottlenecks might be unintentional - better not to rely on intuition!

The right answer to improve performance
          - Use PROFILERS
Spot it Right!
   Hotspots
   Fact vs. fiction (profiler vs. programmer's intuition!)
   Threads
     IO operations
     Logging
     Encoding and Decoding
     Lookups
   Rewrite just the hotspots!
   Psyco/Pyrex
   C extensions
Profilers
   timeit/time.clock
   profile/cProfile
   Visualization
     RunSnakeRun
     Gprof2Dot
     PyCallGraph
timeit
   precise performance of small code snippets
   the two convenience functions - timeit and repeat

     timeit.repeat(stmt[, setup[, timer[, repeat=3[, number=1000000]]]])
     timeit.timeit(stmt[, setup[, timer[, number=1000000]]])

   can also be used from the command line

     python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-h] [statement ...]
timeit
  import timeit

   timeit.timeit('for i in xrange(10): oct(i)', 'gc.enable()')
  1.7195474706909972

   timeit.timeit('for i in range(10): oct(i)', 'gc.enable()')
  2.1380978155005295

   python -m timeit -n1000 -s'x=0' 'x+=1'
  1000 loops, best of 3: 0.0166 usec per loop

   python -m timeit -n1000 -s'x=0' 'x=x+1'
  1000 loops, best of 3: 0.0169 usec per loop
timeit
  import timeit

   python -mtimeit "try:" "  str.__nonzero__" "except AttributeError:" "  pass"
  1000000 loops, best of 3: 1.53 usec per loop

   python -mtimeit "try:" "  int.__nonzero__" "except AttributeError:" "  pass"
  10000000 loops, best of 3: 0.102 usec per loop
timeit
  test_timeit.py

   def f():
       try:
           str.__nonzero__
       except AttributeError:
           pass

   if __name__ == '__main__':
       f()

   python -mtimeit -s "from test_timeit import f" "f()"
  100000 loops, best of 3: 2.5 usec per loop
cProfile/profile
   Deterministic profiling
   The run-time performance
   With statistics
   Small snippets bring big changes!

      import cProfile
      cProfile.run(command[, filename])

      python -m cProfile [-o output_file] [-s sort_order] myscript.py
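
The statistics on the next slide came from a script along these lines; this reconstruction is an assumption, written only to match the function names and call counts in the output (the :0(setprofile) and profile:0 rows suggest the profile module produced it):

   # profile_example.py -- hypothetical reconstruction, not the
   # original script from the talk
   def check(i):
       return i % 2 == 0

   def example():
       total = 0
       for i in range(100000):   # 100000 calls to check()
           if check(i):
               total += 1
       return total

   if __name__ == '__main__':
       import profile
       profile.run('example()')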
cProfile statistics
  E:\pycon12>profile_example.py
   100004 function calls in 0.306 CPU seconds

   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
        1    0.014    0.014    0.014    0.014  :0(setprofile)
        1    0.000    0.000    0.292    0.292  <string>:1(<module>)
        1    0.000    0.000    0.306    0.306  profile:0(example())
        0    0.000             0.000           profile:0(profiler)
        1    0.162    0.162    0.292    0.292  profile_example.py:10(example)
   100000    0.130    0.000    0.130    0.000  profile_example.py:2(check)
Using the stats
   The pstats module
   View and compare stats
      import cProfile
      cProfile.run('foo()', 'fooprof')
      import pstats
      p = pstats.Stats('fooprof')

      p.strip_dirs().sort_stats(-1).print_stats()
      p.sort_stats('cumulative').print_stats(10)
      p.sort_stats('file').print_stats('__init__')
Visualization
   A picture is worth a thousand words!
   Other tools to visualize profiles
     KCacheGrind
     RunSnakeRun
     GProf2Dot
     PyCallGraph
     PyProf2CallTree
RunSnakeRun
  E:\pycon12>runsnake D:\simulation_gui.profile
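
RunSnakeRun opens a saved stats file; a hedged sketch of producing one (the workload and file name here are stand-ins):

   import cProfile

   def main():
       # stand-in workload for the real simulation
       sum(i * i for i in range(100000))

   # write the stats to a file instead of printing them
   cProfile.run('main()', 'simulation_gui.profile')
   # then, from the shell:  runsnake simulation_gui.profile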
Don't be too clever.
Don't sweat it too much.
 Develop an instinct for the sort of code that Python runs well.
References
   http://docs.python.org
   http://wiki.python.org/moin/PythonSpeed/PerformanceTips/
   http://sschwarzer.com/download/optimization_europython2006.pdf
   http://oreilly.com/python/excerpts/python-in-a-nutshell/testing-debugging.html
Questions?
