Profiling and optimization

Published in: Technology
  1. Gayatri Nittala
  2. About:
      - Optimization: what, when and where?
      - General optimization strategies
      - Python-specific optimizations
      - Profiling
  3. Golden Rule: "First make it work. Then make it right. Then make it fast!" - Kent Beck
  4. The process:
      1. Get it right.
      2. Test that it's right.
      3. Profile if slow.
      4. Optimize.
      5. Repeat from 2.
      Supporting tools: test suites, source control
  5. Make it right and pretty! Good programming :-)
      - General optimization strategies
      - Python optimizations
  6. Optimization:
      - Aims to improve, not to perfect, the result
      - Programming with performance tips
  7. When to start?
      Need for optimization
      - Are you sure you need to do it at all?
      - Is your code really so bad?
      - Benchmarking
      - Fast enough vs. faster
      Time for optimization
      - Is it worth the time to tune it?
      - How much time is going to be spent running that code?
  8. When to start?
      Cost of optimization
      - Costly developer time
      - Addition of new features
      - New bugs in algorithms
      - Speed vs. space
      Optimize only if necessary!
  9. Where to start?
      - Are you sure you're done coding?
      - Frosting a half-baked cake
      - "Premature optimization is the root of all evil!" - Don Knuth
      - Working, well-architected code is always a must
  10. General strategies
      - Algorithms: the big-O notation
      - Architecture
      - Choice of data structures
      - LRU techniques
      - Loop-invariant code out of loops
      - Nested loops
      - try...catch instead of if...else
      - Multithreading for I/O-bound code
      - DBMS instead of flat files
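Two of the strategies listed above, moving loop-invariant code out of loops and LRU caching, can be sketched as follows. This is a minimal illustration, assuming Python 3 (the slides predate `functools.lru_cache`); the function names `norms_slow`, `norms_fast` and `fib` are hypothetical and not taken from the talk.

```python
import math
from functools import lru_cache


def norms_slow(points, scale):
    """Recomputes the same factor on every iteration."""
    out = []
    for x, y in points:
        factor = math.sqrt(scale) * 2.0  # loop-invariant, yet evaluated each time
        out.append(factor * math.hypot(x, y))
    return out


def norms_fast(points, scale):
    """Same result, with the invariant computed once before the loop."""
    factor = math.sqrt(scale) * 2.0
    return [factor * math.hypot(x, y) for x, y in points]


@lru_cache(maxsize=128)
def fib(n):
    """Naive recursion made cheap by caching recent results (LRU eviction)."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Whether either change pays off in a real program is something to confirm with a profiler rather than assume.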
  11. General strategies: Big-O, the boss!
      - Performance of the algorithms
      - A function of N, the input size to the algorithm
      - O(1): constant time
      - O(log n): logarithmic
      - O(n): linear
      - O(n^2): quadratic
  12. Common big-O's

      Order        Said to be "... time"   Examples
      ---------------------------------------------------------------------
      O(1)         constant                key in dict
                                           dict[key] = value
                                           list.append(item)
      O(log n)     logarithmic             binary search
      O(n)         linear                  item in sequence
                                           str.join(list)
      O(n log n)                           list.sort()
      O(n^2)       quadratic               nested loops (with constant-time bodies)
  13. Note the notation
      O(N^2):
          def slow(it):
              result = []
              for item in it:
                  result.insert(0, item)
              return result
      O(N):
          def fast(it):
              result = []
              for item in it:
                  result.append(item)
              result.reverse()
              return result
      The append loop can also be written as:
          result = list(it)
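The difference between the two versions can be measured rather than taken on faith. The sketch below repeats the slide's `slow` and `fast` functions and times them with `timeit`; the `compare` helper and its default sizes are assumptions added for illustration, and the absolute timings will vary by machine.

```python
import timeit


def slow(it):
    result = []
    for item in it:
        result.insert(0, item)  # O(N) shift on every insert, so O(N^2) overall
    return result


def fast(it):
    result = list(it)  # O(N) copy
    result.reverse()   # O(N) in-place reversal
    return result


def compare(n=2000, number=5):
    """Return (seconds for slow, seconds for fast) on a list of n items."""
    data = list(range(n))
    t_slow = timeit.timeit(lambda: slow(data), number=number)
    t_fast = timeit.timeit(lambda: fast(data), number=number)
    return t_slow, t_fast
```

Doubling `n` should roughly quadruple the time for `slow` while only doubling it for `fast`.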
  14. Big-O's of Python building blocks
      - lists: vectors
      - dictionaries: hash tables
      - sets: hash tables
  15. Big-O's of Python building blocks
      Let L be any list, T any string (plain or Unicode), D any dict, and S any set, with (say) numbers as items (with O(1) hashing and comparison), and x any number.
      O(1):
      - len(L), len(T), len(D), len(S)
      - L[i], T[i], D[i], del D[i]
      - if x in D, if x in S
      - S.add(x), S.remove(x)
      - additions or removals to/from the right end of L
  16. Big-O's of Python building blocks
      O(N):
      - loops on L, T, D, S
      - general additions or removals to/from L (not at the right end)
      - all methods on T
      - if x in L, if x in T
      - most methods on L
      - all shallow copies
      O(N log N):
      - L.sort in general (but O(N) if L is already nearly sorted or reverse-sorted)
  17. Right data structure
      - lists, sets, dicts, tuples
      - collections: deque, defaultdict, namedtuple
      - Choose them based on the functionality:
          - search for an element in a sequence
          - append
          - intersection
          - remove from the middle
          - dictionary initializations
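  18. Right data structure
      - Search for an element in a sequence:
          my_list = range(n)
          n in my_list               # list: O(N) search
          my_list = set(range(n))
          n in my_list               # set: O(1) search
      - Remove from the middle:
          my_list[start:end] = []    # list
          my_deque.rotate(-end)      # deque
          for counter in range(end - start):
              my_deque.pop()
          my_deque.rotate(start)     # rotate back to restore the original order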
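The two idioms on this slide can be written as small runnable helpers. This is a sketch assuming Python 3; `count_hits` and `delete_slice` are hypothetical names, and the final `rotate(start)` is an addition that puts the surviving elements back in their original order.

```python
from collections import deque


def count_hits(candidates, haystack):
    """Count candidates present in haystack, using a set for O(1) lookups."""
    lookup = set(haystack)  # one O(N) conversion, then constant-time tests
    return sum(1 for c in candidates if c in lookup)


def delete_slice(d, start, end):
    """Remove d[start:end] from a deque in place using rotate and pop."""
    d.rotate(-end)                 # elements before `end` move to the right end
    for _ in range(end - start):
        d.pop()                    # drop elements end-1 down to start
    d.rotate(start)                # bring the first `start` elements back to the front
    return d
```

For example, `delete_slice(deque(range(6)), 2, 4)` leaves the elements 0, 1, 4, 5 in the deque.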
  19. Right data structure
          s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
      - With defaultdict:
          d = defaultdict(list)
          for k, v in s:
              d[k].append(v)
          d.items()
          [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
      - With a plain dict and setdefault:
          d = {}
          for k, v in s:
              d.setdefault(k, []).append(v)
          d.items()
          [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
  20. Python performance tips
      - built-in modules
      - string concatenation
      - lookups and local variables
      - dictionary initialization
      - dictionary lookups
      - import statements
      - loops
  21. Built-ins
      - Highly optimized
      - Sort a list of tuples by its n-th field
        Decorate, sort, undecorate by hand:
          def sortby(somelist, n):
              nlist = [(x[n], x) for x in somelist]
              nlist.sort()
              return [val for (key, val) in nlist]
        Using the built-in key support (nlist is the list of tuples to sort):
          n = 1
          import operator
          nlist.sort(key=operator.itemgetter(n))
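A side-by-side version of the two sorting approaches, as a sketch in Python 3. The names `sortby_dsu` and `sortby_key` are hypothetical; the second uses `sorted` with `operator.itemgetter`, which returns a new list instead of sorting in place as the slide does.

```python
from operator import itemgetter


def sortby_dsu(somelist, n):
    """Decorate-sort-undecorate: sort by field n using temporary tuples."""
    decorated = [(x[n], x) for x in somelist]
    decorated.sort()  # ties on field n fall back to comparing the whole item
    return [val for (key, val) in decorated]


def sortby_key(somelist, n):
    """Let the built-in sort extract the key; stable for equal keys."""
    return sorted(somelist, key=itemgetter(n))
```

The two can differ when several items share the same n-th field: the key-based sort keeps their original order, while the decorated sort compares the remaining fields.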
  22. String concatenation
      - Avoid:
          s = ""
          for substring in list:
              s += substring
      - Prefer:
          s = "".join(list)
      - Avoid:
          out = "<html>" + head + prologue + query + tail + "</html>"
      - Prefer:
          out = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
          out = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>" % locals()
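The string-building variants from this slide, wrapped in functions so they can be compared or timed. This is a sketch; the function names are hypothetical, and how large the gap is depends on the interpreter version, so it is worth measuring on your own data.

```python
def build_concat(parts):
    """Repeated += may copy the growing string on each step."""
    s = ""
    for p in parts:
        s += p
    return s


def build_join(parts):
    """str.join computes the total size once and copies each piece once."""
    return "".join(parts)


def render(head, prologue, query, tail):
    """One formatting operation instead of a chain of + operators."""
    return "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
```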
  23. Searching:
      - Using 'in'
          - O(1) if the right-hand side is a set/dictionary
          - O(N) if the right-hand side is a string/list/tuple
      - Using 'hasattr'
          - if the searched value is an attribute
          - if the searched value is not an attribute
  24. Loops:
      - List comprehensions
      - map, as the for loop is moved to C, if the body of the loop is a function call
      - Explicit loop:
          newlist = []
          for word in oldlist:
              newlist.append(word.upper())
      - List comprehension:
          newlist = [s.upper() for s in oldlist]
      - map:
          newlist = map(str.upper, oldlist)
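The three loop forms, written as functions under the assumption of Python 3. One difference from the slide's Python 2 code: in Python 3 `map` returns a lazy iterator, so it is wrapped in `list` here to get the same result. The function names are hypothetical.

```python
def upper_loop(oldlist):
    """Explicit for loop with an append call per item."""
    newlist = []
    for word in oldlist:
        newlist.append(word.upper())
    return newlist


def upper_comprehension(oldlist):
    """List comprehension: the append happens inside the interpreter."""
    return [s.upper() for s in oldlist]


def upper_map(oldlist):
    """map with a built-in method; list() forces evaluation in Python 3."""
    return list(map(str.upper, oldlist))
```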
  25. Lookups and local variables:
      - Evaluating function references in loops
      - Accessing local variables vs. global variables
          upper = str.upper
          newlist = []
          append = newlist.append
          for word in oldlist:
              append(upper(word))
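To check whether caching lookups in local variables helps on a given interpreter, the two styles can be timed against each other. This is a sketch: `time_both`, its default sizes and the two function names are assumptions for illustration, and newer Python versions may narrow the gap.

```python
import timeit


def upper_attribute_lookup(oldlist):
    """Looks up str.upper and newlist.append on every iteration."""
    newlist = []
    for word in oldlist:
        newlist.append(str.upper(word))
    return newlist


def upper_local_lookup(oldlist):
    """Binds both callables to local names once, before the loop."""
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in oldlist:
        append(upper(word))
    return newlist


def time_both(n=10000, number=20):
    """Return a dict mapping function name to elapsed seconds."""
    words = ["word%d" % i for i in range(n)]
    results = {}
    for func in (upper_attribute_lookup, upper_local_lookup):
        results[func.__name__] = timeit.timeit(lambda: func(words), number=number)
    return results
```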
  26. Dictionaries
      - Initialization: try...except
      - Lookups: string.maketrans
      Regular expressions:
      - REs are better than writing a loop
      - Built-in string functions are better than REs
      - Compiled REs are significantly faster
          re.search('^[A-Za-z]+$', source)
          x = re.compile('^[A-Za-z]+$').search
          x(source)
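A sketch of the two ideas on this slide: initializing dictionary entries with try...except, and compiling a regular expression once and reusing its bound `search` method. The names `count_words` and `only_alpha` are hypothetical, and the pattern is the one shown on the slide.

```python
import re


def count_words(words):
    """Count occurrences; the except branch runs only for first sightings."""
    counts = {}
    for w in words:
        try:
            counts[w] += 1
        except KeyError:
            counts[w] = 1
    return counts


# Compile once at import time and keep a reference to the bound method.
is_alpha = re.compile(r"^[A-Za-z]+$").search


def only_alpha(items):
    """Keep strings made up solely of ASCII letters."""
    return [s for s in items if is_alpha(s)]
```

The try...except form tends to pay off when most keys repeat; when most keys are new, the cost of raising exceptions can outweigh the saving.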
  27. Imports
      - Avoid import *
      - Import only when required (inside functions)
      - Lazy imports
      exec and eval
      - Better to avoid
      - Compile and evaluate
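A minimal sketch of a function-level (lazy) import and of compiling an expression once before evaluating it repeatedly. The names `to_json`, `EXPR` and `evaluate` are hypothetical; the empty `__builtins__` mapping only keeps the example tidy and should not be read as a security sandbox.

```python
def to_json(obj):
    """The json module is loaded on the first call, not at program start."""
    import json
    return json.dumps(obj, sort_keys=True)


# Parse and compile the expression once; eval then reuses the code object.
EXPR = compile("a * b + 1", "<expr>", "eval")


def evaluate(a, b):
    """Evaluate the precompiled expression with the given variables."""
    return eval(EXPR, {"__builtins__": {}}, {"a": a, "b": b})
```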
  28. Summary on loop optimization (extracted from an essay by Guido)
      - Only optimize when there is a proven speed bottleneck
      - Small is beautiful
      - Use intrinsic operations
      - Avoid calling functions written in Python in your inner loop
      - Local variables are faster than globals
      - Try to use map(), filter() or reduce() to replace an explicit for loop (map with a built-in, for loop with inline code)
      - Check your algorithms for quadratic behaviour
      - And last but not least: collect data. Python's excellent profile module can quickly show the bottleneck in your code
  29. Might be unintentional, better not to be intuitive! The right answer to improve performance: use PROFILERS
  30. Spot it right!
      - Hotspots
      - Fact and fake (profiler vs. programmer's intuition!)
          - Threads
          - IO operations
          - Logging
          - Encoding and decoding
          - Lookups
      - Rewrite just the hotspots!
          - Psyco/Pyrex
          - C extensions
  31. Profilers
      - timeit/time.clock
      - profile/cProfile
      - Visualization
          - RunSnakeRun
          - Gprof2Dot
          - PyCallGraph
  32. timeit
      - Precise performance of small code snippets
      - The two convenience functions, timeit and repeat:
          timeit.repeat(stmt[, setup[, timer[, repeat=3[, number=1000000]]]])
          timeit.timeit(stmt[, setup[, timer[, number=1000000]]])
      - Can also be used from the command line:
          python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-h] [statement ...]
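A small wrapper around `timeit.repeat`, as a sketch. `best_time` and its defaults are assumptions for illustration; taking the minimum of the repeats follows the usual advice that the lowest figure is the one least disturbed by other activity on the machine.

```python
import timeit


def best_time(stmt, setup="pass", repeat=3, number=1000):
    """Run stmt `number` times per trial and return the fastest trial in seconds."""
    trials = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(trials)
```

For example, `best_time("x += 1", setup="x = 0")` times the statement used in the command-line examples on the next slide.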
  33. timeit
          import timeit
          timeit.timeit('for i in xrange(10): oct(i)', 'gc.enable()')
          1.7195474706909972
          timeit.timeit('for i in range(10): oct(i)', 'gc.enable()')
          2.1380978155005295

          python -m timeit -n1000 -s'x=0' 'x+=1'
          1000 loops, best of 3: 0.0166 usec per loop
          python -m timeit -n1000 -s'x=0' 'x=x+1'
          1000 loops, best of 3: 0.0169 usec per loop
  34. timeit
          python -mtimeit "try:" " str.__nonzero__" "except AttributeError:" " pass"
          1000000 loops, best of 3: 1.53 usec per loop
          python -mtimeit "try:" " int.__nonzero__" "except AttributeError:" " pass"
          10000000 loops, best of 3: 0.102 usec per loop
  35. timeit
      test_timeit.py:
          def f():
              try:
                  str.__nonzero__
              except AttributeError:
                  pass

          if __name__ == '__main__':
              f()

          python -mtimeit -s "from test_timeit import f" "f()"
          100000 loops, best of 3: 2.5 usec per loop
  36. cProfile/profile
      - Deterministic profiling
      - The run-time performance
      - With statistics
      - Small snippets bring big changes!
          import cProfile
          cProfile.run(command[, filename])

          python -m cProfile myscript.py [-o output_file] [-s sort_order]
  37. cProfile statistics
          E:pycon12>profile_example.py
          100004 function calls in 0.306 CPU seconds

          Ordered by: standard name

          ncalls  tottime  percall  cumtime  percall filename:lineno(function)
               1    0.014    0.014    0.014    0.014 :0(setprofile)
               1    0.000    0.000    0.292    0.292 <string>:1(<module>)
               1    0.000    0.000    0.306    0.306 profile:0(example())
               0    0.000             0.000          profile:0(profiler)
               1    0.162    0.162    0.292    0.292 profile_example.py:10(example)
          100000    0.130    0.000    0.130    0.000 profile_example.py:2(check)
  38. Using the stats
      - The pstats module
      - View and compare stats
          import cProfile
          cProfile.run('foo()', 'fooprof')
          import pstats
          p = pstats.Stats('fooprof')
          p.strip_dirs().sort_stats(-1).print_stats()
          p.sort_stats('cumulative').print_stats(10)
          p.sort_stats('file').print_stats('__init__')
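The cProfile and pstats calls from the last slides can be combined into one helper that profiles a single call and returns the report as a string. This is a sketch assuming Python 3; `check`, `example` and `profile_report` are hypothetical stand-ins, loosely modelled on the function names visible in the sample statistics, not the author's actual script.

```python
import cProfile
import io
import pstats


def check(i):
    """Cheap function that is called many times."""
    return i % 2 == 0


def example(n=1000):
    """Calls check() n times, so check dominates the call count."""
    return sum(1 for i in range(n) if check(i))


def profile_report(func, *args):
    """Profile one call and return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.strip_dirs().sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

Calling `profile_report(example, 100000)` produces a table in the same layout as the statistics slide, with the ncalls, tottime and cumtime columns.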
  39. Visualization
      - A picture is worth a thousand words!
      - Other tools to visualize profiles:
          - kcachegrind
          - RunSnakeRun
          - GProf2Dot
          - PyCallGraph
          - PyProf2CallTree
  40. RunSnakeRun
          E:pycon12>runsnake D:simulation_gui.profile
  41. Don't be too clever. Don't sweat it too much. Develop an instinct for the sort of code that Python runs well.
  42. References
      - http://docs.python.org
      - http://wiki.python.org/moin/PythonSpeed/PerformanceTips/
      - http://sschwarzer.com/download/optimization_europython2006.pdf
      - http://oreilly.com/python/excerpts/python-in-a-nutshell/testing-debugging.html
  43. Questions?
