
# High-Performance Python

Delivered by Ben Lerner at the 2016 New York R Conference on April 8th and 9th at Work-Bench.

Published in: Data & Analytics

### High-Performance Python

1. High-Performance Python
2. Python is fast!
   - Python is fast to write, but natively 10x to 100x slower than C.
   - Python has great C interop, so you can use C for the slow parts.
   - This makes Python competitive with C.
3. Before you try this at home…
   - “Premature optimization is the root of all evil.”
   - Use external standards for how fast your code needs to be.
   - Remember: performance is a tradeoff against readability, maintainability, and developer time.
4. Part I: General Optimization
5. Profile Your Code
   - 95%+ of your code is irrelevant to performance.
   - A profiler tells you which 5% is important.
6. Profile Your Code
   - In Python, use cProfile.
   - Source: https://ymichael.com/2014/03/08/profiling-python-with-cprofile.html
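
As a rough illustration of the cProfile advice on this slide, the sketch below profiles a small program from inside Python and prints the most expensive calls. The functions `slow_sum_of_squares` and `main` are hypothetical stand-ins invented for this example; only `cProfile`, `pstats`, and `io` come from the standard library.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive hot spot (hypothetical example function)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Hypothetical entry point that calls the hot spot several times."""
    return [slow_sum_of_squares(20_000) for _ in range(20)]


# Collect timing data for one run of main().
profiler = cProfile.Profile()
profiler.enable()
results = main()
profiler.disable()

# Format up to ten entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

To profile a whole script without editing it, the command-line form `python -m cProfile -s cumulative your_script.py` should give a similar report (`your_script.py` is a placeholder name).
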
7. Basics
   - Make sure your Big-O performance is optimal.
   - Move operations outside of loops.
   - Use caching for repeated calculations.
   - Apply algebraic simplifications.
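
Two of the basics on this slide, caching and moving work out of loops, can be sketched together. This is a minimal example under stated assumptions: `shipping_cost`, `total_cost`, and the numbers in them are hypothetical, and `functools.lru_cache` is just one standard-library way to cache a pure function.

```python
from functools import lru_cache

call_count = 0  # counts real (uncached) evaluations, for illustration only


@lru_cache(maxsize=None)
def shipping_cost(weight_kg):
    """Hypothetical stand-in for an expensive, pure calculation."""
    global call_count
    call_count += 1
    return 4.0 + 1.5 * weight_kg


def total_cost(weights, tax_rate):
    # Hoisted out of the loop: the multiplier is the same on every iteration.
    multiplier = 1.0 + tax_rate
    total = 0.0
    for w in weights:
        # Repeated weights are served from the cache instead of recomputed.
        total += shipping_cost(w) * multiplier
    return total


weights = [1, 2, 2, 5, 1, 2, 5, 5]
total = total_cost(weights, tax_rate=0.25)
print(total, call_count, shipping_cost.cache_info())
```

Caching only pays off when the function is pure and its arguments repeat; for cheap functions the cache lookup can cost more than the call it saves.
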
8. Accidentally Quadratic

   The *most* common issue:

       def find_intersection(list_one, list_two):
           intersection = []
           for a in list_one:
               if a in list_two:
                   intersection.append(a)
           return intersection

9. Accidentally Quadratic

   The *most* common issue:

       # Quadratic: `a in list_two` scans the whole list each time.
       def find_intersection(list_one, list_two):
           intersection = []
           for a in list_one:
               if a in list_two:
                   intersection.append(a)
           return intersection

       # Linear: set membership tests take constant time on average.
       def find_intersection(list_one, list_two):
           intersection = []
           list_two = set(list_two)
           for a in list_one:
               if a in list_two:
                   intersection.append(a)
           return intersection

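
One way to check the claim on this slide is to time both versions on the same input. The sketch below renames the two functions (`find_intersection_list` and `find_intersection_set`, names chosen here so both can coexist) and uses arbitrary input sizes; absolute timings will differ by machine.

```python
import timeit


def find_intersection_list(list_one, list_two):
    """Quadratic version: each membership test scans list_two."""
    intersection = []
    for a in list_one:
        if a in list_two:
            intersection.append(a)
    return intersection


def find_intersection_set(list_one, list_two):
    """Linear version: membership tests go through a set."""
    intersection = []
    lookup = set(list_two)
    for a in list_one:
        if a in lookup:
            intersection.append(a)
    return intersection


list_one = list(range(0, 2000, 2))  # multiples of 2
list_two = list(range(0, 2000, 3))  # multiples of 3

result_list = find_intersection_list(list_one, list_two)
result_set = find_intersection_set(list_one, list_two)

# Time a few repetitions of each version on identical input.
slow = timeit.timeit(lambda: find_intersection_list(list_one, list_two), number=5)
fast = timeit.timeit(lambda: find_intersection_set(list_one, list_two), number=5)
print(f"list membership: {slow:.4f}s, set membership: {fast:.4f}s")
```

Both versions keep the order and duplicates of `list_one`, so the set-based rewrite changes only the running time, not the result.
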
11. Part II: Python Optimization
12. Libraries
    - Use numpy, scipy, pandas, scikit-learn, etc.
    - Incredible built-in functionality. If you need something esoteric, try combining built-ins or adapting a more general built-in approach.
    - Extremely fast, thoroughly optimized, and best of all, already written.
13. Pure Python Tips
    - Function calls are expensive. Avoid them in hot code, and avoid recursion.
    - Check the runtime complexity of operations on built-in data types.
    - Make variables local. Global lookups are expensive.
    - Use map/filter/reduce instead of for loops where you can; they’re written in C (in Python 3, reduce lives in functools).
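
The tips about local variables and `map` can be tried out with a small experiment. This is a sketch, not a benchmark: the three function names are made up for this example, and how much each variant helps depends on the Python version and the workload.

```python
import math
import timeit


def roots_global(values):
    """Looks up the global `math` and its `sqrt` attribute on every iteration."""
    out = []
    for v in values:
        out.append(math.sqrt(v))
    return out


def roots_local(values):
    """Binds the function and the bound method to local names once."""
    sqrt = math.sqrt
    out = []
    append = out.append
    for v in values:
        append(sqrt(v))
    return out


def roots_map(values):
    """Lets map drive the loop in C, calling a built-in function."""
    return list(map(math.sqrt, values))


values = [float(i) for i in range(1000)]

for func in (roots_global, roots_local, roots_map):
    elapsed = timeit.timeit(lambda: func(values), number=200)
    print(f"{func.__name__}: {elapsed:.4f}s")
```

`map` tends to help most when the function it calls is itself a built-in; wrapping a Python `lambda` in `map` still pays the function-call cost on every element.
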
14. Mixed Tips
    - Vectorize! numpy arrays are much faster than lists.
15. Mixed Tips
    - Vectorize! numpy arrays are much faster than lists.

          import numpy as np

          # List version
          def complex_sum(in_list):
              in_list = [(a + 2) for a in in_list]
              # more transformations
              return sum(in_list)

          # numpy version
          def complex_sum(in_list):
              in_list = np.array(in_list)
              in_list += 2
              # more transformations
              return in_list.sum()

16. Mixed Tips
    - Vectorize! numpy arrays are much faster than lists.
    - Array allocation can be a bottleneck. Try moving it outside of loops.
17. Mixed Tips
    - Vectorize! numpy arrays are much faster than lists.
    - Array allocation can be a bottleneck. Try moving it outside of loops.

          import numpy as np

          # Slow: allocates a new array on every iteration.
          n = 10 ** 3
          output = 0
          for i in range(10**9):
              result = np.zeros(n)
              ## calculations ##
              output += result.sum()

          # Faster: allocate once, then reset the array in place.
          result = np.zeros(10**3)
          output = 0
          for i in range(10**9):
              result[:] = 0  # zero out array
              ## calculations ##
              output += result.sum()

18. Last Resort: C
    - Cython: inline C code directly into Python.
19. Last Resort: C
    - Cython: inline C code directly into Python.

          # Pure Python
          def fib(n):
              a = 0
              b = 1
              while b < n:
                  temp = b
                  b = a + b
                  a = temp
              return b

          # Cython: the same function with C type declarations
          def fib(int n):
              cdef int a, b, temp
              a = 0
              b = 1
              while b < n:
                  temp = b
                  b = a + b
                  a = temp
              return b

20. Last Resort: C
    - Cython: inline C code directly into Python.

          def fib(int n):
              cdef int a, b, temp
              a = 0
              b = 1
              while b < n:
                  temp = b
                  b = a + b
                  a = temp
              return b

21. Last Resort: C
    - Cython: inline C code directly into Python.
    - C extensions: write C and call it from Python.
22. Last Resort: C
    - Cython: inline C code directly into Python.
    - C extensions: write C and call it from Python.
    - Limit these techniques to hot loops.
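23. Things I haven’t mentioned
    - multithreading: basically doesn’t work for CPU-bound Python code, because CPython’s global interpreter lock (GIL) lets only one thread run Python bytecode at a time
    - PyPy: a Python JIT compiler with a different ecosystem
24. Warning
    - Optimization is addictive.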
25. Conclusions
    - Avoid premature optimizations! Have objective benchmarks you’re trying to hit.
    - Profile your code. You will be surprised by the results.
    - The gold standard for performance is highly-tuned C (that’s already been written by someone else).
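
The advice to have objective benchmarks can be made concrete by measuring against a stated time budget. The sketch below is one possible shape for such a check: `TARGET_SECONDS`, `workload`, and `meets_target` are hypothetical names and values chosen for illustration, built on the standard `timeit.repeat`.

```python
import timeit

TARGET_SECONDS = 0.05  # hypothetical budget; pick one from real requirements


def workload():
    """Hypothetical stand-in for the code path being measured."""
    return sorted(range(10_000, 0, -1))


def meets_target(func, target_seconds, repeats=5):
    """Return (best_time, passed) for the fastest of several single runs."""
    timings = timeit.repeat(func, number=1, repeat=repeats)
    best = min(timings)  # the minimum is the least noisy estimate
    return best, best <= target_seconds


best, passed = meets_target(workload, TARGET_SECONDS)
print(f"best of 5 runs: {best:.6f}s, within budget: {passed}")
```

Once the measured time is inside the budget, there is a clear signal to stop optimizing and move on.
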
26. Resources
    - General
      - Programming Pearls (Jon Bentley)
      - accidentallyquadratic.tumblr.com
      - Performance Engineering of Software Systems, 6.172, MIT OpenCourseWare
    - Python Specific
      - cProfile Docs
      - Cython Docs
      - Guido van Rossum’s advice: python.org/doc/essays/list2str
    - Contact me: ben@caffeinatedanalytics.com