Delivered by Ben Lerner at the 2016 New York R Conference on April 8th and 9th at Work-Bench.

Published in:
Data & Analytics


- 1. High-Performance Python
- 2. Python is fast! • Python is fast to write, but natively 10x - 100x slower than C. • Python has great C interop, so you can use C for the slow parts. • This makes Python competitive with C.
- 3. Before you try this at home… • “Premature optimization is the root of all evil.” • Use external standards for how fast your code needs to be. • Remember: performance is a tradeoff against readability, maintainability, and developer time.
- 4. Part I: General Optimization
- 5. Profile Your Code • 95%+ of your code is irrelevant to performance. • A profiler tells you which 5% is important.
- 6. Profile Your Code • In Python, use cProfile. Source: https://ymichael.com/2014/03/08/profiling-python-with-cprofile.html
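As a rough sketch of what a cProfile session might look like, the snippet below profiles a deliberately slow function and prints the most expensive calls by cumulative time. The functions `slow_sum` and `workload` are hypothetical stand-ins, not code from the talk.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: sums squares in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    """Hypothetical entry point that calls the hot spot repeatedly."""
    return [slow_sum(10_000) for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
results = workload()
profiler.disable()

# Sort by cumulative time and show only the top five entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

For a whole script, `python -m cProfile -s cumulative script.py` gives a similar report without changing the code.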
- 7. Basics • Make sure your Big-O performance is optimal. • Move operations outside of loops. • Use caching for repeated calculations. • Apply algebraic simplifications.
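Two of the basics above, moving work out of loops and caching repeated calculations, can be sketched as follows. The function names are hypothetical illustrations; `functools.lru_cache` is one standard-library way to cache, not necessarily the approach the talk had in mind.

```python
import math
from functools import lru_cache


def normalize_slow(values):
    """Recomputes the same norm on every iteration (quadratic overall)."""
    return [v / math.sqrt(sum(x * x for x in values)) for v in values]


def normalize_fast(values):
    """Computes the loop-invariant norm once, outside the loop."""
    norm = math.sqrt(sum(x * x for x in values))
    return [v / norm for v in values]


@lru_cache(maxsize=None)
def fib_cached(n):
    """Naive recursion made linear by caching repeated subproblems."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


print(normalize_fast([3.0, 4.0]))
print(fib_cached(50))
```

Both `normalize_*` functions return the same values; only the amount of repeated work differs.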
- 8. Accidentally Quadratic The *most* common issue:

      def find_intersection(list_one, list_two):
          intersection = []
          for a in list_one:
              if a in list_two:  # scans list_two on every iteration
                  intersection.append(a)
          return intersection
- 9. Accidentally Quadratic The *most* common issue:

      def find_intersection(list_one, list_two):
          intersection = []
          for a in list_one:
              if a in list_two:  # scans list_two on every iteration
                  intersection.append(a)
          return intersection

      def find_intersection(list_one, list_two):
          intersection = []
          list_two = set(list_two)  # set membership is a hash lookup
          for a in list_one:
              if a in list_two:
                  intersection.append(a)
          return intersection
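To see the difference between the two versions on your own machine, a timing sketch along these lines may help. The input sizes and the `_list`/`_set` function names are arbitrary choices for illustration, and the set version assumes the elements are hashable.

```python
import timeit


def find_intersection_list(list_one, list_two):
    """Quadratic: each `in` test scans list_two from the start."""
    return [a for a in list_one if a in list_two]


def find_intersection_set(list_one, list_two):
    """Linear on average: set membership is a hash lookup."""
    lookup = set(list_two)
    return [a for a in list_one if a in lookup]


# Hypothetical inputs: two overlapping ranges of 2,000 integers each.
list_one = list(range(0, 2000))
list_two = list(range(1000, 3000))

slow = timeit.timeit(lambda: find_intersection_list(list_one, list_two), number=5)
fast = timeit.timeit(lambda: find_intersection_set(list_one, list_two), number=5)
print(f"list membership: {slow:.4f}s, set membership: {fast:.4f}s")
```

The gap widens as the inputs grow, since only the list version does work proportional to the product of the two lengths.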
- 10. Business Logic Leverage business logic. You’ll often have NP-Complete optimizations to make. The underlying business reasoning should guide your approximations.
- 11. Part II: Python Optimization
- 12. Libraries • Use numpy, scipy, pandas, scikit-learn, etc. • Incredible built-in functionality. If you need something esoteric, try combining built-ins or adapting a more general built-in approach. • Extremely fast, thoroughly optimized, and best of all, already written.
- 13. Pure Python Tips • Function calls are expensive. Avoid them and avoid recursion. • Check the runtime of built-in data types. • Make variables local. Global lookups are expensive. • Use map/filter/reduce instead of for loops; they’re written in C.
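The local-variable and map tips can be compared with a small benchmark like the one below. The three `roots_*` functions are hypothetical examples, and the size of any speedup depends on the Python version, so treat this as something to measure rather than a guaranteed win.

```python
import math
import timeit


def roots_global(values):
    """Looks up `math` and its `sqrt` attribute on every iteration."""
    out = []
    for v in values:
        out.append(math.sqrt(v))
    return out


def roots_local(values):
    """Binds the function to a local name once, before the loop."""
    sqrt = math.sqrt
    out = []
    for v in values:
        out.append(sqrt(v))
    return out


def roots_map(values):
    """Lets map() drive the loop from C."""
    return list(map(math.sqrt, values))


values = list(range(100_000))
for func in (roots_global, roots_local, roots_map):
    elapsed = timeit.timeit(lambda: func(values), number=20)
    print(f"{func.__name__}: {elapsed:.4f}s")
```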
- 14. Mixed Tips • Vectorize! numpy arrays are much faster than lists.
- 15. Mixed Tips • Vectorize! numpy arrays are much faster than lists.

      def complex_sum(in_list):
          in_list = [(a + 2) for a in in_list]
          # more transformations
          return sum(in_list)

      import numpy as np

      def complex_sum(in_list):
          in_list = np.array(in_list)
          in_list += 2  # applied to every element in compiled code
          # more transformations
          return in_list.sum()
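A quick way to check the vectorization claim is to time both styles on the same data. This is a sketch under a few assumptions: the function names and the one-million-element input are made up for illustration, and the array is built once up front, so the list-to-array conversion cost is not included in the numpy timing.

```python
import timeit

import numpy as np


def complex_sum_list(in_list):
    """Pure-Python version: builds a new list, then sums it."""
    return sum([(a + 2) for a in in_list])


def complex_sum_numpy(in_array):
    """Vectorized version: the add and the sum both run in compiled code."""
    return (in_array + 2).sum()


data_list = list(range(1_000_000))
# Explicit 64-bit dtype so the sum cannot overflow on any platform.
data_array = np.array(data_list, dtype=np.int64)

print("list :", timeit.timeit(lambda: complex_sum_list(data_list), number=5))
print("numpy:", timeit.timeit(lambda: complex_sum_numpy(data_array), number=5))
```

If the data starts life as a Python list, the conversion itself takes time, so the benefit is largest when several transformations run on the array before the final sum.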
- 16. Mixed Tips • Vectorize! numpy arrays are much faster than lists. • Array allocation can be a bottleneck. Try moving it outside of loops.
- 17. Mixed Tips • Vectorize! numpy arrays are much faster than lists. • Array allocation can be a bottleneck. Try moving it outside of loops.

      # Before: a new array is allocated on every iteration
      n = 10 ** 3
      output = 0
      for i in range(10**9):  # xrange in Python 2
          result = np.zeros(n)
          ## calculations ##
          output += result.sum()

      # After: one array is allocated once and reused
      result = np.zeros(10**3)
      output = 0
      for i in range(10**9):  # xrange in Python 2
          result[:] = 0  # zero out array
          ## calculations ##
          output += result.sum()
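One way to fill in the "calculations" step without allocating temporaries is numpy's `out=` argument, sketched below. The calculation itself (`base * i`) is a hypothetical placeholder, and the iteration count is kept small so the snippet finishes quickly.

```python
import numpy as np

n = 1000
iterations = 1000  # kept small here; the slide's loop runs 10**9 times

# Allocate the scratch buffer once, outside the loop.
result = np.empty(n)
base = np.arange(n, dtype=float)

output = 0.0
for i in range(iterations):
    # Hypothetical calculation: write base * i into the existing buffer
    # instead of letting `base * i` allocate a fresh array each time.
    np.multiply(base, i, out=result)
    output += result.sum()

print(output)
```

Because every element of `result` is overwritten on each pass, the buffer does not need to be zeroed first.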
- 18. Last Resort: C • Cython: inline C code directly into Python.
- 19. Last Resort: C • Cython: inline C code directly into Python.

      # Pure Python
      def fib(n):
          a = 0
          b = 1
          while b < n:
              temp = b
              b = a + b
              a = temp
          return b

      # Cython: the same function with C type declarations
      def fib(int n):
          cdef int a, b, temp
          a = 0
          b = 1
          while b < n:
              temp = b
              b = a + b
              a = temp
          return b
- 20. Last Resort: C • Cython: inline C code directly into Python.

      def fib(int n):
          cdef int a, b, temp
          a = 0
          b = 1
          while b < n:
              temp = b
              b = a + b
              a = temp
          return b
- 21. Last Resort: C • Cython: inline C code directly into Python. • C extensions: write C and call it from Python.
- 22. Last Resort: C • Cython: inline C code directly into Python. • C extensions: write C and call it from Python. • Limit these techniques to hot loops.
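Writing a real C extension needs a compiler and a build step, so it is not shown here. As a smaller illustration of calling existing C code from Python, the sketch below uses the standard library's `ctypes` instead. This is a swapped-in technique rather than a C extension module, and it assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library's symbols.

```python
import ctypes

# Assumption: a POSIX system (Linux, macOS), where CDLL(None) gives access
# to symbols already loaded into the process, including the C library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"high-performance python")
print(length)
```

Each `ctypes` call carries its own overhead, so this approach pays off when the C function does a substantial amount of work per call, which fits the advice to limit these techniques to hot loops.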
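- 23. Things I haven’t mentioned • Multithreading: threads do not speed up CPU-bound Python code in standard CPython, because the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time. • PyPy: a Python implementation with a JIT compiler and a different ecosystem.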
- 24. Warning Optimization is addictive.
- 25. Conclusions • Avoid premature optimizations! Have objective benchmarks you’re trying to hit. • Profile your code. You will be surprised by the results. • The gold standard for performance is highly tuned C (that’s already been written by someone else).
- 26. Resources • General: Programming Pearls (Jon Bentley); accidentallyquadratic.tumblr.com; Performance Engineering of Software Systems, 6.172, MIT OpenCourseWare • Python Specific: cProfile Docs; Cython Docs; Guido van Rossum’s advice: python.org/doc/essays/list2str • Contact me: ben@caffeinatedanalytics.com
