SlideShare a Scribd company logo
Pythran – C++ for Snakes
Static Compiler for High Performance
Serge Guelton & Pierrick Brunet & Mehdi Amini
PyData - May 4th 2014
Get the slides: http://goo.gl/6dgra0
1 33
Disclaimer
Timings were performed using OSX 10.9, an i7-3740QM CPU @ 2.70GHz, Python
2.7.6, current Pythran (branch Pydata2014), pypy 2.2.1, numba 0.13.1, gcc 4.8, Clang
r207887.
I am not Pythonista, but I’m interested in performance in general.
Daily job: driver-level C code, assembly, multi-threaded C++, GPU, ...
1 33
Disclaimer
Timings were performed using OSX 10.9, an i7-3740QM CPU @ 2.70GHz, Python
2.7.6, current Pythran (branch Pydata2014), pypy 2.2.1, numba 0.13.1, gcc 4.8, Clang
r207887.
I am not Pythonista, but I’m interested in performance in general.
Daily job: driver-level C code, assembly, multi-threaded C++, GPU, ...
Note: I love to troll, so let’s have a
beer later and talk about how awful
Python is! ;-)
1 33
Disclaimer
Timings were performed using OSX 10.9, an i7-3740QM CPU @ 2.70GHz, Python
2.7.6, current Pythran (branch Pydata2014), pypy 2.2.1, numba 0.13.1, gcc 4.8, Clang
r207887.
I am not Pythonista, but I’m interested in performance in general.
Daily job: driver-level C code, assembly, multi-threaded C++, GPU, ...
Note: I love to troll, so let’s have a
beer later and talk about how awful
Python is! ;-)
By the way this talk is written in Latex and takes more than 10 seconds to compile.
2 33
Prototyping Tools for Scientific Computing
2 33
Prototyping Tools for Scientific Computing
3 33
Tools for Scientific Computing in Python
FORTRAN + f2py
C + SWIG
4 33
I Do not Know Much About Python But it Does not Matter Because...
I only care about the white spot.
5 33
Regular IPython Session
>>> import numpy as np
>>> def rosen(x):
... return sum (100.*(x[1:]-x[: -1]**2.)**2.
... + (1-x[: -1])**2.)
>>> import numpy as np
>>> r = np.random.rand (100000)
>>> %timeit rosen(r)
10 loops , best of 3: 35.1 ms per loop
In mathematical optimization, the Rosenbrock function is a non-convex
function used as a performance test problem for optimization algorithms
introduced by Howard H. Rosenbrock in 1960.[1] It is also known as
Rosenbrock’s valley or Rosenbrock’s banana function. (Wikipedia)
6 33
IPython Session with Pythran
>>> %load_ext pythranmagic
>>> %% pythran
import numpy as np
#pythran export rosen(float [])
def rosen(x):
return sum (100.*(x[1:]-x[: -1]**2.)**2.
+ (1-x[: -1])**2.)
>>> import numpy as np
>>> r = np.random.rand (100000)
>>> %timeit rosen(r)
10000 loops , best of 3: 121 us per loop
6 33
IPython Session with Pythran
>>> %load_ext pythranmagic
>>> %% pythran
import numpy as np
#pythran export rosen(float [])
def rosen(x):
return sum (100.*(x[1:]-x[: -1]**2.)**2.
+ (1-x[: -1])**2.)
>>> import numpy as np
>>> r = np.random.rand (100000)
>>> %timeit rosen(r)
10000 loops , best of 3: 121 us per loop
That’s a ×290 speedup!
7 33
Pythran’s Meta Program Generator
Python Module [.py]OpenMP
Pythran
C++ meta-programboost::python pythonic++
g++Type Info NT2
Native Module [.so]
8 33
Pythran Moto
Goals
1. Performance First
sacrifice language feature
2. Subset of Python
backward compatibility matters
pythran ≈ python - useless stuff
(i.e. class, eval, introspection, polymorphic variable)
3. Modular Compilation
focus on numerical kernels
Means
1. Static Compilation
2. Static Optimizations
3. Vectorization & Parallelization
9 33
A Story of Python & C++
def dot(l0 , l1):
return sum(x*y for x,y in zip(l0 ,l1))
template <class T0 , class T1 >
auto dot(T0&& l0 , T1&& l1)
-> decltype(/* ... */)
{
return pythonic ::sum(
pythonic ::map(
operator_ :: multiply (),
pythonic ::zip(std:: forward <T0 >(l0),
std:: forward <T1 >(l1))
) );
}
10 33
C++ as a Back-end
A High Level Language
Rich Standard Template Library
Object Oriented Programming
Meta-Programming
Glimpses of type inference C++11
Variadic Templates C++11
A Low Level Language
Few Hidden Costs: ”you don’t pay for what you don’t use”
Direct access to vector instruction units SSE, AVX, . . .
Good for Parallel Programming OpenMP, TBB, . . .
(and the parallel STL is proposed for C++17)
11 33
Let’s Dive Into the Backend Runtime...
template <class Op , class S0 , class ... Iters >
auto map_aux (Op &op , S0 const &seq , Iters ... iters)
-> sequence <decltype(op(* seq.begin (), *iters ...)) >
{
decltype(_map(op , seq , iters ...)) s;
auto iter = std :: back_inserter (s);
for(auto& iseq : seq)
*iter ++= op(iseq , *iters ++...);
return s;
}
template <class Op , class S0 , class ... SN >
auto map ( Op op , S0 const &seq , SN const &... seqs )
-> decltype(_map( op , seq , seqs.begin ()...))
{
return _map(op , seq , seqs.begin ()...);
}
11 33
Let’s Dive Into the Backend Runtime... I’m Kidding!
template <class Op , class S0 , class ... Iters >
auto map_aux (Op &op , S0 const &seq , Iters ... iters)
-> sequence <decltype(op(* seq.begin (), *iters ...)) >
{
decltype(_map(op , seq , iters ...)) s;
auto iter = std :: back_inserter (s);
for(auto& iseq : seq)
*iter ++= op(iseq , *iters ++...);
return s;
}
template <class Op , class S0 , class ... SN >
auto map ( Op op , S0 const &seq , SN const &... seqs )
-> decltype(_map( op , seq , seqs.begin ()...))
{
return _map(op , seq , seqs.begin ()...);
}
12 33
Static Compilation
Buys time for many time-consuming analyses
Points-to, used-def, variable scope, memory effects, function purity. . .
Unleashes powerful C++ optimizations
Lazy Evaluation, Map Parallelizations, Constant Folding
Requires static type inference
#pythran export foo(int list , float)
Only annotate exported functions!
13 33
Pythran’s Moto 1/3
Gather as many information as possible
(Typing is just one information among others)
14 33
Example of Analysis: Points-to
def foo(a,b):
c = a or b
return c*2
Where does c points to ?
15 33
Example of Analysis: Argument Effects
def fib(n):
return n if n<2 else fib(n-1) + fib(n-2)
def bar(l):
return map(fib , l)
def foo(l):
return map(fib , random.sample(l, 3))
Do fibo, bar and foo update their arguments?
16 33
Example of Analysis: Pure Functions
def f0(a):
return a**2
def f1(a):
b = f0(a)
print b
return b
l = list (...)
map(f0 , l)
map(f1 , l)
Are f0 and f1 pure functions?
17 33
Example of Analysis: Use - Def Chains
a = ’1’
if cond:
a = int(a)
else:
a = 3
print a
a = 4
Which version of a is seen by the print statement?
18 33
Pythran’s Moto 2/3
Gather as many information as possible
Turn them into Code Optimizations!
19 33
Example of Code Optimization: False Polymorphism
a = cos (1)
if cond:
a = str(a)
else:
a = None
foo(a)
Is this code snippet statically typable?
20 33
Example of Code Optimization: Lazy Iterators
def valid_conversion (n):
l = map(math.cos , range(n))
return sum(l)
def invalid_conversion (n):
l = map(math.cos , range(n))
l[0] = 1
return sum(l) + max(l)
Which map can be converted into an imap
21 33
Example of Code Optimization: Constant Folding
def esieve(n):
candidates = range (2, n+1)
return sorted(
set(candidates) - set(p*i
for p in candidates
for i in range(p, n+1))
)
cache = esieve (100)
Can we evalute esieve at compile time?
22 33
Pythran’s Moto 3/3
Gather as many information as possible
Turn them into Code Optimizations
Vectorize! Parallelize!
23 33
Explicit Parallelization
def hyantes(xmin , ymin , xmax , ymax , step , range_ , range_x , range_y , t):
pt = [[0]* range_y for _ in range(range_x )]
#omp parallel for
for i in xrange(range_x ):
for j in xrange(range_y ):
s = 0
for k in t:
tmp = 6368.* math.acos(math.cos(xmin+step*i)* math.cos( k[0] ) *
math.cos(( ymin+step*j)-k[1]) +
math.sin(xmin+step*i)* math.sin(k[0]))
if tmp < range_:
s+=k[2] / (1+ tmp)
pt[i][j] = s
return pt
Tool CPython Pythran OpenMP
Timing 639.0ms 44.8ms 11.2ms
Speedup ×1 ×14.2 ×57.7
24 33
Library Level Optimizations
Numpy is the key
Basic block of Python Scientific Computing
High-level Array Manipulations
Many common functions implemented
1. This smells like FORTRAN
2. For the compiler guy, FORTRAN smells good
3. Unlock vectorization & parallelization of Numpy code!
Cython is known to be ”as fast as C”, the only way we found to beat it is to be ”as fast as Fortran”, hence:
Pythran
25 33
Efficient Numpy Expressions
Expression Templates
1. A classic C++ meta-programming optimization
2. Brings Lazy Evaluation to C++
3. Equivalent to loop fusion
More Optimizations
vectorization through boost::simd and nt2
parallelization through #pragma omp
26 33
Julia Set, a Cython Example
def run_julia(cr , ci , N, bound , lim , cutoff ):
julia = np.empty ((N, N), np.uint32)
grid_x = np.linspace(-bound , bound , N)
t0 = time ()
#omp parallel for
for i, x in enumerate(grid_x ):
for j, y in enumerate(grid_x ):
julia[i,j] = kernel(x, y, cr , ci , lim , cutoff)
return julia , time () - t0
From Scipy2013 Cython Tutorial.
Tool CPython Cython Pythran +OpenMP
Timing 3630ms 4.3ms 3.71ms 1.52ms
Speedup ×1 ×837 ×970 ×2368
27 33
Mandelbrot, a Numba example
@autojit
def mandel(x, y, max_iters ):
"""
Given the real and imaginary parts of a complex number ,
determine if it is a candidate for membership in the Mandelbrot
set given a fixed number of iterations.
"""
i = 0
c = complex(x, y)
z = 0.0j
for i in range(max_iters ):
z = z**2 + c
if abs(z)**2 >= 4:
return i
return 255
@autojit
def create_fractal (min_x , max_x , min_y , max_y , image , iters )
height = image.shape [0]
width = image.shape [1]
pixel_size_x = (max_x - min_x) / width
pixel_size_y = (max_y - min_y) / height
for x in range(width ):
real = min_x + x * pixel_size_x
for y in range(height ):
imag = min_y + y * pixel_size_y
color = mandel(real , imag , iters)
image[y, x] = color
return image
Tool CPython Numba Pythran
Timing 8170ms 56ms 47.2ms
Speedup ×1 ×145 ×173
”g++ -Ofast” here
28 33
Nqueens, to Show Some Cool Python
# Pure -Python implementation of
# itertools. permutations ()
# Why? Because we can :-)
def permutations (iterable , r=None ):
pool = tuple(iterable)
n = len(pool)
if r is None:
r = n
indices = range(n)
cycles = range(n-r+1, n+1)[:: -1]
yield tuple(pool[i] for i in indices [:r])
while n:
for i in reversed(xrange(r)):
cycles[i] -= 1
if cycles[i] == 0:
indices[i:] = indices[i+1:] + indices[i:i+1]
cycles[i] = n - i
else:
j = cycles[i]
indices[i], indices[-j] = 
indices[-j], indices[i]
yield tuple(pool[i] for i in indices [:r])
break
else:
return
#pythran export n_queens(int)
def n_queens( queen_count ):
"""N-Queens solver.
Args:
queen_count: the number of queens to solve
for. This is also the board size.
Yields:
Solutions to the problem. Each yielded
value is looks like (3, 8, 2, ..., 6)
where each number is the column position
for the queen , and the index into the
tuple indicates the row.
"""
out =list ()
cols = range( queen_count )
for vec in permutations (cols ,None ):
if ( queen_count == len(set(vec[i]+i
for i in cols ))):
out.append(vec)
return out
Solving the NQueen problem, using generator, generator
expression, list comprehension, sets. . .
http://code.google.com/p/unladen-swallow/
Tool CPython PyPy Pythran
Timing 2640.6ms 501.1ms 693.3ms
Speedup ×1 ×5.27 ×3.8
29 33
Numpy Benchmarks
https://github.com/serge-sans-paille/numpy-benchmarks/
Made with Matplotlib last night!
Debian clang version 3.5-1 exp1, x86-64, i7-3520M CPU @ 2.90GHz. No OpenMP/Vectorization for Pythran.
30 33
Crushing the Head of the Snake
I would have titled it ”What every Pythonista should know about Python!”
(hopefully we’ll get the video soon).
31 33
The Compiler Is not the Solution to Keyboard-Chair Interface, Is It?
def total(arr):
s = 0
for j in range(len(arr )):
s += arr[j]
return s
def varsum(arr):
vs = 0
for j in range(len(arr )):
mean = (total(arr) / len(arr ))
vs += (arr[j] - mean) ** 2
return vs
#pythran export stddev(float64 list list)
def stddev(partitions ):
ddof =1
final = 0.0
for part in partitions:
m = total(part) / len(part)
# Find the mean of the entire grup.
gtotal = total ([ total(p) for p in partitions ])
glength = total ([ len(p) for p in partitions ])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m **
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof ))
Version Awful Less Awful OK Differently OK OK Numpy
CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms
Pythran
31 33
The Compiler Is not the Solution to Keyboard-Chair Interface, Is It?
def total(arr):
s = 0
for j in range(len(arr )):
s += arr[j]
return s
def varsum(arr):
vs = 0
for j in range(len(arr )):
mean = (total(arr) / len(arr ))
vs += (arr[j] - mean) ** 2
return vs
#pythran export stddev(float64 list list)
def stddev(partitions ):
ddof =1
final = 0.0
for part in partitions:
m = total(part) / len(part)
# Find the mean of the entire grup.
gtotal = total ([ total(p) for p in partitions ])
glength = total ([ len(p) for p in partitions ])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m **
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof ))
Version Awful Less Awful OK Differently OK OK Numpy
CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms
Pythran 1.38s
31 33
The Compiler Is not the Solution to Keyboard-Chair Interface, Is It?
def total(arr):
s = 0
for j in range(len(arr )):
s += arr[j]
return s
def varsum(arr):
vs = 0
for j in range(len(arr )):
mean = (total(arr) / len(arr ))
vs += (arr[j] - mean) ** 2
return vs
#pythran export stddev(float64 list list)
def stddev(partitions ):
ddof =1
final = 0.0
for part in partitions:
m = total(part) / len(part)
# Find the mean of the entire grup.
gtotal = total ([ total(p) for p in partitions ])
glength = total ([ len(p) for p in partitions ])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m **
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof ))
Version Awful Less Awful OK Differently OK OK Numpy
CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms
Pythran 1.38s 4.7ms
31 33
The Compiler Is not the Solution to Keyboard-Chair Interface, Is It?
def total(arr):
s = 0
for j in range(len(arr )):
s += arr[j]
return s
def varsum(arr):
vs = 0
for j in range(len(arr )):
mean = (total(arr) / len(arr ))
vs += (arr[j] - mean) ** 2
return vs
#pythran export stddev(float64 list list)
def stddev(partitions ):
ddof =1
final = 0.0
for part in partitions:
m = total(part) / len(part)
# Find the mean of the entire grup.
gtotal = total ([ total(p) for p in partitions ])
glength = total ([ len(p) for p in partitions ])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m **
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof ))
Version Awful Less Awful OK Differently OK OK Numpy
CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms
Pythran 1.38s 4.7ms 4.8ms
31 33
The Compiler Is not the Solution to Keyboard-Chair Interface, Is It?
def total(arr):
s = 0
for j in range(len(arr )):
s += arr[j]
return s
def varsum(arr):
vs = 0
for j in range(len(arr )):
mean = (total(arr) / len(arr ))
vs += (arr[j] - mean) ** 2
return vs
#pythran export stddev(float64 list list)
def stddev(partitions ):
ddof =1
final = 0.0
for part in partitions:
m = total(part) / len(part)
# Find the mean of the entire grup.
gtotal = total ([ total(p) for p in partitions ])
glength = total ([ len(p) for p in partitions ])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m **
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof ))
Version Awful Less Awful OK Differently OK OK Numpy
CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms
Pythran 1.38s 4.7ms 4.8ms 5.8ms
31 33
The Compiler Is not the Solution to Keyboard-Chair Interface, Is It?
def total(arr):
s = 0
for j in range(len(arr )):
s += arr[j]
return s
def varsum(arr):
vs = 0
for j in range(len(arr )):
mean = (total(arr) / len(arr ))
vs += (arr[j] - mean) ** 2
return vs
#pythran export stddev(float64 list list)
def stddev(partitions ):
ddof =1
final = 0.0
for part in partitions:
m = total(part) / len(part)
# Find the mean of the entire grup.
gtotal = total ([ total(p) for p in partitions ])
glength = total ([ len(p) for p in partitions ])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m **
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof ))
Version Awful Less Awful OK Differently OK OK Numpy
CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms
Pythran 1.38s 4.7ms 4.8ms 5.8ms 4.7ms
31 33
The Compiler Is not the Solution to Keyboard-Chair Interface, Is It?
def total(arr):
s = 0
for j in range(len(arr )):
s += arr[j]
return s
def varsum(arr):
vs = 0
for j in range(len(arr )):
mean = (total(arr) / len(arr ))
vs += (arr[j] - mean) ** 2
return vs
#pythran export stddev(float64 list list)
def stddev(partitions ):
ddof =1
final = 0.0
for part in partitions:
m = total(part) / len(part)
# Find the mean of the entire grup.
gtotal = total ([ total(p) for p in partitions ])
glength = total ([ len(p) for p in partitions ])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m **
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof ))
Version Awful Less Awful OK Differently OK OK Numpy
CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms
Pythran 1.38s 4.7ms 4.8ms 5.8ms 4.7ms 1.7ms
32 33
Engineering Stuff
Get It
Follow the source: https://github.com/serge-sans-paille/pythran
(we are waiting for your pull requests)
Debian repo: deb http://ridee.enstb.org/debian unstable main
Join us: #pythran on Freenode (very active),
or pythran@freelists.org (not very active)
Available on PyPI, using Python 2.7, +2000 test cases, PEP8 approved, clang++ &
g++ (>= 4.8) friendly, Linux and OSX validated.
33
Conclusion
Compiling Python means more than typing and translating
http://pythonhosted.org/pythran/
33
Conclusion
Compiling Python means more than typing and translating
What next:
Release the PyData version
http://pythonhosted.org/pythran/
33
Conclusion
Compiling Python means more than typing and translating
What next:
Release the PyData version (ooops we’re late)
http://pythonhosted.org/pythran/
33
Conclusion
Compiling Python means more than typing and translating
What next:
Release the PyData version (ooops we’re late I did not specify which year)
User module import (pull request already issued)
http://pythonhosted.org/pythran/
33
Conclusion
Compiling Python means more than typing and translating
What next:
Release the PyData version (ooops we’re late I did not specify which year)
User module import (pull request already issued)
Set-up an daily performance regression test bot (ongoing work with codespeed)
http://pythonhosted.org/pythran/
33
Conclusion
Compiling Python means more than typing and translating
What next:
Release the PyData version (ooops we’re late I did not specify which year)
User module import (pull request already issued)
Set-up an daily performance regression test bot (ongoing work with codespeed)
Re-enable vectorization through boost::simd and nt2 (soon)
http://pythonhosted.org/pythran/
33
Conclusion
Compiling Python means more than typing and translating
What next:
Release the PyData version (ooops we’re late I did not specify which year)
User module import (pull request already issued)
Set-up an daily performance regression test bot (ongoing work with codespeed)
Re-enable vectorization through boost::simd and nt2 (soon)
More Numpy support, start looking into Scipy
http://pythonhosted.org/pythran/
33
Conclusion
Compiling Python means more than typing and translating
What next:
Release the PyData version (ooops we’re late I did not specify which year)
User module import (pull request already issued)
Set-up an daily performance regression test bot (ongoing work with codespeed)
Re-enable vectorization through boost::simd and nt2 (soon)
More Numpy support, start looking into Scipy
Polyhedral transformations (in another life...)
http://pythonhosted.org/pythran/

More Related Content

What's hot

A Speculative Technique for Auto-Memoization Processor with Multithreading
A Speculative Technique for Auto-Memoization Processor with MultithreadingA Speculative Technique for Auto-Memoization Processor with Multithreading
A Speculative Technique for Auto-Memoization Processor with Multithreading
Matsuo and Tsumura lab.
 
Faster Python, FOSDEM
Faster Python, FOSDEMFaster Python, FOSDEM
Faster Python, FOSDEM
Victor Stinner
 
Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorch
Jun Young Park
 
Yampa AFRP Introduction
Yampa AFRP IntroductionYampa AFRP Introduction
Yampa AFRP Introduction
ChengHui Weng
 
Tiramisu をちょっと、味見してみました。
Tiramisu をちょっと、味見してみました。Tiramisu をちょっと、味見してみました。
Tiramisu をちょっと、味見してみました。
Mr. Vengineer
 
A born-again programmer's odyssey
A born-again programmer's odysseyA born-again programmer's odyssey
A born-again programmer's odyssey
Igor Rivin
 
TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.
lnikolaeva
 
Machine learning with py torch
Machine learning with py torchMachine learning with py torch
Machine learning with py torch
Riza Fahmi
 
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov
 
Use the Matplotlib, Luke @ PyCon Taiwan 2012
Use the Matplotlib, Luke @ PyCon Taiwan 2012Use the Matplotlib, Luke @ PyCon Taiwan 2012
Use the Matplotlib, Luke @ PyCon Taiwan 2012
Wen-Wei Liao
 
Beyond tf idf why, what & how
Beyond tf idf why, what & howBeyond tf idf why, what & how
Beyond tf idf why, what & how
lucenerevolution
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
Egor Bogatov
 
Brief Introduction to Cython
Brief Introduction to CythonBrief Introduction to Cython
Brief Introduction to Cython
Aleksandar Jelenak
 
.NET 2015: Будущее рядом
.NET 2015: Будущее рядом.NET 2015: Будущее рядом
.NET 2015: Будущее рядом
Andrey Akinshin
 
Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014
Mark Rees
 
Introduction to Objective - C
Introduction to Objective - CIntroduction to Objective - C
Introduction to Objective - C
Jussi Pohjolainen
 
A formalization of complex event stream processing
A formalization of complex event stream processingA formalization of complex event stream processing
A formalization of complex event stream processing
Sylvain Hallé
 
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by OraclesEfficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Vissarion Fisikopoulos
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
Kimikazu Kato
 
SciSmalltalk: Doing Science with Agility
SciSmalltalk: Doing Science with AgilitySciSmalltalk: Doing Science with Agility
SciSmalltalk: Doing Science with Agility
ESUG
 

What's hot (20)

A Speculative Technique for Auto-Memoization Processor with Multithreading
A Speculative Technique for Auto-Memoization Processor with MultithreadingA Speculative Technique for Auto-Memoization Processor with Multithreading
A Speculative Technique for Auto-Memoization Processor with Multithreading
 
Faster Python, FOSDEM
Faster Python, FOSDEMFaster Python, FOSDEM
Faster Python, FOSDEM
 
Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorch
 
Yampa AFRP Introduction
Yampa AFRP IntroductionYampa AFRP Introduction
Yampa AFRP Introduction
 
Tiramisu をちょっと、味見してみました。
Tiramisu をちょっと、味見してみました。Tiramisu をちょっと、味見してみました。
Tiramisu をちょっと、味見してみました。
 
A born-again programmer's odyssey
A born-again programmer's odysseyA born-again programmer's odyssey
A born-again programmer's odyssey
 
TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.
 
Machine learning with py torch
Machine learning with py torchMachine learning with py torch
Machine learning with py torch
 
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
 
Use the Matplotlib, Luke @ PyCon Taiwan 2012
Use the Matplotlib, Luke @ PyCon Taiwan 2012Use the Matplotlib, Luke @ PyCon Taiwan 2012
Use the Matplotlib, Luke @ PyCon Taiwan 2012
 
Beyond tf idf why, what & how
Beyond tf idf why, what & howBeyond tf idf why, what & how
Beyond tf idf why, what & how
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
 
Brief Introduction to Cython
Brief Introduction to CythonBrief Introduction to Cython
Brief Introduction to Cython
 
.NET 2015: Будущее рядом
.NET 2015: Будущее рядом.NET 2015: Будущее рядом
.NET 2015: Будущее рядом
 
Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014
 
Introduction to Objective - C
Introduction to Objective - CIntroduction to Objective - C
Introduction to Objective - C
 
A formalization of complex event stream processing
A formalization of complex event stream processingA formalization of complex event stream processing
A formalization of complex event stream processing
 
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by OraclesEfficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
 
SciSmalltalk: Doing Science with Agility
SciSmalltalk: Doing Science with AgilitySciSmalltalk: Doing Science with Agility
SciSmalltalk: Doing Science with Agility
 

Similar to Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014

SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel write Python code, get Fortran ...
SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel  write Python code, get Fortran ...SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel  write Python code, get Fortran ...
SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel write Python code, get Fortran ...
South Tyrol Free Software Conference
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
Travis Oliphant
 
Profiling in Python
Profiling in PythonProfiling in Python
Profiling in Python
Fabian Pedregosa
 
Python Programming - IX. On Randomness
Python Programming - IX. On RandomnessPython Programming - IX. On Randomness
Python Programming - IX. On Randomness
Ranel Padon
 
Python高级编程(二)
Python高级编程(二)Python高级编程(二)
Python高级编程(二)
Qiangning Hong
 
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learnNumerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
Arnaud Joly
 
Time Series Analysis for Network Secruity
Time Series Analysis for Network SecruityTime Series Analysis for Network Secruity
Time Series Analysis for Network Secruity
mrphilroth
 
openMP loop parallelization
openMP loop parallelizationopenMP loop parallelization
openMP loop parallelization
Albert DeFusco
 
SCIPY-SYMPY.pdf
SCIPY-SYMPY.pdfSCIPY-SYMPY.pdf
SCIPY-SYMPY.pdf
FreddyGuzman19
 
Simple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorialSimple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorial
Jin-Hwa Kim
 
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
GeeksLab Odessa
 
Python profiling
Python profilingPython profiling
Python profiling
dreampuf
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
Databricks
 
Writing Faster Python 3
Writing Faster Python 3Writing Faster Python 3
Writing Faster Python 3
Sebastian Witowski
 
Compilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVMCompilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVM
Linaro
 
Pydiomatic
PydiomaticPydiomatic
Pydiomatic
rik0
 
Python idiomatico
Python idiomaticoPython idiomatico
Python idiomatico
PyCon Italia
 
Profiling and optimization
Profiling and optimizationProfiling and optimization
Profiling and optimization
g3_nittala
 
06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptx
MouDhara1
 
Write Python for Speed
Write Python for SpeedWrite Python for Speed
Write Python for Speed
Yung-Yu Chen
 

Similar to Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014 (20)

SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel write Python code, get Fortran ...
SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel  write Python code, get Fortran ...SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel  write Python code, get Fortran ...
SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel write Python code, get Fortran ...
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
Profiling in Python
Profiling in PythonProfiling in Python
Profiling in Python
 
Python Programming - IX. On Randomness
Python Programming - IX. On RandomnessPython Programming - IX. On Randomness
Python Programming - IX. On Randomness
 
Python高级编程(二)
Python高级编程(二)Python高级编程(二)
Python高级编程(二)
 
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learnNumerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
 
Time Series Analysis for Network Secruity
Time Series Analysis for Network SecruityTime Series Analysis for Network Secruity
Time Series Analysis for Network Secruity
 
openMP loop parallelization
openMP loop parallelizationopenMP loop parallelization
openMP loop parallelization
 
SCIPY-SYMPY.pdf
SCIPY-SYMPY.pdfSCIPY-SYMPY.pdf
SCIPY-SYMPY.pdf
 
Simple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorialSimple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorial
 
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
 
Python profiling
Python profilingPython profiling
Python profiling
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
 
Writing Faster Python 3
Writing Faster Python 3Writing Faster Python 3
Writing Faster Python 3
 
Compilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVMCompilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVM
 
Pydiomatic
PydiomaticPydiomatic
Pydiomatic
 
Python idiomatico
Python idiomaticoPython idiomatico
Python idiomatico
 
Profiling and optimization
Profiling and optimizationProfiling and optimization
Profiling and optimization
 
06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptx
 
Write Python for Speed
Write Python for SpeedWrite Python for Speed
Write Python for Speed
 

More from PyData

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
PyData
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
PyData
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
PyData
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
PyData
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
PyData
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
PyData
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
PyData
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
PyData
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
PyData
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
PyData
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
PyData
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
PyData
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
PyData
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
PyData
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
PyData
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
PyData
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
PyData
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
PyData
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
PyData
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
PyData
 

More from PyData (20)

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
 

Recently uploaded

Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
TIPNGVN2
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 

Recently uploaded (20)

Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 

Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014

  • 1. Pythran – C++ for Snakes Static Compiler for High Performance Serge Guelton & Pierrick Brunet & Mehdi Amini PyData - May 4th 2014 Get the slides: http://goo.gl/6dgra0
  • 2. 1 33 Disclaimer Timings were performed using OSX 10.9, an i7-3740QM CPU @ 2.70GHz, Python 2.7.6, current Pythran (branch Pydata2014), pypy 2.2.1, numba 0.13.1, gcc 4.8, Clang r207887. I am not Pythonista, but I’m interested in performance in general. Daily job: driver-level C code, assembly, multi-threaded C++, GPU, ...
  • 3. 1 33 Disclaimer Timings were performed using OSX 10.9, an i7-3740QM CPU @ 2.70GHz, Python 2.7.6, current Pythran (branch Pydata2014), pypy 2.2.1, numba 0.13.1, gcc 4.8, Clang r207887. I am not Pythonista, but I’m interested in performance in general. Daily job: driver-level C code, assembly, multi-threaded C++, GPU, ... Note: I love to troll, so let’s have a beer later and talk about how awful Python is! ;-)
  • 4. 1 33 Disclaimer Timings were performed using OSX 10.9, an i7-3740QM CPU @ 2.70GHz, Python 2.7.6, current Pythran (branch Pydata2014), pypy 2.2.1, numba 0.13.1, gcc 4.8, Clang r207887. I am not Pythonista, but I’m interested in performance in general. Daily job: driver-level C code, assembly, multi-threaded C++, GPU, ... Note: I love to troll, so let’s have a beer later and talk about how awful Python is! ;-) By the way this talk is written in Latex and takes more than 10 seconds to compile.
  • 5. 2 33 Prototyping Tools for Scientific Computing
  • 6. 2 33 Prototyping Tools for Scientific Computing
  • 7. 3 33 Tools for Scientific Computing in Python FORTRAN + f2py C + SWIG
  • 8. 4 33 I Do not Know Much About Python But it Does not Matter Because... I only care about the white spot.
  • 9. 5 33 Regular IPython Session >>> import numpy as np >>> def rosen(x): ... return sum (100.*(x[1:]-x[: -1]**2.)**2. ... + (1-x[: -1])**2.) >>> import numpy as np >>> r = np.random.rand (100000) >>> %timeit rosen(r) 10 loops , best of 3: 35.1 ms per loop In mathematical optimization, the Rosenbrock function is a non-convex function used as a performance test problem for optimization algorithms introduced by Howard H. Rosenbrock in 1960.[1] It is also known as Rosenbrock’s valley or Rosenbrock’s banana function. (Wikipedia)
  • 10. 6 33 IPython Session with Pythran >>> %load_ext pythranmagic >>> %% pythran import numpy as np #pythran export rosen(float []) def rosen(x): return sum (100.*(x[1:]-x[: -1]**2.)**2. + (1-x[: -1])**2.) >>> import numpy as np >>> r = np.random.rand (100000) >>> %timeit rosen(r) 10000 loops , best of 3: 121 us per loop
  • 11. 6 33 IPython Session with Pythran >>> %load_ext pythranmagic >>> %% pythran import numpy as np #pythran export rosen(float []) def rosen(x): return sum (100.*(x[1:]-x[: -1]**2.)**2. + (1-x[: -1])**2.) >>> import numpy as np >>> r = np.random.rand (100000) >>> %timeit rosen(r) 10000 loops , best of 3: 121 us per loop That’s a ×290 speedup!
  • 12. 7 33 Pythran’s Meta Program Generator Python Module [.py]OpenMP Pythran C++ meta-programboost::python pythonic++ g++Type Info NT2 Native Module [.so]
  • 13. 8 33 Pythran Moto Goals 1. Performance First sacrifice language feature 2. Subset of Python backward compatibility matters pythran ≈ python - useless stuff (i.e. class, eval, introspection, polymorphic variable) 3. Modular Compilation focus on numerical kernels Means 1. Static Compilation 2. Static Optimizations 3. Vectorization & Parallelization
  • 14. 9 33 A Story of Python & C++ def dot(l0 , l1): return sum(x*y for x,y in zip(l0 ,l1)) template <class T0 , class T1 > auto dot(T0&& l0 , T1&& l1) -> decltype(/* ... */) { return pythonic ::sum( pythonic ::map( operator_ :: multiply (), pythonic ::zip(std:: forward <T0 >(l0), std:: forward <T1 >(l1)) ) ); }
  • 15. 10 33 C++ as a Back-end A High Level Language Rich Standard Template Library Object Oriented Programming Meta-Programming Glimpses of type inference C++11 Variadic Templates C++11 A Low Level Language Few Hidden Costs: ”you don’t pay for what you don’t use” Direct access to vector instruction units SSE, AVX, . . . Good for Parallel Programming OpenMP, TBB, . . . (and the parallel STL is proposed for C++17)
  • 16. 11 33 Let’s Dive Into the Backend Runtime... template <class Op , class S0 , class ... Iters > auto map_aux (Op &op , S0 const &seq , Iters ... iters) -> sequence <decltype(op(* seq.begin (), *iters ...)) > { decltype(_map(op , seq , iters ...)) s; auto iter = std :: back_inserter (s); for(auto& iseq : seq) *iter ++= op(iseq , *iters ++...); return s; } template <class Op , class S0 , class ... SN > auto map ( Op op , S0 const &seq , SN const &... seqs ) -> decltype(_map( op , seq , seqs.begin ()...)) { return _map(op , seq , seqs.begin ()...); }
  • 17. 11 33 Let’s Dive Into the Backend Runtime... I’m Kidding! template <class Op , class S0 , class ... Iters > auto map_aux (Op &op , S0 const &seq , Iters ... iters) -> sequence <decltype(op(* seq.begin (), *iters ...)) > { decltype(_map(op , seq , iters ...)) s; auto iter = std :: back_inserter (s); for(auto& iseq : seq) *iter ++= op(iseq , *iters ++...); return s; } template <class Op , class S0 , class ... SN > auto map ( Op op , S0 const &seq , SN const &... seqs ) -> decltype(_map( op , seq , seqs.begin ()...)) { return _map(op , seq , seqs.begin ()...); }
  • 18. 12 33 Static Compilation Buys time for many time-consuming analyses Points-to, used-def, variable scope, memory effects, function purity. . . Unleashes powerful C++ optimizations Lazy Evaluation, Map Parallelizations, Constant Folding Requires static type inference #pythran export foo(int list , float) Only annotate exported functions!
  • 19. 13 33 Pythran’s Moto 1/3 Gather as many information as possible (Typing is just one information among others)
  • 20. 14 33 Example of Analysis: Points-to def foo(a,b): c = a or b return c*2 Where does c points to ?
  • 21. 15 33 Example of Analysis: Argument Effects def fib(n): return n if n<2 else fib(n-1) + fib(n-2) def bar(l): return map(fib , l) def foo(l): return map(fib , random.sample(l, 3)) Do fibo, bar and foo update their arguments?
  • 22. 16 33 Example of Analysis: Pure Functions def f0(a): return a**2 def f1(a): b = f0(a) print b return b l = list (...) map(f0 , l) map(f1 , l) Are f0 and f1 pure functions?
  • 23. 17 33 Example of Analysis: Use - Def Chains a = ’1’ if cond: a = int(a) else: a = 3 print a a = 4 Which version of a is seen by the print statement?
  • 24. 18 33 Pythran’s Moto 2/3 Gather as many information as possible Turn them into Code Optimizations!
  • 25. 19 33 Example of Code Optimization: False Polymorphism a = cos (1) if cond: a = str(a) else: a = None foo(a) Is this code snippet statically typable?
  • 26. 20 33 Example of Code Optimization: Lazy Iterators def valid_conversion (n): l = map(math.cos , range(n)) return sum(l) def invalid_conversion (n): l = map(math.cos , range(n)) l[0] = 1 return sum(l) + max(l) Which map can be converted into an imap
  • 27. 21 33 Example of Code Optimization: Constant Folding def esieve(n): candidates = range (2, n+1) return sorted( set(candidates) - set(p*i for p in candidates for i in range(p, n+1)) ) cache = esieve (100) Can we evalute esieve at compile time?
  • 28. 22 33 Pythran’s Moto 3/3 Gather as many information as possible Turn them into Code Optimizations Vectorize! Parallelize!
  • 29. 23 33 Explicit Parallelization def hyantes(xmin , ymin , xmax , ymax , step , range_ , range_x , range_y , t): pt = [[0]* range_y for _ in range(range_x )] #omp parallel for for i in xrange(range_x ): for j in xrange(range_y ): s = 0 for k in t: tmp = 6368.* math.acos(math.cos(xmin+step*i)* math.cos( k[0] ) * math.cos(( ymin+step*j)-k[1]) + math.sin(xmin+step*i)* math.sin(k[0])) if tmp < range_: s+=k[2] / (1+ tmp) pt[i][j] = s return pt Tool CPython Pythran OpenMP Timing 639.0ms 44.8ms 11.2ms Speedup ×1 ×14.2 ×57.7
  • 30. 24 33 Library Level Optimizations Numpy is the key Basic block of Python Scientific Computing High-level Array Manipulations Many common functions implemented 1. This smells like FORTRAN 2. For the compiler guy, FORTRAN smells good 3. Unlock vectorization & parallelization of Numpy code! Cython is known to be ”as fast as C”, the only way we found to beat it is to be ”as fast as Fortran”, hence: Pythran
  • 31. 25 33 Efficient Numpy Expressions Expression Templates 1. A classic C++ meta-programming optimization 2. Brings Lazy Evaluation to C++ 3. Equivalent to loop fusion More Optimizations vectorization through boost::simd and nt2 parallelization through #pragma omp
  • 32. 26 33 Julia Set, a Cython Example def run_julia(cr , ci , N, bound , lim , cutoff ): julia = np.empty ((N, N), np.uint32) grid_x = np.linspace(-bound , bound , N) t0 = time () #omp parallel for for i, x in enumerate(grid_x ): for j, y in enumerate(grid_x ): julia[i,j] = kernel(x, y, cr , ci , lim , cutoff) return julia , time () - t0 From Scipy2013 Cython Tutorial. Tool CPython Cython Pythran +OpenMP Timing 3630ms 4.3ms 3.71ms 1.52ms Speedup ×1 ×837 ×970 ×2368
  • 33. 27 33 Mandelbrot, a Numba example @autojit def mandel(x, y, max_iters ): """ Given the real and imaginary parts of a complex number , determine if it is a candidate for membership in the Mandelbrot set given a fixed number of iterations. """ i = 0 c = complex(x, y) z = 0.0j for i in range(max_iters ): z = z**2 + c if abs(z)**2 >= 4: return i return 255 @autojit def create_fractal (min_x , max_x , min_y , max_y , image , iters ) height = image.shape [0] width = image.shape [1] pixel_size_x = (max_x - min_x) / width pixel_size_y = (max_y - min_y) / height for x in range(width ): real = min_x + x * pixel_size_x for y in range(height ): imag = min_y + y * pixel_size_y color = mandel(real , imag , iters) image[y, x] = color return image Tool CPython Numba Pythran Timing 8170ms 56ms 47.2ms Speedup ×1 ×145 ×173 ”g++ -Ofast” here
  • 34. 28 33 Nqueens, to Show Some Cool Python # Pure -Python implementation of # itertools. permutations () # Why? Because we can :-) def permutations (iterable , r=None ): pool = tuple(iterable) n = len(pool) if r is None: r = n indices = range(n) cycles = range(n-r+1, n+1)[:: -1] yield tuple(pool[i] for i in indices [:r]) while n: for i in reversed(xrange(r)): cycles[i] -= 1 if cycles[i] == 0: indices[i:] = indices[i+1:] + indices[i:i+1] cycles[i] = n - i else: j = cycles[i] indices[i], indices[-j] = indices[-j], indices[i] yield tuple(pool[i] for i in indices [:r]) break else: return #pythran export n_queens(int) def n_queens( queen_count ): """N-Queens solver. Args: queen_count: the number of queens to solve for. This is also the board size. Yields: Solutions to the problem. Each yielded value is looks like (3, 8, 2, ..., 6) where each number is the column position for the queen , and the index into the tuple indicates the row. """ out =list () cols = range( queen_count ) for vec in permutations (cols ,None ): if ( queen_count == len(set(vec[i]+i for i in cols ))): out.append(vec) return out Solving the NQueen problem, using generator, generator expression, list comprehension, sets. . . http://code.google.com/p/unladen-swallow/ Tool CPython PyPy Pythran Timing 2640.6ms 501.1ms 693.3ms Speedup ×1 ×5.27 ×3.8
  • 35. 29 33 Numpy Benchmarks https://github.com/serge-sans-paille/numpy-benchmarks/ Made with Matplotlib last night! Debian clang version 3.5-1 exp1, x86-64, i7-3520M CPU @ 2.90GHz. No OpenMP/Vectorization for Pythran.
  • 36. 30 33 Crushing the Head of the Snake I would have titled it ”What every Pythonista should know about Python!” (hopefully we’ll get the video soon).
  • 37. 31 33 The Compiler Is not the Solution to Keyboard-Chair Interface, Is It? def total(arr): s = 0 for j in range(len(arr )): s += arr[j] return s def varsum(arr): vs = 0 for j in range(len(arr )): mean = (total(arr) / len(arr )) vs += (arr[j] - mean) ** 2 return vs #pythran export stddev(float64 list list) def stddev(partitions ): ddof =1 final = 0.0 for part in partitions: m = total(part) / len(part) # Find the mean of the entire grup. gtotal = total ([ total(p) for p in partitions ]) glength = total ([ len(p) for p in partitions ]) g = gtotal / glength adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m ** final += varsum(part) + adj return math.sqrt(final / (glength - ddof )) Version Awful Less Awful OK Differently OK OK Numpy CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms Pythran
  • 38. 31 33 The Compiler Is not the Solution to Keyboard-Chair Interface, Is It? def total(arr): s = 0 for j in range(len(arr )): s += arr[j] return s def varsum(arr): vs = 0 for j in range(len(arr )): mean = (total(arr) / len(arr )) vs += (arr[j] - mean) ** 2 return vs #pythran export stddev(float64 list list) def stddev(partitions ): ddof =1 final = 0.0 for part in partitions: m = total(part) / len(part) # Find the mean of the entire grup. gtotal = total ([ total(p) for p in partitions ]) glength = total ([ len(p) for p in partitions ]) g = gtotal / glength adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m ** final += varsum(part) + adj return math.sqrt(final / (glength - ddof )) Version Awful Less Awful OK Differently OK OK Numpy CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms Pythran 1.38s
  • 39. 31 33 The Compiler Is not the Solution to Keyboard-Chair Interface, Is It? def total(arr): s = 0 for j in range(len(arr )): s += arr[j] return s def varsum(arr): vs = 0 for j in range(len(arr )): mean = (total(arr) / len(arr )) vs += (arr[j] - mean) ** 2 return vs #pythran export stddev(float64 list list) def stddev(partitions ): ddof =1 final = 0.0 for part in partitions: m = total(part) / len(part) # Find the mean of the entire grup. gtotal = total ([ total(p) for p in partitions ]) glength = total ([ len(p) for p in partitions ]) g = gtotal / glength adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m ** final += varsum(part) + adj return math.sqrt(final / (glength - ddof )) Version Awful Less Awful OK Differently OK OK Numpy CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms Pythran 1.38s 4.7ms
  • 40. 31 33 The Compiler Is not the Solution to Keyboard-Chair Interface, Is It? def total(arr): s = 0 for j in range(len(arr )): s += arr[j] return s def varsum(arr): vs = 0 for j in range(len(arr )): mean = (total(arr) / len(arr )) vs += (arr[j] - mean) ** 2 return vs #pythran export stddev(float64 list list) def stddev(partitions ): ddof =1 final = 0.0 for part in partitions: m = total(part) / len(part) # Find the mean of the entire grup. gtotal = total ([ total(p) for p in partitions ]) glength = total ([ len(p) for p in partitions ]) g = gtotal / glength adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m ** final += varsum(part) + adj return math.sqrt(final / (glength - ddof )) Version Awful Less Awful OK Differently OK OK Numpy CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms Pythran 1.38s 4.7ms 4.8ms
  • 41. 31 33 The Compiler Is not the Solution to Keyboard-Chair Interface, Is It? def total(arr): s = 0 for j in range(len(arr )): s += arr[j] return s def varsum(arr): vs = 0 for j in range(len(arr )): mean = (total(arr) / len(arr )) vs += (arr[j] - mean) ** 2 return vs #pythran export stddev(float64 list list) def stddev(partitions ): ddof =1 final = 0.0 for part in partitions: m = total(part) / len(part) # Find the mean of the entire grup. gtotal = total ([ total(p) for p in partitions ]) glength = total ([ len(p) for p in partitions ]) g = gtotal / glength adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m ** final += varsum(part) + adj return math.sqrt(final / (glength - ddof )) Version Awful Less Awful OK Differently OK OK Numpy CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms Pythran 1.38s 4.7ms 4.8ms 5.8ms
  • 42. 31 33 The Compiler Is not the Solution to Keyboard-Chair Interface, Is It? def total(arr): s = 0 for j in range(len(arr )): s += arr[j] return s def varsum(arr): vs = 0 for j in range(len(arr )): mean = (total(arr) / len(arr )) vs += (arr[j] - mean) ** 2 return vs #pythran export stddev(float64 list list) def stddev(partitions ): ddof =1 final = 0.0 for part in partitions: m = total(part) / len(part) # Find the mean of the entire grup. gtotal = total ([ total(p) for p in partitions ]) glength = total ([ len(p) for p in partitions ]) g = gtotal / glength adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m ** final += varsum(part) + adj return math.sqrt(final / (glength - ddof )) Version Awful Less Awful OK Differently OK OK Numpy CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms Pythran 1.38s 4.7ms 4.8ms 5.8ms 4.7ms
  • 43. 31 33 The Compiler Is not the Solution to Keyboard-Chair Interface, Is It? def total(arr): s = 0 for j in range(len(arr )): s += arr[j] return s def varsum(arr): vs = 0 for j in range(len(arr )): mean = (total(arr) / len(arr )) vs += (arr[j] - mean) ** 2 return vs #pythran export stddev(float64 list list) def stddev(partitions ): ddof =1 final = 0.0 for part in partitions: m = total(part) / len(part) # Find the mean of the entire grup. gtotal = total ([ total(p) for p in partitions ]) glength = total ([ len(p) for p in partitions ]) g = gtotal / glength adj = ((2 * total(part) * (m - g)) + ((g ** 2 - m ** final += varsum(part) + adj return math.sqrt(final / (glength - ddof )) Version Awful Less Awful OK Differently OK OK Numpy CPython 127s 150ms 54.3ms 53.8ms 47.6ms 8.2ms Pythran 1.38s 4.7ms 4.8ms 5.8ms 4.7ms 1.7ms
  • 44. 32 33 Engineering Stuff Get It Follow the source: https://github.com/serge-sans-paille/pythran (we are waiting for your pull requests) Debian repo: deb http://ridee.enstb.org/debian unstable main Join us: #pythran on Freenode (very active), or pythran@freelists.org (not very active) Available on PyPI, using Python 2.7, +2000 test cases, PEP8 approved, clang++ & g++ (>= 4.8) friendly, Linux and OSX validated.
  • 45. 33 Conclusion Compiling Python means more than typing and translating http://pythonhosted.org/pythran/
  • 46. 33 Conclusion Compiling Python means more than typing and translating What next: Release the PyData version http://pythonhosted.org/pythran/
  • 47. 33 Conclusion Compiling Python means more than typing and translating What next: Release the PyData version (ooops we’re late) http://pythonhosted.org/pythran/
  • 48. 33 Conclusion Compiling Python means more than typing and translating What next: Release the PyData version (ooops we’re late I did not specify which year) User module import (pull request already issued) http://pythonhosted.org/pythran/
  • 49. 33 Conclusion Compiling Python means more than typing and translating What next: Release the PyData version (ooops we’re late I did not specify which year) User module import (pull request already issued) Set-up an daily performance regression test bot (ongoing work with codespeed) http://pythonhosted.org/pythran/
  • 50. 33 Conclusion Compiling Python means more than typing and translating What next: Release the PyData version (ooops we’re late I did not specify which year) User module import (pull request already issued) Set-up an daily performance regression test bot (ongoing work with codespeed) Re-enable vectorization through boost::simd and nt2 (soon) http://pythonhosted.org/pythran/
  • 51. 33 Conclusion Compiling Python means more than typing and translating What next: Release the PyData version (ooops we’re late I did not specify which year) User module import (pull request already issued) Set-up an daily performance regression test bot (ongoing work with codespeed) Re-enable vectorization through boost::simd and nt2 (soon) More Numpy support, start looking into Scipy http://pythonhosted.org/pythran/
  • 52. 33 Conclusion Compiling Python means more than typing and translating What next: Release the PyData version (ooops we’re late I did not specify which year) User module import (pull request already issued) Set-up an daily performance regression test bot (ongoing work with codespeed) Re-enable vectorization through boost::simd and nt2 (soon) More Numpy support, start looking into Scipy Polyhedral transformations (in another life...) http://pythonhosted.org/pythran/