SlideShare a Scribd company logo
Building SciPy kernels with Pythran
Serge Guelton - Pythran main dev, Namek
Ralf Gommers - SciPy maintainer, Quansight Labs
14 July 2021
SciPy Pythran
Why we embarked on this journey
SciPy contains a lot of algorithmic code. It needs to be fast.
Existing approaches in SciPy were:
● Python: for the glue, and non critical parts
● Cython: for critical parts
● Fortran 77: for very old critical parts
● C & C++: for ultra critical parts :-)
Our goal: make it easier to write fast SciPy kernels!
The Pythran approach
Keep input code portable and high-level:
● takes pure Python code as input
● understands NumPy high-level constructs
● delivers performance by transpiling to C++
But still:
● efficient explicit looping in Python
● without any runtime dependencies
A typical Pythran kernel for SciPy
#pythran export _max_len_seq_inner(intp[], int8[], int, int, int8[])
def _max_len_seq_inner(taps, state, nbits, length, seq):
n_taps = taps.shape[0]
idx = 0
for i in range(length):
feedback = state[idx]
seq[i] = feedback
for ti in range(n_taps):
feedback ^= state[(taps[ti] + idx) % nbits]
state[idx] = feedback
idx = (idx + 1) % nbits
return np.roll(state, -idx, axis=0)
Understands NumPy function calls
⇖
Explicit indexing
Explicit looping
⇐
⇖
⇙
Works in a Jupyter notebook
Easy build system integration
from distutils.core import setup
from pythran.dist import PythranExtension, PythranBuildExt
setup(...,
ext_modules=[PythranExtension("mymodule", ["mymodule.py"])],
cmdclass={"build_ext": PythranBuildExt})
Or precompile to C++ to use with any build system:
$ pythran -E mykernel.py -o mykernel.cpp
Isn’t Cython enough?
Cython is a great tool
● incremental conversion / mixed mode
● great for gluing existing native code/library with Python
● good portability, no runtime requirements
However, keeping in mind our “easier to write” goal:
● still has a non-negligible learning curve
● tends to be closer to C than Python when performance matters
Then what about Numba?
Numba is a great tool
● Just-in-Time compilation
● GPU support
● pure Python syntax
However:
● It has more runtime dependencies
● tends to require lower-level programming for best performance
Comparing Cython, Numba & Pythran
Cython Numba Pythran
Portability ++ + ++
Runtime dependency ++ -- ++
Maturity ++ + +
Maintenance status 0 + +
Features ++ + 0
Ease of use -- ++ +
Debugging & optimization 0 + 0
Size of binaries - ++ +
For all tools: performance excellent, bus factor is ~ 1-2
When do I use which tool?
Our advice:
● For higher-level, pure Python packages: use Numba
● If you have any compiled code in your package:
○ Use Pythran for standalone kernels
○ Use Cython for binding C/C++ code, and in case you need to
interact with the Python or NumPy C API
SciPy build/run-time dependencies
Current Pythran usage in SciPy
One large extension: RBFInterpolator
Several smaller extensions:
$ git grep -l '#pythran'
scipy/optimize/_group_columns.py
scipy/signal/_max_len_seq_inner.py
scipy/signal/_spectral.py
scipy/stats/_hypotests_pythran.py
More PRs in progress
GSoC student - Xingyu Liu
Xingyu is going through SciPy's code base, looking for kernels to
benchmark and accelerate:
● stats.binned_statistic_dd: 2-30x speedup
● stats.somersd: 4-20x speedup
● spatial.SphericalVoronoi.sort_vertices_of_regions: 3x speedup
With more to come; read the blog of her journey at
https://blogs.python-gsoc.org/en/xingyu-lius-blog/
Benefits for SciPy
Key benefit: easiest way to write fast kernels
● Developer experience about as good as with Numba, accessible
to almost every contributor
● It's fast - typically >= Cython, even without SIMD
● Produced binaries are much smaller than those from Cython
● Pythran itself is easy to contribute to, and has a responsive
maintainer
● Build system integration is easy(-ish)
Pythran limitations
There are still gaps in functionality, not all of NumPy is covered:
● numpy.random
● APIs with too much “dynamic behaviour” (e.g., keepdims keyword)
● There is no escape hatch - if it’s not supported, it must be added
to Pythran itself first
● No threading in SciPy. Pythran can use OpenMP, but this is
forbidden in SciPy (only custom thread pools allowed).
● Extra constraint on Windows: must build with clang-cl
Integration status
Currently Pythran is:
● enabled by default in the SciPy build
● Still an optional dependency (to disable: export SCIPY_USE_PYTHRAN=0)
Lessons from the recent SciPy 1.7.0 release:
● A small portability issue on AIX (already resolved)
● Status with PyPy unclear (PyPy + SciPy has other issues, so can’t test)
● Other than that, mostly smooth sailing
Initial integration required two Pythran releases to fix some build issues
Conclusions
SciPy contributors like Pythran! “This is very elegant”,“Surprised it’s that fast”
Initial goal achieved! Pythran is indeed an easier way to write fast kernels
The journey continues! Pythran will likely become a hard build-time
dependency for or after SciPy 1.8.0
Bonus question: can we combine Pythran with CuPy's Python-to-CUDA JIT?
It emits C++ code too, so we could get fast CPU + GPU code like that.

More Related Content

What's hot

Buzzwords Numba Presentation
Buzzwords Numba PresentationBuzzwords Numba Presentation
Buzzwords Numba Presentation
kammeyer
 
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
Jeff Squyres
 
OpenDataPlane Testing in Travis
OpenDataPlane Testing in TravisOpenDataPlane Testing in Travis
OpenDataPlane Testing in Travis
Dmitry Baryshkov
 
PyQt Application Development On Maemo
PyQt Application Development On MaemoPyQt Application Development On Maemo
PyQt Application Development On Maemo
achipa
 
Multicore
MulticoreMulticore
Python: Thanks for the memories
Python: Thanks for the memoriesPython: Thanks for the memories
Python: Thanks for the memories
Danil Ineev
 
MPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterMPI Raspberry pi 3 cluster
MPI Raspberry pi 3 cluster
Arafat Hussain
 
Numba
NumbaNumba
GPU Computing for Data Science
GPU Computing for Data Science GPU Computing for Data Science
GPU Computing for Data Science
Domino Data Lab
 
More Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit JunoMore Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit Juno
Kota Tsuyuzaki
 
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under LinuxPractical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Samsung Open Source Group
 
Everything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI LibraryEverything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI Library
Intel® Software
 
Python indroduction
Python indroductionPython indroduction
Python indroduction
FEG
 
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
Igalia
 
Why Python In Entertainment Industry?
Why Python In Entertainment Industry?Why Python In Entertainment Industry?
Why Python In Entertainment Industry?
Shuen-Huei Guan
 
Linaro HPC Workshop Note
Linaro HPC Workshop NoteLinaro HPC Workshop Note
Linaro HPC Workshop Note
Linaro
 
Lua vs python
Lua vs pythonLua vs python
Lua vs python
HoChul Shin
 
HKG15-110: ODP Project Update
HKG15-110: ODP Project UpdateHKG15-110: ODP Project Update
HKG15-110: ODP Project Update
Linaro
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
Travis Oliphant
 
Tensorflow on Android
Tensorflow on AndroidTensorflow on Android
Tensorflow on Android
Koan-Sin Tan
 

What's hot (20)

Buzzwords Numba Presentation
Buzzwords Numba PresentationBuzzwords Numba Presentation
Buzzwords Numba Presentation
 
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
 
OpenDataPlane Testing in Travis
OpenDataPlane Testing in TravisOpenDataPlane Testing in Travis
OpenDataPlane Testing in Travis
 
PyQt Application Development On Maemo
PyQt Application Development On MaemoPyQt Application Development On Maemo
PyQt Application Development On Maemo
 
Multicore
MulticoreMulticore
Multicore
 
Python: Thanks for the memories
Python: Thanks for the memoriesPython: Thanks for the memories
Python: Thanks for the memories
 
MPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterMPI Raspberry pi 3 cluster
MPI Raspberry pi 3 cluster
 
Numba
NumbaNumba
Numba
 
GPU Computing for Data Science
GPU Computing for Data Science GPU Computing for Data Science
GPU Computing for Data Science
 
More Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit JunoMore Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit Juno
 
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under LinuxPractical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
 
Everything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI LibraryEverything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI Library
 
Python indroduction
Python indroductionPython indroduction
Python indroduction
 
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
 
Why Python In Entertainment Industry?
Why Python In Entertainment Industry?Why Python In Entertainment Industry?
Why Python In Entertainment Industry?
 
Linaro HPC Workshop Note
Linaro HPC Workshop NoteLinaro HPC Workshop Note
Linaro HPC Workshop Note
 
Lua vs python
Lua vs pythonLua vs python
Lua vs python
 
HKG15-110: ODP Project Update
HKG15-110: ODP Project UpdateHKG15-110: ODP Project Update
HKG15-110: ODP Project Update
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
Tensorflow on Android
Tensorflow on AndroidTensorflow on Android
Tensorflow on Android
 

Similar to Building SciPy kernels with Pythran

PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python Extensions
Henry Schreiner
 
MicroPython&electronics prezentācija
MicroPython&electronics prezentācija MicroPython&electronics prezentācija
MicroPython&electronics prezentācija
CRImier
 
Python 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdfPython 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdf
Hendri Karisma
 
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
Tsundere Chen
 
Python on pi
Python on piPython on pi
Python on pi
swee meng ng
 
kpatch.kgraft
kpatch.kgraftkpatch.kgraft
kpatch.kgraft
Udo Seidel
 
Pythonic doesn't mean slow!
Pythonic doesn't mean slow!Pythonic doesn't mean slow!
Pythonic doesn't mean slow!
Ronan Lamy
 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And Python
Jen Aman
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
Ian Ozsvald
 
Rejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform GainRejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform Gain
Łukasz Piątkowski
 
Programming with Python - Basic
Programming with Python - BasicProgramming with Python - Basic
Programming with Python - Basic
Mosky Liu
 
Why learn python in 2017?
Why learn python in 2017?Why learn python in 2017?
Why learn python in 2017?
Karolis Ramanauskas
 
Numba lightning
Numba lightningNumba lightning
Numba lightning
Travis Oliphant
 
Python Linters at Scale.pdf
Python Linters at Scale.pdfPython Linters at Scale.pdf
Python Linters at Scale.pdf
Jimmy Lai
 
An Introduction to PyPy
An Introduction to PyPyAn Introduction to PyPy
An Introduction to PyPy
Michael Hudson-Doyle
 
PySide
PySidePySide
PySide
OpenBossa
 
My "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails ProjectsMy "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails Projects
GR8Conf
 
Rusty Python
Rusty PythonRusty Python
Rusty Python
RangHo Lee
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
Travis Oliphant
 
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
Tom Paseka
 

Similar to Building SciPy kernels with Pythran (20)

PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python Extensions
 
MicroPython&electronics prezentācija
MicroPython&electronics prezentācija MicroPython&electronics prezentācija
MicroPython&electronics prezentācija
 
Python 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdfPython 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdf
 
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
 
Python on pi
Python on piPython on pi
Python on pi
 
kpatch.kgraft
kpatch.kgraftkpatch.kgraft
kpatch.kgraft
 
Pythonic doesn't mean slow!
Pythonic doesn't mean slow!Pythonic doesn't mean slow!
Pythonic doesn't mean slow!
 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And Python
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
 
Rejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform GainRejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform Gain
 
Programming with Python - Basic
Programming with Python - BasicProgramming with Python - Basic
Programming with Python - Basic
 
Why learn python in 2017?
Why learn python in 2017?Why learn python in 2017?
Why learn python in 2017?
 
Numba lightning
Numba lightningNumba lightning
Numba lightning
 
Python Linters at Scale.pdf
Python Linters at Scale.pdfPython Linters at Scale.pdf
Python Linters at Scale.pdf
 
An Introduction to PyPy
An Introduction to PyPyAn Introduction to PyPy
An Introduction to PyPy
 
PySide
PySidePySide
PySide
 
My "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails ProjectsMy "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails Projects
 
Rusty Python
Rusty PythonRusty Python
Rusty Python
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
 

More from Ralf Gommers

Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdf
Ralf Gommers
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based program
Ralf Gommers
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
Ralf Gommers
 
Python array API standardization - current state and benefits
Python array API standardization - current state and benefitsPython array API standardization - current state and benefits
Python array API standardization - current state and benefits
Ralf Gommers
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
Ralf Gommers
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond code
Ralf Gommers
 
PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019
Ralf Gommers
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decade
Ralf Gommers
 
The evolution of array computing in Python
The evolution of array computing in PythonThe evolution of array computing in Python
The evolution of array computing in Python
Ralf Gommers
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts
Ralf Gommers
 
NumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS ForumNumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS Forum
Ralf Gommers
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_session
Ralf Gommers
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and Code
Ralf Gommers
 

More from Ralf Gommers (13)

Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdf
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based program
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
 
Python array API standardization - current state and benefits
Python array API standardization - current state and benefitsPython array API standardization - current state and benefits
Python array API standardization - current state and benefits
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond code
 
PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decade
 
The evolution of array computing in Python
The evolution of array computing in PythonThe evolution of array computing in Python
The evolution of array computing in Python
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts
 
NumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS ForumNumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS Forum
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_session
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and Code
 

Recently uploaded

KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
rodomar2
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
Rakesh Kumar R
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptxLORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
lorraineandreiamcidl
 
What is Master Data Management by PiLog Group
What is Master Data Management by PiLog GroupWhat is Master Data Management by PiLog Group
What is Master Data Management by PiLog Group
aymanquadri279
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
Ayan Halder
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
DDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systemsDDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systems
Gerardo Pardo-Castellote
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
Green Software Development
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Envertis Software Solutions
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
Yara Milbes
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
mz5nrf0n
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
Peter Muessig
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Undress Baby
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
ICS
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 

Recently uploaded (20)

KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptxLORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
 
What is Master Data Management by PiLog Group
What is Master Data Management by PiLog GroupWhat is Master Data Management by PiLog Group
What is Master Data Management by PiLog Group
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
DDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systemsDDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systems
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 

Building SciPy kernels with Pythran

  • 1. Building SciPy kernels with Pythran Serge Guelton - Pythran main dev, Namek Ralf Gommers - SciPy maintainer, Quansight Labs 14 July 2021 SciPy Pythran
  • 2. Why we embarked on this journey SciPy contains a lot of algorithmic code. It needs to be fast. Existing approaches in SciPy were: ● Python: for the glue, and non critical parts ● Cython: for critical parts ● Fortran 77: for very old critical parts ● C & C++: for ultra critical parts :-) Our goal: make it easier to write fast SciPy kernels!
  • 3. The Pythran approach Keep input code portable and high-level: ● takes pure Python code as input ● understands NumPy high-level constructs ● delivers performance by transpiling to C++ But still: ● efficient explicit looping in Python ● without any runtime dependencies
  • 4. A typical Pythran kernel for SciPy #pythran export _max_len_seq_inner(intp[], int8[], int, int, int8[]) def _max_len_seq_inner(taps, state, nbits, length, seq): n_taps = taps.shape[0] idx = 0 for i in range(length): feedback = state[idx] seq[i] = feedback for ti in range(n_taps): feedback ^= state[(taps[ti] + idx) % nbits] state[idx] = feedback idx = (idx + 1) % nbits return np.roll(state, -idx, axis=0) Understands NumPy function calls ⇖ Explicit indexing Explicit looping ⇐ ⇖ ⇙
  • 5. Works in a Jupyter notebook
  • 6. Easy build system integration from distutils.core import setup from pythran.dist import PythranExtension, PythranBuildExt setup(..., ext_modules=[PythranExtension("mymodule", ["mymodule.py"])], cmdclass={"build_ext": PythranBuildExt}) Or precompile to C++ to use with any build system: $ pythran -E mykernel.py -o mykernel.cpp
  • 7. Isn’t Cython enough? Cython is a great tool ● incremental conversion / mixed mode ● great for gluing existing native code/library with Python ● good portability, no runtime requirements However, keeping in mind our “easier to write” goal: ● still has a non-negligible learning curve ● tends to be closer to C than Python when performance matters
  • 8. Then what about Numba? Numba is a great tool ● Just-in-Time compilation ● GPU support ● pure Python syntax However: ● It has more runtime dependencies ● tends to require lower-level programming for best performance
  • 9. Comparing Cython, Numba & Pythran Cython Numba Pythran Portability ++ + ++ Runtime dependency ++ -- ++ Maturity ++ + + Maintenance status 0 + + Features ++ + 0 Ease of use -- ++ + Debugging & optimization 0 + 0 Size of binaries - ++ + For all tools: performance excellent, bus factor is ~ 1-2
  • 10. When do I use which tool? Our advice: ● For higher-level, pure Python packages: use Numba ● If you have any compiled code in your package: ○ Use Pythran for standalone kernels ○ Use Cython for binding C/C++ code, and in case you need to interact with the Python or NumPy C API
  • 12. Current Pythran usage in SciPy One large extension: RBFInterpolator Several smaller extensions: $ git grep -l '#pythran' scipy/optimize/_group_columns.py scipy/signal/_max_len_seq_inner.py scipy/signal/_spectral.py scipy/stats/_hypotests_pythran.py More PRs in progress
  • 13. GSoC student - Xingyu Liu Xingyu is going through SciPy's code base, looking for kernels to benchmark and accelerate: ● stats.binned_statistic_dd: 2-30x speedup ● stats.somersd: 4-20x speedup ● spatial.SphericalVoronoi.sort_vertices_of_regions: 3x speedup With more to come; read the blog of her journey at https://blogs.python-gsoc.org/en/xingyu-lius-blog/
  • 14. Benefits for SciPy Key benefit: easiest way to write fast kernels ● Developer experience about as good as with Numba, accessible to almost every contributor ● It's fast - typically >= Cython, even without SIMD ● Produced binaries are much smaller than those from Cython ● Pythran itself is easy to contribute to, and has a responsive maintainer ● Build system integration is easy(-ish)
  • 15. Pythran limitations There are still gaps in functionality, not all of NumPy is covered: ● numpy.random ● APIs with too much “dynamic behaviour” (e.g., keepdims keyword) ● There is no escape hatch - if it’s not supported, it must be added to Pythran itself first ● No threading in SciPy. Pythran can use OpenMP, but this is forbidden in SciPy (only custom thread pools allowed). ● Extra constraint on Windows: must build with clang-cl
  • 16. Integration status Currently Pythran is: ● enabled by default in the SciPy build ● Still an optional dependency (to disable: export SCIPY_USE_PYTHRAN=0) Lessons from the recent SciPy 1.7.0 release: ● A small portability issue on AIX (already resolved) ● Status with PyPy unclear (PyPy + SciPy has other issues, so can’t test) ● Other than that, mostly smooth sailing Initial integration required two Pythran releases to fix some build issues
  • 17. Conclusions SciPy contributors like Pythran! “This is very elegant”,“Surprised it’s that fast” Initial goal achieved! Pythran is indeed an easier way to write fast kernels The journey continues! Pythran will likely become a hard build-time dependency for or after SciPy 1.8.0 Bonus question: can we combine Pythran with CuPy's Python-to-CUDA JIT? It emits C++ code too, so we could get fast CPU + GPU code like that.