SlideShare a Scribd company logo
1 of 17
Download to read offline
Building SciPy kernels with Pythran
Serge Guelton - Pythran main dev, Namek
Ralf Gommers - SciPy maintainer, Quansight Labs
14 July 2021
SciPy Pythran
Why we embarked on this journey
SciPy contains a lot of algorithmic code. It needs to be fast.
Existing approaches in SciPy were:
● Python: for the glue, and non critical parts
● Cython: for critical parts
● Fortran 77: for very old critical parts
● C & C++: for ultra critical parts :-)
Our goal: make it easier to write fast SciPy kernels!
The Pythran approach
Keep input code portable and high-level:
● takes pure Python code as input
● understands NumPy high-level constructs
● delivers performance by transpiling to C++
But still:
● efficient explicit looping in Python
● without any runtime dependencies
A typical Pythran kernel for SciPy
#pythran export _max_len_seq_inner(intp[], int8[], int, int, int8[])
def _max_len_seq_inner(taps, state, nbits, length, seq):
n_taps = taps.shape[0]
idx = 0
for i in range(length):
feedback = state[idx]
seq[i] = feedback
for ti in range(n_taps):
feedback ^= state[(taps[ti] + idx) % nbits]
state[idx] = feedback
idx = (idx + 1) % nbits
return np.roll(state, -idx, axis=0)
Understands NumPy function calls
⇖
Explicit indexing
Explicit looping
⇐
⇖
⇙
Works in a Jupyter notebook
Easy build system integration
from distutils.core import setup
from pythran.dist import PythranExtension, PythranBuildExt
setup(...,
ext_modules=[PythranExtension("mymodule", ["mymodule.py"])],
cmdclass={"build_ext": PythranBuildExt})
Or precompile to C++ to use with any build system:
$ pythran -E mykernel.py -o mykernel.cpp
Isn’t Cython enough?
Cython is a great tool
● incremental conversion / mixed mode
● great for gluing existing native code/library with Python
● good portability, no runtime requirements
However, keeping in mind our “easier to write” goal:
● still has a non-negligible learning curve
● tends to be closer to C than Python when performance matters
Then what about Numba?
Numba is a great tool
● Just-in-Time compilation
● GPU support
● pure Python syntax
However:
● It has more runtime dependencies
● tends to require lower-level programming for best performance
Comparing Cython, Numba & Pythran
Cython Numba Pythran
Portability ++ + ++
Runtime dependency ++ -- ++
Maturity ++ + +
Maintenance status 0 + +
Features ++ + 0
Ease of use -- ++ +
Debugging & optimization 0 + 0
Size of binaries - ++ +
For all tools: performance excellent, bus factor is ~ 1-2
When do I use which tool?
Our advice:
● For higher-level, pure Python packages: use Numba
● If you have any compiled code in your package:
○ Use Pythran for standalone kernels
○ Use Cython for binding C/C++ code, and in case you need to
interact with the Python or NumPy C API
SciPy build/run-time dependencies
Current Pythran usage in SciPy
One large extension: RBFInterpolator
Several smaller extensions:
$ git grep -l '#pythran'
scipy/optimize/_group_columns.py
scipy/signal/_max_len_seq_inner.py
scipy/signal/_spectral.py
scipy/stats/_hypotests_pythran.py
More PRs in progress
GSoC student - Xingyu Liu
Xingyu is going through SciPy's code base, looking for kernels to
benchmark and accelerate:
● stats.binned_statistic_dd: 2-30x speedup
● stats.somersd: 4-20x speedup
● spatial.SphericalVoronoi.sort_vertices_of_regions: 3x speedup
With more to come; read the blog of her journey at
https://blogs.python-gsoc.org/en/xingyu-lius-blog/
Benefits for SciPy
Key benefit: easiest way to write fast kernels
● Developer experience about as good as with Numba, accessible
to almost every contributor
● It's fast - typically >= Cython, even without SIMD
● Produced binaries are much smaller than those from Cython
● Pythran itself is easy to contribute to, and has a responsive
maintainer
● Build system integration is easy(-ish)
Pythran limitations
There are still gaps in functionality, not all of NumPy is covered:
● numpy.random
● APIs with too much “dynamic behaviour” (e.g., keepdims keyword)
● There is no escape hatch - if it’s not supported, it must be added
to Pythran itself first
● No threading in SciPy. Pythran can use OpenMP, but this is
forbidden in SciPy (only custom thread pools allowed).
● Extra constraint on Windows: must build with clang-cl
Integration status
Currently Pythran is:
● enabled by default in the SciPy build
● Still an optional dependency (to disable: export SCIPY_USE_PYTHRAN=0)
Lessons from the recent SciPy 1.7.0 release:
● A small portability issue on AIX (already resolved)
● Status with PyPy unclear (PyPy + SciPy has other issues, so can’t test)
● Other than that, mostly smooth sailing
Initial integration required two Pythran releases to fix some build issues
Conclusions
SciPy contributors like Pythran! “This is very elegant”,“Surprised it’s that fast”
Initial goal achieved! Pythran is indeed an easier way to write fast kernels
The journey continues! Pythran will likely become a hard build-time
dependency for or after SciPy 1.8.0
Bonus question: can we combine Pythran with CuPy's Python-to-CUDA JIT?
It emits C++ code too, so we could get fast CPU + GPU code like that.

More Related Content

What's hot

Buzzwords Numba Presentation
Buzzwords Numba PresentationBuzzwords Numba Presentation
Buzzwords Numba Presentationkammeyer
 
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZEJeff Squyres
 
OpenDataPlane Testing in Travis
OpenDataPlane Testing in TravisOpenDataPlane Testing in Travis
OpenDataPlane Testing in TravisDmitry Baryshkov
 
PyQt Application Development On Maemo
PyQt Application Development On MaemoPyQt Application Development On Maemo
PyQt Application Development On Maemoachipa
 
Python: Thanks for the memories
Python: Thanks for the memoriesPython: Thanks for the memories
Python: Thanks for the memoriesDanil Ineev
 
MPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterMPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterArafat Hussain
 
GPU Computing for Data Science
GPU Computing for Data Science GPU Computing for Data Science
GPU Computing for Data Science Domino Data Lab
 
More Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit JunoMore Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit JunoKota Tsuyuzaki
 
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under LinuxPractical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under LinuxSamsung Open Source Group
 
Everything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI LibraryEverything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI LibraryIntel® Software
 
Python indroduction
Python indroductionPython indroduction
Python indroductionFEG
 
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)Igalia
 
Why Python In Entertainment Industry?
Why Python In Entertainment Industry?Why Python In Entertainment Industry?
Why Python In Entertainment Industry?Shuen-Huei Guan
 
Linaro HPC Workshop Note
Linaro HPC Workshop NoteLinaro HPC Workshop Note
Linaro HPC Workshop NoteLinaro
 
HKG15-110: ODP Project Update
HKG15-110: ODP Project UpdateHKG15-110: ODP Project Update
HKG15-110: ODP Project UpdateLinaro
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyTravis Oliphant
 
Tensorflow on Android
Tensorflow on AndroidTensorflow on Android
Tensorflow on AndroidKoan-Sin Tan
 

What's hot (20)

Buzzwords Numba Presentation
Buzzwords Numba PresentationBuzzwords Numba Presentation
Buzzwords Numba Presentation
 
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
 
OpenDataPlane Testing in Travis
OpenDataPlane Testing in TravisOpenDataPlane Testing in Travis
OpenDataPlane Testing in Travis
 
PyQt Application Development On Maemo
PyQt Application Development On MaemoPyQt Application Development On Maemo
PyQt Application Development On Maemo
 
Multicore
MulticoreMulticore
Multicore
 
Python: Thanks for the memories
Python: Thanks for the memoriesPython: Thanks for the memories
Python: Thanks for the memories
 
MPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterMPI Raspberry pi 3 cluster
MPI Raspberry pi 3 cluster
 
Numba
NumbaNumba
Numba
 
GPU Computing for Data Science
GPU Computing for Data Science GPU Computing for Data Science
GPU Computing for Data Science
 
More Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit JunoMore Efficient Object Replication in OpenStack Summit Juno
More Efficient Object Replication in OpenStack Summit Juno
 
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under LinuxPractical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
Practical Guide to Run an IEEE 802.15.4 Network with 6LoWPAN Under Linux
 
Everything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI LibraryEverything You Need to Know About the Intel® MPI Library
Everything You Need to Know About the Intel® MPI Library
 
Python indroduction
Python indroductionPython indroduction
Python indroduction
 
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
ARB_gl_spirv: bringing SPIR-V to Mesa OpenGL (FOSDEM 2018)
 
Why Python In Entertainment Industry?
Why Python In Entertainment Industry?Why Python In Entertainment Industry?
Why Python In Entertainment Industry?
 
Linaro HPC Workshop Note
Linaro HPC Workshop NoteLinaro HPC Workshop Note
Linaro HPC Workshop Note
 
Lua vs python
Lua vs pythonLua vs python
Lua vs python
 
HKG15-110: ODP Project Update
HKG15-110: ODP Project UpdateHKG15-110: ODP Project Update
HKG15-110: ODP Project Update
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
Tensorflow on Android
Tensorflow on AndroidTensorflow on Android
Tensorflow on Android
 

Similar to Building SciPy kernels with Pythran

PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsHenry Schreiner
 
MicroPython&electronics prezentācija
MicroPython&electronics prezentācija MicroPython&electronics prezentācija
MicroPython&electronics prezentācija CRImier
 
Python 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdfPython 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdfHendri Karisma
 
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...Tsundere Chen
 
Pythonic doesn't mean slow!
Pythonic doesn't mean slow!Pythonic doesn't mean slow!
Pythonic doesn't mean slow!Ronan Lamy
 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonJen Aman
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance PythonIan Ozsvald
 
Rejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform GainRejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform GainŁukasz Piątkowski
 
Programming with Python - Basic
Programming with Python - BasicProgramming with Python - Basic
Programming with Python - BasicMosky Liu
 
Python Linters at Scale.pdf
Python Linters at Scale.pdfPython Linters at Scale.pdf
Python Linters at Scale.pdfJimmy Lai
 
My "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails ProjectsMy "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails ProjectsGR8Conf
 
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?Tom Paseka
 

Similar to Building SciPy kernels with Pythran (20)

PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python Extensions
 
MicroPython&electronics prezentācija
MicroPython&electronics prezentācija MicroPython&electronics prezentācija
MicroPython&electronics prezentācija
 
Python 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdfPython 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdf
 
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
 
Python on pi
Python on piPython on pi
Python on pi
 
kpatch.kgraft
kpatch.kgraftkpatch.kgraft
kpatch.kgraft
 
Pythonic doesn't mean slow!
Pythonic doesn't mean slow!Pythonic doesn't mean slow!
Pythonic doesn't mean slow!
 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And Python
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
 
Rejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform GainRejekts 24 EU No GitOps Pain, No Platform Gain
Rejekts 24 EU No GitOps Pain, No Platform Gain
 
Programming with Python - Basic
Programming with Python - BasicProgramming with Python - Basic
Programming with Python - Basic
 
Why learn python in 2017?
Why learn python in 2017?Why learn python in 2017?
Why learn python in 2017?
 
Numba lightning
Numba lightningNumba lightning
Numba lightning
 
Python Linters at Scale.pdf
Python Linters at Scale.pdfPython Linters at Scale.pdf
Python Linters at Scale.pdf
 
An Introduction to PyPy
An Introduction to PyPyAn Introduction to PyPy
An Introduction to PyPy
 
PySide
PySidePySide
PySide
 
My "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails ProjectsMy "Perfect" Toolchain Setup for Grails Projects
My "Perfect" Toolchain Setup for Grails Projects
 
Rusty Python
Rusty PythonRusty Python
Rusty Python
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
HKNOG 6.0 Next Generation Networks - will automation put us out of jobs?
 

More from Ralf Gommers

Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfRalf Gommers
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based programRalf Gommers
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with PythonRalf Gommers
 
Python array API standardization - current state and benefits
Python array API standardization - current state and benefitsPython array API standardization - current state and benefits
Python array API standardization - current state and benefitsRalf Gommers
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonRalf Gommers
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeRalf Gommers
 
PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019Ralf Gommers
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeRalf Gommers
 
The evolution of array computing in Python
The evolution of array computing in PythonThe evolution of array computing in Python
The evolution of array computing in PythonRalf Gommers
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related conceptsRalf Gommers
 
NumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS ForumNumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS ForumRalf Gommers
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionRalf Gommers
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeRalf Gommers
 

More from Ralf Gommers (13)

Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdf
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based program
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
 
Python array API standardization - current state and benefits
Python array API standardization - current state and benefitsPython array API standardization - current state and benefits
Python array API standardization - current state and benefits
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond code
 
PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019PyData NYC whatsnew NumPy-SciPy 2019
PyData NYC whatsnew NumPy-SciPy 2019
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decade
 
The evolution of array computing in Python
The evolution of array computing in PythonThe evolution of array computing in Python
The evolution of array computing in Python
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts
 
NumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS ForumNumPy Roadmap presentation at NumFOCUS Forum
NumPy Roadmap presentation at NumFOCUS Forum
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_session
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and Code
 

Recently uploaded

SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?Alexandre Beguel
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptxVinzoCenzo
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 

Recently uploaded (20)

SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptx
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 

Building SciPy kernels with Pythran

  • 1. Building SciPy kernels with Pythran Serge Guelton - Pythran main dev, Namek Ralf Gommers - SciPy maintainer, Quansight Labs 14 July 2021 SciPy Pythran
  • 2. Why we embarked on this journey SciPy contains a lot of algorithmic code. It needs to be fast. Existing approaches in SciPy were: ● Python: for the glue, and non critical parts ● Cython: for critical parts ● Fortran 77: for very old critical parts ● C & C++: for ultra critical parts :-) Our goal: make it easier to write fast SciPy kernels!
  • 3. The Pythran approach Keep input code portable and high-level: ● takes pure Python code as input ● understands NumPy high-level constructs ● delivers performance by transpiling to C++ But still: ● efficient explicit looping in Python ● without any runtime dependencies
  • 4. A typical Pythran kernel for SciPy #pythran export _max_len_seq_inner(intp[], int8[], int, int, int8[]) def _max_len_seq_inner(taps, state, nbits, length, seq): n_taps = taps.shape[0] idx = 0 for i in range(length): feedback = state[idx] seq[i] = feedback for ti in range(n_taps): feedback ^= state[(taps[ti] + idx) % nbits] state[idx] = feedback idx = (idx + 1) % nbits return np.roll(state, -idx, axis=0) Understands NumPy function calls ⇖ Explicit indexing Explicit looping ⇐ ⇖ ⇙
  • 5. Works in a Jupyter notebook
  • 6. Easy build system integration from distutils.core import setup from pythran.dist import PythranExtension, PythranBuildExt setup(..., ext_modules=[PythranExtension("mymodule", ["mymodule.py"])], cmdclass={"build_ext": PythranBuildExt}) Or precompile to C++ to use with any build system: $ pythran -E mykernel.py -o mykernel.cpp
  • 7. Isn’t Cython enough? Cython is a great tool ● incremental conversion / mixed mode ● great for gluing existing native code/library with Python ● good portability, no runtime requirements However, keeping in mind our “easier to write” goal: ● still has a non-negligible learning curve ● tends to be closer to C than Python when performance matters
  • 8. Then what about Numba? Numba is a great tool ● Just-in-Time compilation ● GPU support ● pure Python syntax However: ● It has more runtime dependencies ● tends to require lower-level programming for best performance
  • 9. Comparing Cython, Numba & Pythran Cython Numba Pythran Portability ++ + ++ Runtime dependency ++ -- ++ Maturity ++ + + Maintenance status 0 + + Features ++ + 0 Ease of use -- ++ + Debugging & optimization 0 + 0 Size of binaries - ++ + For all tools: performance excellent, bus factor is ~ 1-2
  • 10. When do I use which tool? Our advice: ● For higher-level, pure Python packages: use Numba ● If you have any compiled code in your package: ○ Use Pythran for standalone kernels ○ Use Cython for binding C/C++ code, and in case you need to interact with the Python or NumPy C API
  • 12. Current Pythran usage in SciPy One large extension: RBFInterpolator Several smaller extensions: $ git grep -l '#pythran' scipy/optimize/_group_columns.py scipy/signal/_max_len_seq_inner.py scipy/signal/_spectral.py scipy/stats/_hypotests_pythran.py More PRs in progress
  • 13. GSoC student - Xingyu Liu Xingyu is going through SciPy's code base, looking for kernels to benchmark and accelerate: ● stats.binned_statistic_dd: 2-30x speedup ● stats.somersd: 4-20x speedup ● spatial.SphericalVoronoi.sort_vertices_of_regions: 3x speedup With more to come; read the blog of her journey at https://blogs.python-gsoc.org/en/xingyu-lius-blog/
  • 14. Benefits for SciPy Key benefit: easiest way to write fast kernels ● Developer experience about as good as with Numba, accessible to almost every contributor ● It's fast - typically >= Cython, even without SIMD ● Produced binaries are much smaller than those from Cython ● Pythran itself is easy to contribute to, and has a responsive maintainer ● Build system integration is easy(-ish)
  • 15. Pythran limitations There are still gaps in functionality, not all of NumPy is covered: ● numpy.random ● APIs with too much “dynamic behaviour” (e.g., keepdims keyword) ● There is no escape hatch - if it’s not supported, it must be added to Pythran itself first ● No threading in SciPy. Pythran can use OpenMP, but this is forbidden in SciPy (only custom thread pools allowed). ● Extra constraint on Windows: must build with clang-cl
  • 16. Integration status Currently Pythran is: ● enabled by default in the SciPy build ● Still an optional dependency (to disable: export SCIPY_USE_PYTHRAN=0) Lessons from the recent SciPy 1.7.0 release: ● A small portability issue on AIX (already resolved) ● Status with PyPy unclear (PyPy + SciPy has other issues, so can’t test) ● Other than that, mostly smooth sailing Initial integration required two Pythran releases to fix some build issues
  • 17. Conclusions SciPy contributors like Pythran! “This is very elegant”,“Surprised it’s that fast” Initial goal achieved! Pythran is indeed an easier way to write fast kernels The journey continues! Pythran will likely become a hard build-time dependency for or after SciPy 1.8.0 Bonus question: can we combine Pythran with CuPy's Python-to-CUDA JIT? It emits C++ code too, so we could get fast CPU + GPU code like that.