© 2017 Continuum Analytics - Confidential & Proprietary© 2018 Quansight - Confidential & Proprietary
Extending Python: Past, Present, and Future
PyCon EE
September 2019
Python and in particular PyData keeps Growing
Python now most popular
A brief history of [my] time [with Python]
[Timeline: 1998, 2001, 2005, 2009, 2010, 2012, 2016, 2018]
Where I started
Started as my graduate-student “procrastination project” (as Multipack) in 1998 and became SciPy in 2001 with the help of colleagues.
108 releases, 766 contributors
Used by: 128,495
Pearu Peterson
Estonia was critical
To both SciPy and
Where it led for me
Gave up my chance at a tenured academic position in 2005-2006 to bring together the diverging array community in Python and unify Numeric and Numarray.
159 releases, 827 contributors
Used by: 254,856
What amplified data science
Created by Wes McKinney. AQR agreed to release the data-frame library he started there (while dozens of other data-frame implementations at hedge funds and investment banks were never open-sourced).
106 releases, 1601 contributors
Used by: 139,133
Why Python for ML?
Created by David Cournapeau as a Google Summer of Code project, then quickly extended by hundreds of researchers around the world. Supported
100 releases, 1433 contributors
Used by: 70,287
First DL Framework in Python
Built at Université de Montréal by Frédéric
Bastien and his students. Many contributors.
Forms foundation for PyMC3 and other libraries.
33 releases, 332 contributors
Used by: 6,194
Adapted from Jake Vanderplas
PyCon 2017 Keynote
Python’s Scientific Ecosystem
Jake Vanderplas PyCon 2017 Keynote
Keys to Python Success
Modular Extensibility
New Types and Functions
Protocol Overloading (i.e. “dunder” methods)
Modular Extensibility
Modules Packages
>>> import numpy
>>> numpy.__file__
>>> numpy.__path__
>>> numpy.linalg.__file__
>>> import math
>>> math.__file__
>>> import os
>>> os.__file__
a = 3
b = 4
def cross(x, y):
    return a*x + b*y
>>> import my_module
>>> my_module.__file__
>>> ks = my_module.__dict__.keys()
>>> [y for y in ks
...  if not y.startswith('__')]
['a', 'b', 'cross']
subpackages = []
for name in dir(numpy):
obj = getattr(numpy, name)
if hasattr(obj, '__file__') and 
>>> print(subpackages)
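The loop above is cut off in the slide, so its exact filter is unknown. As a minimal sketch of the same idea, assuming the goal is to list a package's direct subpackages, the standard library's `pkgutil` can walk a package's `__path__`. The stdlib `xml` package stands in for `numpy` so the example has no third-party dependency, and `list_subpackages` is a hypothetical helper name.

```python
import pkgutil
import xml  # stand-in package; the slide inspects numpy instead


def list_subpackages(package):
    """Return the sorted names of the direct subpackages of an imported package."""
    return sorted(
        info.name
        for info in pkgutil.iter_modules(package.__path__)
        if info.ispkg  # keep packages, skip plain modules
    )


subpackages = list_subpackages(xml)
print(subpackages)
```

Only packages have a `__path__`, which is why `math` and `os` in the slide expose `__file__` but cannot be walked this way.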
New types New functions
class Node:
    def __init__(self, item, parent=None):
        self.item = item
        self.children = []
        if parent is not None:
from math import sqrt
def kurtosis(data):
    N = len(data)
    mean = sum(data)/N
    std = sqrt(sum((x-mean)**2 for x in data)/N)
    zi = ((x-mean)/std for x in data)
    return sum(z**4 for z in zi)/N - 3
>>> g = Node("Root")
>>> type(g)
>>> type(g).__mro__
(__main__.Node, object)
>>> type(Node).__mro__
(type, object)
>>> type(3)
>>> type(3).__mro__
(int, object)
>>> type(int).__mro__
(type, object)
>>> type(kurtosis)
>>> type(sqrt)
>>> type(sum)
>>> import numpy; type(numpy.add)
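The outputs of the `type(...)` calls on functions did not survive in the slide text. A small sketch of the distinction they point at, assuming CPython: functions written in Python and functions implemented in C are different types, even though both are called the same way. `py_function` is a hypothetical stand-in for `kurtosis`.

```python
import types
from math import sqrt


def py_function(data):
    """Hypothetical pure-Python function standing in for kurtosis."""
    return len(data)


# Functions defined with `def` are instances of types.FunctionType.
print(type(py_function) is types.FunctionType)

# Functions implemented in C (math.sqrt, the builtin sum) are, on CPython,
# instances of types.BuiltinFunctionType.
print(type(sqrt) is types.BuiltinFunctionType)
print(type(sum) is types.BuiltinFunctionType)
```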
New Types and Functions
Protocol Overloading
__str__, __new__, __doc__, __del__,
__init__, __repr__, __setattr__
__getattribute__, __delattr__,
__hash__, __reduce__, __class__,
__dir__, __format__, __reduce_ex__,
__call__, __enter__, __exit__,
__next__, __dict__, __slots__
__getitem__, __setitem__,
__delitem__, __contains__,
__iter__, __reversed__,
__abs__, __add__, __and__, __bool__,
__ceil__, __divmod__, __eq__, __float__,
__floor__, __floordiv__, __ge__, __gt__,
__index__, __int__, __invert__, __le__,
__lshift__, __lt__, __mod__, __mul__,
__ne__, __neg__, __or__, __pos__, __pow__,
__radd__, __rand__, __rdivmod__,
__rfloordiv__, __rlshift__, __rmod__,
__rmul__, __ror__, __round__, __rpow__,
__rrshift__, __rshift__, __rsub__,
__rtruediv__, __rxor__, __truediv__,
__trunc__, __xor__
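As an illustration of protocol overloading, here is a minimal sketch of a user-defined type that opts into a handful of the methods listed above. `Vec2` is a hypothetical class invented for this example, not something from the talk.

```python
class Vec2:
    """Tiny 2-D vector that implements several Python protocols."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):              # repr(v) and the interactive prompt
        return f"Vec2({self.x}, {self.y})"

    def __eq__(self, other):         # v == w
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):        # v + w
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, k):            # v * 3
        return Vec2(self.x * k, self.y * k)

    __rmul__ = __mul__               # 3 * v

    def __iter__(self):              # unpacking, list(v), for-loops
        yield self.x
        yield self.y

    def __getitem__(self, i):        # v[0], v[1]
        return (self.x, self.y)[i]


v = Vec2(1, 2) + 3 * Vec2(1, 1)
print(v, list(v), v[1])
```

The interpreter never special-cases `Vec2`; operators, `repr`, iteration and indexing simply look up the corresponding dunder method on the type.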
An Opinionated List (there are others)
C/C++: Cython, Numba, CFFI, ctypes, boost.python, pybind11
Fortran: f2py
Java: Py4J, JPype, javabridge
C#/.NET: Python for .NET (pythonnet)
Rust: PyO3, Rust-CPython
Extending Python in the Past
First problem: Efficient Data Input
The first step is to get the data right
“It’s Always About the Data”
Reference Counting Essay
May 1998
Guido van Rossum
April 1998
Michael A. Miller
June 1998
A walk through bitarray
Ilan Schnell
Built all first versions
of Anaconda
bitarray: efficient arrays of booleans
Note: docstrings removed!
>>> import bitarray
>>> bitarray.bitarray.__mro__
(bitarray.bitarray, bitarray._bitarray, object)
>>> type(bitarray.bits2bytes)
>>> bitarray._sysinfo()
(8, 8, 8, 8, 9223372036854775807)
>>> 2**63 - 1
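The large number in the tuple can be related to values the standard library exposes. This is a sketch under the assumption that the 8s are C type sizes in bytes and the last entry is the platform's largest `Py_ssize_t`; the slide itself does not label the fields.

```python
import struct
import sys

pointer_bytes = struct.calcsize("P")   # size of a C pointer (void *) in bytes
max_index = sys.maxsize                # largest value a Py_ssize_t can hold

print(pointer_bytes, max_index)

# On typical builds Py_ssize_t is as wide as a pointer, so the maximum is
# 2**(bits - 1) - 1; on a 64-bit build that is 2**63 - 1.
print(max_index == 2 ** (8 * pointer_bytes - 1) - 1)
```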
Function table for module
Function table for module (Python 2)
Add new _bitarray “built-in” type
Python Functions in C
Single Argument Function because METH_O
_bitarray type def
Must ParseTuple to get arguments
| METH_KEYWORDS to accept a 3rd
dictionary argument to function.
Expands to PyVarObject ob_base;
Powerful but requires care!
• Reference counting (you have to do this manually)
• Error handling (can be tedious)
• Initialization (can byte you badly if you aren’t careful)
• Other run-times (PyPy, RustPython) can’t easily use your tool.
• You have access to all the machinery Python itself uses to create all of its own builtins.
• You are literally extending Python with new builtin types and functions.
• Incredible speed: as fast as the machine can work.
Extending Python Today
What should you do today?
• Just write your code in Python and use existing extensions.
• If More Speed is needed:
My opinionated modern view
• Use Numba
• Use Cython
• Use mypy (and eventually mypyc)
• Run with PyPy
• Use Rust and PyO3
• Or, if few existing extensions are being used:
Numba: A JIT Compiler for Python
• An open-source, function-at-a-time compiler library for Python
• Compiler toolbox for different targets and execution models:
• single-threaded CPU, multi-threaded CPU, GPU
• regular functions, “universal functions” (array functions), etc.
• Speedup: 2x (compared to basic NumPy code) to 200x (compared to pure
• Combine ease of writing Python with speeds approaching FORTRAN
• Empowers scientists who make tools for themselves and other scientists
7 things about Numba you may not know
• Numba is 100% Open Source
• Numba + Jupyter = Rapid CUDA Prototyping
• Numba can compile for the CPU and the GPU at the same time
• Numba makes array processing easy with @(gu)vectorize
• Numba comes with a CUDA Simulator
• You can send Numba functions over the network
• Numba has typed Lists and Dictionaries (soon)
Numba (compile Python to CPUs and GPUs)
conda install numba
Code Generation
How does Numba work?
Python Function
Numba IR
Rewrite IR
def do_math(a, b):
>>> do_math(x, y)
Supported Platforms and Hardware
OS: Windows (7 and later), OS X (10.9 and later), Linux (RHEL 6 and later)
Software: Python 2.7, 3.4-3.7; NumPy 1.10 and later
Hardware: 32 and 64-bit CPUs (incl. Xeon Phi); CUDA & HSA GPUs; some support for ARM and
Basic Example
Array Allocation
Looping over ndarray x as an iterator
Using numpy math functions
Returning a slice of the array
2.7x speedup!
Numba decorator

(nopython=True not required)
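The code this slide annotates was an image and is not in the text. The following is a sketch that matches the four annotations (array allocation, iterating over an ndarray, a NumPy math function, returning a slice); the function name `nan_compact` and the fallback decorator are assumptions, and the 2.7x figure is the slide's, not something this sketch measures.

```python
import numpy as np

try:
    from numba import jit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    # It only supports the @jit(...) call form used below.
    def jit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@jit(nopython=True)
def nan_compact(x):
    out = np.empty_like(x)           # array allocation
    out_index = 0
    for element in x:                # looping over ndarray x as an iterator
        if not np.isnan(element):    # using a NumPy math function
            out[out_index] = element
            out_index += 1
    return out[:out_index]           # returning a slice of the array


a = np.array([1.0, np.nan, 2.0, np.nan, 3.0])
print(nan_compact(a))
```

The first call triggers compilation for the argument types it sees; later calls with the same types reuse the compiled machine code.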
• Detects CPU model during compilation and optimizes for that target
• Automatic type inference: No need to give type signatures for functions
• Dispatches to multiple type-specializations for the same function
• Call out to C libraries with CFFI and types
• Special "callback" mode for creating C callbacks to use with external
• Optional caching to disk, and ahead-of-time creation of shared libraries
• Compiler is extensible with new data types and functions
Numba Features
• Three main technologies for parallelism:
Parallel Computing
SIMD Multi-threading Distributed Computing
• Numba's CPU detection will enable LLVM to autovectorize for the appropriate SIMD instruction set:
• SSE, AVX, AVX2, AVX-512
• Will become even more important as AVX-512 is now available on both Xeon Phi and Skylake Xeon
SIMD: Single Instruction Multiple Data
Manual Multithreading: Release the GIL
[Plot: speedup ratio vs. number of threads (1, 2, 4)]
Option to release the GIL
Using Python
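The slide's code is missing from the text. As a sketch of what manual multithreading with a released GIL can look like, the example below combines Numba's `nogil=True` option with a standard-library thread pool. The function `sum_of_squares`, the chunking scheme and the fallback decorator are assumptions made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

try:
    from numba import jit
except ImportError:
    # Fallback so the sketch runs without Numba; threads then gain nothing
    # because plain Python code holds the GIL.
    def jit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@jit(nopython=True, nogil=True)      # compiled code runs without holding the GIL
def sum_of_squares(chunk):
    total = 0.0
    for value in chunk:
        total += value * value
    return total


data = np.arange(100_000, dtype=np.float64)
chunks = np.array_split(data, 4)     # one chunk per worker thread

with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(sum_of_squares, chunks))

result = sum(partial_sums)
print(result)
```

Releasing the GIL is only safe because nopython-mode code does not touch Python objects while it runs; coordinating access to shared arrays remains the caller's job.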
Universal Functions (Ufuncs)
Ufuncs are a core concept in NumPy for array-oriented
◦ A function with scalar inputs is broadcast across the elements of
the input arrays:
• np.add([1,2,3], 3) == [4, 5, 6]
• np.add([1,2,3], [10, 20, 30]) == [11, 22, 33]
◦ Parallelism is present, by construction. Numba will generate
loops and can automatically multi-thread if requested.
◦ Before Numba, creating fast ufuncs required writing C. No
Universal Functions (Ufuncs)
Different decorator!
1.8x speedup!
Multi-threaded Ufuncs
Specify type signature
Select parallel target
Automatically uses all CPU cores!
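The annotated code is not in the slide text, so here is a sketch of a multi-threaded ufunc with the two ingredients the annotations name: an explicit type signature and the parallel target. The function `rel_diff` is hypothetical, and the `np.vectorize` fallback (which is neither compiled nor parallel) only keeps the example runnable without Numba.

```python
import numpy as np

try:
    from numba import vectorize
except ImportError:
    # Fallback with the same call shape, backed by np.vectorize.
    def vectorize(signatures, target="cpu"):
        def wrap(func):
            return np.vectorize(func)
        return wrap


@vectorize(["float64(float64, float64)"], target="parallel")
def rel_diff(a, b):
    # Written for scalars; broadcasting over arrays comes from the ufunc machinery.
    return 2.0 * (a - b) / (a + b)


x = np.array([1.0, 2.0, 3.0])
print(rel_diff(x, 1.0))              # the scalar 1.0 is broadcast against x
```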
• ParallelAccelerator is a special compiler pass contributed by Intel Labs
• Todd A. Anderson, Ehsan Totoni, Paul Liu
• Based on similar contribution to Julia
• Automatically generates multithreaded code in a Numba compiled-
• Array expressions and reductions
• Random functions
• Dot products
• Explicit loops indicated with prange() call
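A sketch of the last bullet, an explicit loop marked with `prange()`. The function `row_norms` is a hypothetical example, not the benchmark from the talk, and the fallback replaces `prange` with `range` so the code still runs serially without Numba.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    prange = range                   # serial fallback without Numba

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit(parallel=True)
def row_norms(a):
    out = np.empty(a.shape[0])
    for i in prange(a.shape[0]):     # iterations may be split across threads
        total = 0.0
        for j in range(a.shape[1]):  # inner loop stays serial
            total += a[i, j] * a[i, j]
        out[i] = np.sqrt(total)
    return out


a = np.arange(12, dtype=np.float64).reshape(4, 3)
print(row_norms(a))
```

Each iteration writes to its own `out[i]`, so the iterations are independent, which is the property `prange` relies on.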
ParallelAccelerator: Example #1
[Bar chart: NumPy vs. Numba vs. Numba+PA; 1000000x10 input, Core i7 quad-core CPU]
ParallelAccelerator: prange()
[Bar chart: NumPy vs. Numba vs. Numba+PA; 1000000x10 input, Core i7 quad-core CPU]
Cython is Python with C data types
Basic use
Create a text file with a .pyx extension along with a
Hint: can use %%cython magic in notebooks
After %load_ext Cython
Borrowed from Cython documentation
Type definitions
Convert to Python list
Some of the C++ stdlib is available
Auto-conversion on return
Creates an extension type (like _bitarray)
Write fast functions that work on anything supporting the PEP 3118 buffer protocol.
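The Cython code for this slide is not in the text. To show what "anything supporting the buffer protocol" means, here is a pure-Python analog using the built-in `memoryview`; it is not Cython and gains no speed, and `scale_in_place` is a hypothetical function. The point is only that one function body serves different buffer-exporting types.

```python
from array import array


def scale_in_place(buf, factor):
    """Multiply every element of a writable 1-D buffer by `factor`, in place."""
    view = memoryview(buf)           # accepts any PEP 3118 buffer exporter
    for i in range(len(view)):
        view[i] = view[i] * factor   # writes go straight to the owner's memory


ints = array("i", [1, 2, 3])         # C ints
raw = bytearray([1, 2, 3])           # unsigned bytes

scale_in_place(ints, 10)
scale_in_place(raw, 2)
print(ints, raw)
```

No copy is made: the view shares memory with the original object, which is what lets compiled code operate on array data from different libraries.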
mypyc is a compiler that compiles mypy-annotated, statically typed
Python modules into CPython C extensions.
• Most type annotations are enforced at runtime (raising TypeError on mismatch)

• Classes are compiled into extension classes without __dict__ (much, but not quite, like if they used __slots__)

• Monkey patching doesn't work

• Instance attributes won't fall back to class attributes if undefined

• Metaclasses not supported

• Also there are still a bunch of bad bugs and unsupported features :)
Still Experimental!
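A sketch of the kind of module the description above refers to: ordinary Python with complete type annotations. It runs unchanged under CPython; compiling it with mypyc is a separate step not shown here. The `Accumulator` class is a hypothetical example.

```python
from typing import List


class Accumulator:
    """Running total with fully annotated attributes and methods."""

    def __init__(self, start: float = 0.0) -> None:
        # Instance attributes are declared once, with a type, in __init__.
        self.total: float = start

    def add(self, values: List[float]) -> float:
        for v in values:
            self.total += v
        return self.total


acc = Accumulator()
print(acc.add([1.0, 2.5]))
```

Code in this style avoids the features the bullets above list as unsupported: no monkey patching, no metaclasses, and no reliance on class attributes as a fallback for instance attributes.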
Extending Python in the Future
iOS?
Most of these are CPython only!
These extensions are an anchor to Python runtime progress!
CPython C-API
What do we need?
• A way to extend Python that targets multiple run-times by default (at least PyPy, CPython, RustPython) with the ability to add new run-times
• Use a subset of typed Python to do it, i.e. a domain-specific extension language in Python
• Need NumPy, Pandas, SciPy, Scikit-Learn, and more to use this approach (this will take time)
Early Hope
A Bold Proposal
• Create a Cython-like tool that uses mypy typing
• Borrow heavily from Cython ideas but start a new
project that could be pulled into Python itself.
• At the same time work from below to continue the
clean-up of CPython C-API that has already started.
Need ~$5million commitment for a 3-year project to start this
• Core team of 5+ devs with 1 lead
• 1/2 time project manager and PSF representative
• 3+ community liaisons and developer evangelists
• Start with $500k Phase 0 to prove the idea
• Get total funding from at least 20 companies: $25k initial buy-in, at least $250k commitment over 3 years to start the effort.
• Allow up to $100k initial and $1million
• Paying participants get project-management attention and early easy-to-use runtimes and binary extensions delivered with ability to set priorities (plus marketing and the knowledge they are leading Python forward).
Work Order
• We have the people in our network of
• We have a sales and marketing team
that will pitch this.
• We are just rolling out the proposal.
We can do this!
A new platform to help open-source projects and
developers thrive professionally and financially.
Sign up to:
• build your open-source portfolio
• show which projects you use
• thank contributors for projects you
• (soon) get connected to initiatives like the one to make Python universally
Sustaining the Future
Open-source innovation and maintenance around the entire data-science and AI workflow.
• NumPy ecosystem maintenance
• Maintenance and support with PyData core team
• Improve connection of NumPy to ML Frameworks
• GPU Support for NumPy Ecosystem
• Improve foundations of Array computing
• JupyterLab
• Data Catalog standards
• Packaging (conda-forge, PyPA, etc.)
uarray — unified array interface and symbolic NumPy
xnd — re-factored NumPy (low-level cross-language
libraries for N-D (tensor) computing)
Partnered with NumFOCUS
and Ursa Labs (supporting
Apache Arrow)

PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...

PyCon Estonia 2019

  • 1. Extending Python: Past, Present, and Future. Quansight (@quansightai, @teoliphant). PyCon EE, September 2019. © 2017 Continuum Analytics - Confidential & Proprietary; © 2018 Quansight - Confidential & Proprietary
  • 2. Python, and in particular PyData, keeps growing
  • 4. A brief history of [my] time [with Python] (timeline figure spanning 1991 to 2018)
  • 5. Where I started: Started as my graduate-student “procrastination project” (as Multipack) in 1998 and became SciPy in 2001 with the help of colleagues. 108 releases, 766 contributors. Used by: 128,495. Pearu Peterson: Estonia was critical to both SciPy and NumPy.
  • 6. Where it led for me [NumPy]: Gave up my chance at a tenured academic position in 2005-2006 to bring together the diverging array community in Python and unify Numeric and Numarray. 159 releases, 827 contributors. Used by: 254,856
  • 7. What amplified data science [pandas]: Created by Wes McKinney. Also, AQR agreed to release this data-frame he started at AQR (while dozens of other data-frames in hedge funds and investment banks did not get open-sourced). 106 releases, 1601 contributors. Used by: 139,133
  • 8. Why Python for ML? [scikit-learn]: Created by David Cournapeau as a Google Summer of Code project and then quickly added to by hundreds of researchers around the world. Supported by INRIA. 100 releases, 1433 contributors. Used by: 70,287
  • 9. First DL Framework in Python [Theano]: Built at Université de Montréal by Frédéric Bastien and his students. Many contributors. Forms the foundation for PyMC3 and other libraries. 33 releases, 332 contributors. Used by: 6,194
  • 10. [Figure, including Bokeh] Adapted from Jake Vanderplas, PyCon 2017 Keynote
  • 11. Python’s Scientific Ecosystem [figure, including Bokeh]. Jake Vanderplas, PyCon 2017 Keynote
  • 12. Keys to Python Success
  • 13. Keys to Python Success: Modular Extensibility; New Types and Functions; Protocol Overloading (i.e. “dunder” methods); Interoperability
  • 14. Modular Extensibility: Modules and Packages
      >>> import numpy
      >>> numpy.__file__
      {path-prefix}numpy/
      >>> numpy.__path__
      {path-prefix}numpy
      >>> numpy.linalg.__file__
      {path-prefix}numpy/linalg/
      >>> import math
      >>> math.__file__
      {path}math{platform}.so
      >>> import os
      >>> os.__file__
      {path} .pydor #

      a = 3
      b = 4
      def cross(x, y):
          return a*x + b*y

      >>> import my_module
      >>> my_module.__file__
      {path}
      >>> ks = my_module.__dict__.keys()
      >>> [y for y in ks if not y.startswith('__')]
      ['a', 'b', 'cross']

      subpackages = []
      for name in dir(numpy):
          obj = getattr(numpy, name)
          if hasattr(obj, '__file__') and obj.__file__.endswith(''):
              subpackages.append(obj.__name__)

      >>> print(subpackages)
      ['numpy.matrixlib','numpy.compat','numpy.core', 'numpy.fft','numpy.lib','numpy.linalg','', 'numpy.matrixlib','numpy.polynomial','numpy.random', 'numpy.testing']
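The module-versus-package distinction on this slide can also be explored with the standard library alone. The following is a minimal sketch, not taken from the talk: the helper names `public_names` and `list_submodules` are hypothetical, and `json` and `math` are used only as convenient stand-ins for a package and a plain module.

```python
import importlib
import pkgutil


def public_names(module):
    """Return the module's attribute names that are not dunder names."""
    return sorted(name for name in vars(module) if not name.startswith("__"))


def list_submodules(package):
    """List the dotted names of modules found directly inside a package."""
    # Only packages carry __path__; plain modules such as math do not.
    if not hasattr(package, "__path__"):
        return []
    prefix = package.__name__ + "."
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__, prefix))


json_pkg = importlib.import_module("json")  # a package (directory of modules)
math_mod = importlib.import_module("math")  # a single module

print(list_submodules(json_pkg))        # dotted names such as 'json.decoder'
print(list_submodules(math_mod))        # empty: math is not a package
print("sqrt" in public_names(math_mod))
```

Using `pkgutil.iter_modules` avoids relying on file-name suffixes, which differ between platforms and between pure-Python and compiled modules.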
  • 15. New Types and Functions
    New types:
      class Node:
          def __init__(self, item, parent=None):
              self.item = item
              self.children = []
              if parent is not None:
                  parent.children.append(self)

      >>> g = Node("Root")
      >>> type(g)
      __main__.Node
      >>> type(g).__mro__
      (__main__.Node, object)
      >>> type(Node).__mro__
      (type, object)
      >>> type(3)
      int
      >>> type(3).__mro__
      (int, object)
      >>> type(int).__mro__
      (type, object)
    New functions:
      from math import sqrt

      def kurtosis(data):
          N = len(data)
          mean = sum(data)/N
          std = sqrt(sum((x-mean)**2 for x in data)/N)
          zi = ((x-mean)/std for x in data)
          return sum(z**4 for z in zi)/N - 3

      >>> type(kurtosis)
      function
      >>> type(sqrt)
      builtin_function_or_method
      >>> type(sum)
      builtin_function_or_method
      >>> import numpy; type(numpy.add)
      numpy.ufunc
  • 16. Protocol Overloading
    Object: __str__, __new__, __doc__, __del__, __init__, __repr__, __setattr__, __getattribute__, __delattr__, __hash__, __reduce__, __class__, __dir__, __format__, __reduce_ex__, __call__, __enter__, __exit__, __next__, __dict__, __slots__
    Sequence/Mapping: __getitem__, __setitem__, __delitem__, __contains__, __iter__, __reversed__, __len__
    Number: __abs__, __add__, __and__, __bool__, __ceil__, __divmod__, __eq__, __float__, __floor__, __floordiv__, __ge__, __gt__, __index__, __int__, __invert__, __le__, __lshift__, __lt__, __mod__, __mul__, __ne__, __neg__, __or__, __pos__, __pow__, __radd__, __rand__, __rdivmod__, __rfloordiv__, __rlshift__, __rmod__, __rmul__, __ror__, __round__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__, __rxor__, __truediv__, __trunc__, __xor__
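To make the three protocol groups concrete, here is a small illustrative class that implements a few methods from each. `Vec2` is a hypothetical example written for this transcript, not code from the talk.

```python
class Vec2:
    """A tiny 2-D vector that opts in to several protocols."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    # Object protocol
    def __repr__(self):
        return f"Vec2({self.x!r}, {self.y!r})"

    def __eq__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))

    # Sequence protocol: __getitem__ alone is enough to make iteration work
    def __len__(self):
        return 2

    def __getitem__(self, index):
        return (self.x, self.y)[index]

    # Number protocol
    def __add__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar):
        if not isinstance(scalar, (int, float)):
            return NotImplemented
        return Vec2(self.x * scalar, self.y * scalar)

    __rmul__ = __mul__  # lets `2 * v` work as well as `v * 2`

    def __abs__(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


v = Vec2(3, 4)
print(v + Vec2(1, 1))
print(2 * v)
print(abs(v), len(v), list(v))
```

Returning `NotImplemented` rather than raising lets Python try the other operand's reflected method, which is how `2 * v` ends up in `__rmul__`.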
  • 17. Interoperability: An Opinionated List (there are others)
    C/C++: Cython, Numba, CFFI, ctypes, Boost.Python, pybind11
    Fortran: f2py
    Java: Py4J, JPype, javabridge
    C#/.NET: Python for .NET (pythonnet)
    Rust: PyO3, rust-cpython
  • 19. First problem: Efficient Data Input. The first step is to get the data right: “It’s Always About the Data”. Reference Counting Essay, May 1998, Guido van Rossum; TableIO, April 1998, Michael A. Miller; NumPyIO, June 1998
  • 20. A walk through bitarray. bitarray: efficient arrays of booleans. Ilan Schnell: built all first versions of Anaconda
  • 21. Note: docstrings removed!
      >>> import bitarray
      >>> bitarray.bitarray.__mro__
      (bitarray.bitarray, bitarray._bitarray, object)
      >>> type(bitarray.bits2bytes)
      builtin_function_or_method
      >>> bitarray._sysinfo()
      (8, 8, 8, 8, 9223372036854775807)
      >>> 2**63 - 1
      9223372036854775807
  • 22. Function table for module; function table for module (Python 2); add new _bitarray “built-in” type; METH_KEYWORDS
  • 23. Python Functions in C: a single-argument function because of METH_O
  • 25. Must parse the argument tuple (PyArg_ParseTuple) to get the arguments. Add | METH_KEYWORDS to accept a third, dictionary argument to the function.
  • 27. Powerful but requires care!
 • Reference counting (you have to do this manually)
 • Error handling (can be tedious)
 • Initialization (can byte you badly if you aren’t careful)
 • Other run-times (PyPy, RustPython) can’t easily use your tool.
 • You have access to all the machinery Python itself uses to create all of its own builtins.
 • You are literally extending Python with new builtin types and functions.
 • Incredible speed: as fast as the machine can work.
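The reference counts that a C extension must manage by hand can be observed from Python. The sketch below is an illustration added to this transcript, assuming CPython; the variable names are arbitrary, and only differences between counts are compared because the absolute values vary between interpreter versions.

```python
import sys

payload = ["some", "data"]          # a fresh, ordinary object
before = sys.getrefcount(payload)

holder = [payload]                  # the list now stores one more reference
during = sys.getrefcount(payload)

holder.clear()                      # dropping that reference lowers the count again
after = sys.getrefcount(payload)

# Differences relative to the starting count
print(during - before, after - before)
```

In C the same bookkeeping is explicit: every stored reference needs a matching `Py_INCREF`, and every released one a `Py_DECREF`, which is why the slide lists it first among the things that require care.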
  • 29. What should you do today? My opinionated modern view
 • Just write your code in Python and use existing extensions.
 • If more speed is needed:
 • Use Numba
 • Use Cython
 • Use mypy (and eventually mypyc)
 • Run with PyPy
 • Use Rust and PyO3
 • Or, if few existing extensions are being used:
  • 30. Numba: A JIT Compiler for Python
 • An open-source, function-at-a-time compiler library for Python
 • Compiler toolbox for different targets and execution models:
 • single-threaded CPU, multi-threaded CPU, GPU
 • regular functions, “universal functions” (array functions), etc.
 • Speedup: 2x (compared to basic NumPy code) to 200x (compared to pure Python)
 • Combine ease of writing Python with speeds approaching FORTRAN
 • Empowers scientists who make tools for themselves and other scientists
  • 31. 7 things about Numba you may not know
 1. Numba is 100% Open Source
 2. Numba + Jupyter = Rapid CUDA Prototyping
 3. Numba can compile for the CPU and the GPU at the same time
 4. Numba makes array processing easy with @(gu)vectorize
 5. Numba comes with a CUDA Simulator
 6. You can send Numba functions over the network
 7. Numba has typed Lists and Dictionaries (soon)
  • 32. Numba (compile Python to CPUs and GPUs). conda install numba. Diagram: Python -> Parsing Frontend (Numba) -> Intermediate Representation (IR) -> Code Generation Backend (LLVM) -> x86, ARM, PTX
  • 33. How does Numba work? Diagram: Python Function (bytecode) and Function Arguments -> Bytecode Analysis -> Numba IR -> Rewrite IR -> Type Inference -> Lowering -> LLVM IR -> LLVM/NVVM JIT -> Machine Code -> Cache -> Execute! Example: @jit def do_math(a, b): … >>> do_math(x, y)
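A runnable sketch of the decorate-then-call pattern on this slide follows. The body of `do_math` is hypothetical (the slide elides it), and the `ImportError` fallback is an assumption added here so the snippet still runs, uncompiled, where Numba is not installed.

```python
try:
    from numba import jit
except ImportError:
    # Assumption for this sketch only: a do-nothing stand-in for numba.jit
    def jit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]          # used as a bare @jit
        return lambda func: func    # used as @jit(...)


@jit(nopython=True)
def do_math(a, b):
    # Hypothetical body: an explicit loop, the kind of code Numba compiles well
    total = 0.0
    for i in range(a):
        total += i * b
    return total


# With Numba installed, the first call with a given argument-type signature
# triggers type inference and compilation; later calls reuse the machine code.
print(do_math(5, 2.0))
print(do_math(5, 3))   # a different signature gets its own specialization
```

No type signatures are written anywhere: the specializations are inferred from the arguments at call time, which is the "Type Inference" stage in the pipeline.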
  • 34. Supported Platforms and Hardware
    OS: Windows (7 and later); OS X (10.9 and later); Linux (RHEL 6 and later)
    HW: 32 and 64-bit CPUs (incl. Xeon Phi); CUDA & HSA GPUs; some support for ARM and ROCm
    SW: Python 2.7, 3.4-3.7; NumPy 1.10 and later
  • 36. Basic Example (code annotations): Numba decorator (nopython=True not required); array allocation; looping over ndarray x as an iterator; using numpy math functions; returning a slice of the array. 2.7x speedup!
  • 37. Numba Features
 • Detects CPU model during compilation and optimizes for that target
 • Automatic type inference: no need to give type signatures for functions
 • Dispatches to multiple type-specializations for the same function
 • Call out to C libraries with CFFI and ctypes
 • Special "callback" mode for creating C callbacks to use with external libraries
 • Optional caching to disk, and ahead-of-time creation of shared libraries
 • Compiler is extensible with new data types and functions
  • 38. Parallel Computing
 • Three main technologies for parallelism: SIMD, multi-threading, distributed computing
  • 39. SIMD: Single Instruction Multiple Data
 • Numba's CPU detection will enable LLVM to autovectorize for the appropriate SIMD instruction set:
 • SSE, AVX, AVX2, AVX-512
 • Will become even more important as AVX-512 is now available on both Xeon Phi and Skylake Xeon processors
  • 40. Manual Multithreading: Release the GIL. Option to release the GIL; using Python concurrent.futures. (Chart: speedup ratio against number of threads: 1, 2, 4.)
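The combination named on this slide, a compiled function that releases the GIL plus a thread pool from `concurrent.futures`, might look like the sketch below. The function names and the chunking scheme are hypothetical, and the `ImportError` fallback is an assumption so the snippet runs (without any real speedup) when Numba is absent.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from numba import jit
except ImportError:
    # Assumption for this sketch only: a do-nothing stand-in for numba.jit(...)
    def jit(*args, **kwargs):
        return lambda func: func


@jit(nopython=True, nogil=True)     # nogil=True releases the GIL inside the call
def sum_of_squares(start, stop):
    total = 0
    for i in range(start, stop):
        total += i * i
    return total


def parallel_sum_of_squares(n, n_threads=4):
    """Split range(n) into chunks and sum each chunk in its own thread."""
    step = max(1, (n + n_threads - 1) // n_threads)
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = pool.map(lambda b: sum_of_squares(b[0], b[1]), bounds)
        return sum(parts)


print(parallel_sum_of_squares(1000))
```

Threads only run the chunks concurrently when the compiled function really drops the GIL; with the pure-Python fallback the result is the same but the work is serialized.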
  • 41. Universal Functions (Ufuncs). Ufuncs are a core concept in NumPy for array-oriented computing.
 ◦ A function with scalar inputs is broadcast across the elements of the input arrays:
 • np.add([1,2,3], 3) == [4, 5, 6]
 • np.add([1,2,3], [10, 20, 30]) == [11, 22, 33]
 ◦ Parallelism is present, by construction. Numba will generate loops and can automatically multi-thread if requested.
 ◦ Before Numba, creating fast ufuncs required writing C. No longer!
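The broadcasting behaviour described here can be checked directly, assuming NumPy is installed. The scalar kernel `clipped_diff` is a hypothetical example; `np.frompyfunc` is used only to show the ufunc calling convention and is not fast, since it still calls Python once per element (a compiled ufunc, for example via Numba's `@vectorize`, is the fast route the slide refers to).

```python
import numpy as np

# A scalar is broadcast against every element of the array
print(np.add([1, 2, 3], 3))
# Two equal-length inputs are combined element by element
print(np.add([1, 2, 3], [10, 20, 30]))


def clipped_diff(a, b):
    """Scalar kernel: difference of two numbers, floored at zero."""
    return a - b if a > b else 0


# Wrap the scalar function as a ufunc with 2 inputs and 1 output
clipped_diff_ufunc = np.frompyfunc(clipped_diff, 2, 1)
result = clipped_diff_ufunc([5, 1, 7], 3)   # the scalar 3 is broadcast
print(type(clipped_diff_ufunc).__name__, result.astype(int))
```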
  • 42. Universal Functions (Ufuncs): Different decorator! 1.8x speedup!
  • 43. Multi-threaded Ufuncs: specify the type signature; select the parallel target; automatically uses all CPU cores!
  • 44. ParallelAccelerator
 • ParallelAccelerator is a special compiler pass contributed by Intel Labs
 • Todd A. Anderson, Ehsan Totoni, Paul Liu
 • Based on a similar contribution to Julia
 • Automatically generates multithreaded code in a Numba-compiled function:
 • Array expressions and reductions
 • Random functions
 • Dot products
 • Explicit loops indicated with a prange() call
  • 45. ParallelAccelerator: Example #1. Chart: time (ms) for NumPy, Numba, and Numba+PA, annotated 1.8x and 3.6x. 1000000x10 input, Core i7 quad-core CPU
  • 46. ParallelAccelerator: prange(). Chart: time (ms) for NumPy, Numba, and Numba+PA, annotated 4.3x and 50x. 1000000x10 input, Core i7 quad-core CPU
  • 47. Cython: Cython is Python with C data types
  • 49. Basic use: Create a text file with a .pyx extension along with a … (helloworld.pyx). Hint: can use the %%cython magic in notebooks after %load_ext Cython. Borrowed from the Cython documentation
  • 51. Some of the C++ stdlib is available. Auto-conversion on return
  • 52. entropy.pyx: Creates an extension type (like _bitarray). Write fast functions that work on anything supporting the PEP 3118 buffer protocol.
  • 53. Mypyc
  • 54. mypyc: mypyc is a compiler that compiles mypy-annotated, statically typed Python modules into CPython C extensions. Still Experimental!
 • Most type annotations are enforced at runtime (raising TypeError on mismatch)
 • Classes are compiled into extension classes without __dict__ (much like, but not quite the same as, using __slots__)
 • Monkey patching doesn't work
 • Instance attributes won't fall back to class attributes if undefined
 • Metaclasses are not supported
 • Also there are still a bunch of bad bugs and unsupported features :)
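For a sense of what mypyc takes as input, here is a small fully annotated module. It is an illustrative sketch: `Accumulator` and `mean_of` are hypothetical names, and the code runs unchanged as ordinary Python. Assuming it were saved under a hypothetical file name such as accumulate.py, running `mypyc accumulate.py` would be the usual way to try compiling it; the restrictions listed on the slide would then apply to the compiled class.

```python
from typing import List


class Accumulator:
    """A small class with fully annotated attributes and methods."""

    def __init__(self, start: float = 0.0) -> None:
        # Attributes are declared once, here; nothing is added dynamically later
        self.total: float = start
        self.count: int = 0

    def add(self, value: float) -> None:
        self.total += value
        self.count += 1

    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def mean_of(values: List[float]) -> float:
    """Average a list of floats using the accumulator."""
    acc = Accumulator()
    for v in values:
        acc.add(v)
    return acc.mean()


print(mean_of([1.0, 2.0, 3.0, 6.0]))
```

The class avoids monkey patching, metaclasses, and class-attribute fallbacks, so it stays within the subset the slide describes.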
  • 57. Extending Python in the Future
  • 59. These extensions are an anchor to Python runtime progress! CPython C-API
  • 60. What do we need?
 • A way to extend Python that targets multiple run-times by default (at least PyPy, CPython, RustPython), with the ability to add new run-times
 • Use a subset of typed Python to do it, i.e. a domain-specific extension language in Python itself
 • Need NumPy, Pandas, SciPy, Scikit-Learn, and more to use this approach (this will take time)
  • 62. A Bold Proposal
 • Create a Cython-like tool that uses mypy typing
 • Borrow heavily from Cython ideas but start a new project that could be pulled into Python itself.
 • At the same time, work from below to continue the clean-up of the CPython C-API that has already started.
  • 63. Need ~$5 million commitment for a 3-year project to start this
 • Core team of 5+ devs with 1 lead
 • 1/2 time project manager and PSF representative
 • 3+ community liaisons and developer evangelists
 • Start with $500k Phase 0 to prove the idea
 • Get total funding from at least 20 companies: $25k initial buy-in, at least $250k commitment over 3 years to start the effort.
 • Allow up to $100k initial and $1 million commitment.
 • Paying participants get project-management attention and early, easy-to-use runtimes and binary extensions delivered, with the ability to set priorities (plus marketing and the knowledge that they are leading Python forward).
 How? LABS Cooperative Community Work Order
 • We have the people in our network of collaborators.
 • We have a sales and marketing team that will pitch this.
 • We are just rolling out the proposal. Interested?
  • 64. We can do this!
  • 65. A new platform to help open-source projects and developers thrive professionally and financially. Sign up to:
 • build your open-source portfolio
 • show which projects you use
 • thank contributors for projects you love
 • (soon) get connected to initiatives like the one to make Python universally extensible.
  • 66. LABS: Sustaining the Future. Open-source innovation and maintenance around the entire data-science and AI workflow.
 • NumPy ecosystem maintenance
 • Maintenance and support with PyData core team
 • Improve connection of NumPy to ML Frameworks
 • GPU Support for NumPy Ecosystem
 • Improve foundations of Array computing
 • JupyterLab
 • Data Catalog standards
 • Packaging (conda-forge, PyPA, etc.)
 uarray: unified array interface and symbolic NumPy
 xnd: re-factored NumPy (low-level cross-language libraries for N-D (tensor) computing)
 Partnered with NumFOCUS and Ursa Labs (supporting Apache Arrow)