SlideShare a Scribd company logo
Standardizing on a single
N-dimensional array API for Python
Ralf Gommers
MXNet workshop, 14 Dec 2020
Array-based computing in Python
Today’s Python data ecosystem
Can we make it easy to build on top of multiple array data structures?
State of compatibility today
All libraries have common concepts and functionality.
But, there are many small (and some large) incompatibilities. It’s very painful to
translate code from one array library to another.
Let’s look at some examples!
Consortium for Python Data API Standards
A new organization, with participation from maintainers of many array (or
tensor) and dataframe libraries.
Concrete goals for first year:
1. Define a standardization methodology and necessary tooling for it
2. Publish an RFC for an array API standard
3. Publish an RFC for a dataframe API standard
4. Finalize 2021.0x API standards after community review
See data-apis.org and github.com/data-apis for more on the Consortium
Goals for and scope of the array API
Syntax and semantics of functions
and objects in the API
Casting rules, broadcasting, indexing,
Python operator support
Data interchange & device support
Execution semantics (e.g. task
scheduling, parallelism, lazy eval)
Non-standard dtypes, masked arrays,
I/O, subclassing array object, C API
Error handling & behaviour for invalid
inputs to functions and methods
Goal 1: enable writing code & packages that support multiple array libraries
Goal 2: make it easy for end users to switch between array libraries
In Scope Out of Scope
Array- and array-consuming libraries
Using DLPack, will work for any two
libraries if they support device the
data resides on
x = xp.from_dlpack(x_other)
Data interchange between array libs
Portable code in array-consuming libs
def softmax(x):
# grab standard namespace from
# the passed-in array
xp = get_array_api(x)
x_exp = xp.exp(x)
partition = xp.sum(x_exp, axis=1,
keepdims=True)
return x_exp / partition
What does the full API surface look like?
● 1 array object with
○ 6 attributes: ndim, shape, size, dtype, device, T
○ dunder methods to support all Python operators
○ __array_api_version__, __array_namespace__, __dlpack__
● 11 dtype literals: bool, (u)int8/16/32/64, float32/64
● 1 device object
● 4 constants: inf, nan, pi, e
● ~115 functions:
○ Array creation & manipulation (18)
○ Element-wise math & logic (55)
○ Statistics (7)
○ Linear algebra (22)
○ Search, sort & set (7)
○ Utilities (4)
Mutability & copies/views
x = ones(4)
# y may be a view on data of x
y = x[:2]
# modifies x if y is a view
y += 1
Mutable operations and the concept of views are
important for strided in-memory array implementations
(NumPy, CuPy, PyTorch, MXNet)
They are problematic for libraries based on immutable data
structures or delayed evaluation (TensorFlow, JAX, Dask)
Decisions in API standard:
1. Support inplace operators
2. Support item and slice assignment
3. Do not support out= keyword
4. Warn users that mixing mutating operations and views
may result in implementation-specific behavior
Dtype casting rules
x = xp.arange(5) # will be integer
y = xp.ones(5, dtype=xp.float32)
# This may give float32, float64, or raise
dtype = (x * y).dtype
Casting rules are straightforward to align between
libraries when the dtypes are of the same kind
Mixed integer and floating-point casting is very
inconsistent between libraries, and hard to change:
Hence this will remain unspecified.
Data-dependent output shape/dtype
# Boolean indexing, and even slicing
# in some cases, results in shapes
# that depend on values in `x`
x2 = x[:, x > 3]
val = somefunc(x)
x3 = x[:val]
# Functions for which output shape
# depends on value
unique(x)
nonzero(x)
# NumPy does value-based casting
x = np.ones(3, dtype=np.float32)
x + 1 # float32 output
x + 100000 # float64 output
Data-dependent output shapes or dtypes are
problematic, because of:
● static memory allocation (TensorFlow, JAX)
● graph-based scheduling (Dask)
● JIT compilation (Numba, PyTorch, JAX,
Gluon)
Value-based dtype results can be avoided.
Value-based shapes can be important - the API
standard will include but clearly mark such
functionality.
Where are we now, and what’s next?
The array API standard is >90% complete and published for community review.
Still work-in-progress are:
● Data interchange with DLPack
● Device support
● Data-dependent shape handling
● A handful of regular functions (linalg, result_type, meshgrid)
Important next steps will be:
1. Complete the library-independent test suite
2. First (prototype) implementations in libraries
3. Get sign-off from maintainers of each array library
4. Define process to handle future & optional extensions
Thank you
Consortium:
● Website & introductory blog posts: data-apis.org
● Array API main repo: github.com/data-apis/array-api
● Latest version of the standard: data-apis.github.io/array-api/latest
● Members: github.com/data-apis/governance
Find me at: ralf.gommers@gmail.com, rgommers, ralfgommers
Try this at home - installing the latest version of all seven array libraries in one
env to experiment:
conda create -n many-libs python=3.7
conda activate many-libs
conda install cudatoolkit=10.2
pip install numpy torch jax jaxlib tensorflow mxnet cupy-cuda102 dask toolz sparse

More Related Content

What's hot

Plotting data with python and pylab
Plotting data with python and pylabPlotting data with python and pylab
Plotting data with python and pylab
Giovanni Marco Dall'Olio
 
Data visualization in Python
Data visualization in PythonData visualization in Python
Data visualization in Python
Marc Garcia
 
Scipy, numpy and friends
Scipy, numpy and friendsScipy, numpy and friends
Scipy, numpy and friends
Michele Mattioni
 
Get Your Hands Dirty with Intel® Distribution for Python*
Get Your Hands Dirty with Intel® Distribution for Python*Get Your Hands Dirty with Intel® Distribution for Python*
Get Your Hands Dirty with Intel® Distribution for Python*
Intel® Software
 
MPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterMPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterArafat Hussain
 
Numba
NumbaNumba
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Intel® Software
 
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
MLconf
 
Buzzwords Numba Presentation
Buzzwords Numba PresentationBuzzwords Numba Presentation
Buzzwords Numba Presentationkammeyer
 
An adaptive algorithm for detection of duplicate records
An adaptive algorithm for detection of duplicate recordsAn adaptive algorithm for detection of duplicate records
An adaptive algorithm for detection of duplicate recordsLikan Patra
 
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
Big Data Spain
 
Sci computing using python
Sci computing using pythonSci computing using python
Sci computing using python
Ashok Govindarajan
 
Essential NumPy
Essential NumPyEssential NumPy
Essential NumPy
zekeLabs Technologies
 
Matplotlib Review 2021
Matplotlib Review 2021Matplotlib Review 2021
Matplotlib Review 2021
Bhaskar J.Roy
 
Matlab gnu octave
Matlab gnu octaveMatlab gnu octave
Matlab gnu octave
pramodkumar1804
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
Travis Oliphant
 
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Intel® Software
 
Kaggle tokyo 2018
Kaggle tokyo 2018Kaggle tokyo 2018
Kaggle tokyo 2018
Cournapeau David
 
AI & Topology concluding remarks - "The open-source landscape for topology in...
AI & Topology concluding remarks - "The open-source landscape for topology in...AI & Topology concluding remarks - "The open-source landscape for topology in...
AI & Topology concluding remarks - "The open-source landscape for topology in...
Umberto Lupo
 

What's hot (20)

Plotting data with python and pylab
Plotting data with python and pylabPlotting data with python and pylab
Plotting data with python and pylab
 
Data visualization in Python
Data visualization in PythonData visualization in Python
Data visualization in Python
 
Scipy, numpy and friends
Scipy, numpy and friendsScipy, numpy and friends
Scipy, numpy and friends
 
Get Your Hands Dirty with Intel® Distribution for Python*
Get Your Hands Dirty with Intel® Distribution for Python*Get Your Hands Dirty with Intel® Distribution for Python*
Get Your Hands Dirty with Intel® Distribution for Python*
 
MPI Raspberry pi 3 cluster
MPI Raspberry pi 3 clusterMPI Raspberry pi 3 cluster
MPI Raspberry pi 3 cluster
 
Numba
NumbaNumba
Numba
 
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
 
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
 
Buzzwords Numba Presentation
Buzzwords Numba PresentationBuzzwords Numba Presentation
Buzzwords Numba Presentation
 
An adaptive algorithm for detection of duplicate records
An adaptive algorithm for detection of duplicate recordsAn adaptive algorithm for detection of duplicate records
An adaptive algorithm for detection of duplicate records
 
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
TENSORFLOW: ARCHITECTURE AND USE CASE - NASA SPACE APPS CHALLENGE by Gema Par...
 
Sci computing using python
Sci computing using pythonSci computing using python
Sci computing using python
 
Essential NumPy
Essential NumPyEssential NumPy
Essential NumPy
 
Matplotlib Review 2021
Matplotlib Review 2021Matplotlib Review 2021
Matplotlib Review 2021
 
Matlab gnu octave
Matlab gnu octaveMatlab gnu octave
Matlab gnu octave
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
Lecture1
Lecture1Lecture1
Lecture1
 
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
 
Kaggle tokyo 2018
Kaggle tokyo 2018Kaggle tokyo 2018
Kaggle tokyo 2018
 
AI & Topology concluding remarks - "The open-source landscape for topology in...
AI & Topology concluding remarks - "The open-source landscape for topology in...AI & Topology concluding remarks - "The open-source landscape for topology in...
AI & Topology concluding remarks - "The open-source landscape for topology in...
 

Similar to Standardizing on a single N-dimensional array API for Python

Standardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft PresentationStandardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft Presentation
Travis Oliphant
 
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
Amazon Web Services
 
Modern C++
Modern C++Modern C++
Modern C++
Richard Thomson
 
An Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAn Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAdam Getchell
 
PythonStudyMaterialSTudyMaterial.pdf
PythonStudyMaterialSTudyMaterial.pdfPythonStudyMaterialSTudyMaterial.pdf
PythonStudyMaterialSTudyMaterial.pdf
data2businessinsight
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Herman Wu
 
Meetup tensorframes
Meetup tensorframesMeetup tensorframes
Meetup tensorframes
Paolo Platter
 
Intellectual technologies
Intellectual technologiesIntellectual technologies
Intellectual technologies
Polad Saruxanov
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
Databricks
 
Python
PythonPython
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
Simplilearn
 
Big data distributed processing: Spark introduction
Big data distributed processing: Spark introductionBig data distributed processing: Spark introduction
Big data distributed processing: Spark introduction
Hektor Jacynycz García
 
James Jesus Bermas on Crash Course on Python
James Jesus Bermas on Crash Course on PythonJames Jesus Bermas on Crash Course on Python
James Jesus Bermas on Crash Course on Python
CP-Union
 
TI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific LanguagesTI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific LanguagesEelco Visser
 
MXNet Workshop
MXNet WorkshopMXNet Workshop
MXNet Workshop
Amazon Web Services
 
An Intoduction to R
An Intoduction to RAn Intoduction to R
An Intoduction to R
Mahmoud Shiri Varamini
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009
Martin Odersky
 
Eclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science ProjectEclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science Project
Matthew Gerring
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to pythonActiveState
 

Similar to Standardizing on a single N-dimensional array API for Python (20)

Standardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft PresentationStandardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft Presentation
 
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
 
Modern C++
Modern C++Modern C++
Modern C++
 
An Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAn Overview Of Python With Functional Programming
An Overview Of Python With Functional Programming
 
PythonStudyMaterialSTudyMaterial.pdf
PythonStudyMaterialSTudyMaterial.pdfPythonStudyMaterialSTudyMaterial.pdf
PythonStudyMaterialSTudyMaterial.pdf
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 
Meetup tensorframes
Meetup tensorframesMeetup tensorframes
Meetup tensorframes
 
Intellectual technologies
Intellectual technologiesIntellectual technologies
Intellectual technologies
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
 
Python
PythonPython
Python
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
 
Big data distributed processing: Spark introduction
Big data distributed processing: Spark introductionBig data distributed processing: Spark introduction
Big data distributed processing: Spark introduction
 
James Jesus Bermas on Crash Course on Python
James Jesus Bermas on Crash Course on PythonJames Jesus Bermas on Crash Course on Python
James Jesus Bermas on Crash Course on Python
 
TI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific LanguagesTI1220 Lecture 14: Domain-Specific Languages
TI1220 Lecture 14: Domain-Specific Languages
 
MXNet Workshop
MXNet WorkshopMXNet Workshop
MXNet Workshop
 
An Intoduction to R
An Intoduction to RAn Intoduction to R
An Intoduction to R
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009
 
Eclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science ProjectEclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science Project
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to python
 
Lect1.pptx
Lect1.pptxLect1.pptx
Lect1.pptx
 

More from Ralf Gommers

Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdf
Ralf Gommers
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based program
Ralf Gommers
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
Ralf Gommers
 
Building SciPy kernels with Pythran
Building SciPy kernels with PythranBuilding SciPy kernels with Pythran
Building SciPy kernels with Pythran
Ralf Gommers
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond code
Ralf Gommers
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decade
Ralf Gommers
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts
Ralf Gommers
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_session
Ralf Gommers
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and Code
Ralf Gommers
 

More from Ralf Gommers (9)

Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdf
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based program
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
 
Building SciPy kernels with Pythran
Building SciPy kernels with PythranBuilding SciPy kernels with Pythran
Building SciPy kernels with Pythran
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond code
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decade
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_session
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and Code
 

Recently uploaded

Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
Google
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Game Development with Unity3D (Game Development lecture 3)
Game Development  with Unity3D (Game Development lecture 3)Game Development  with Unity3D (Game Development lecture 3)
Game Development with Unity3D (Game Development lecture 3)
abdulrafaychaudhry
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Nidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, TipsNidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, Tips
vrstrong314
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 

Recently uploaded (20)

Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Game Development with Unity3D (Game Development lecture 3)
Game Development  with Unity3D (Game Development lecture 3)Game Development  with Unity3D (Game Development lecture 3)
Game Development with Unity3D (Game Development lecture 3)
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Nidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, TipsNidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, Tips
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 

Standardizing on a single N-dimensional array API for Python

  • 1. Standardizing on a single N-dimensional array API for Python Ralf Gommers MXNet workshop, 14 Dec 2020
  • 3. Today’s Python data ecosystem Can we make it easy to build on top of multiple array data structures?
  • 4. State of compatibility today All libraries have common concepts and functionality. But, there are many small (and some large) incompatibilities. It’s very painful to translate code from one array library to another. Let’s look at some examples!
  • 5. Consortium for Python Data API Standards A new organization, with participation from maintainers of many array (or tensor) and dataframe libraries. Concrete goals for first year: 1. Define a standardization methodology and necessary tooling for it 2. Publish an RFC for an array API standard 3. Publish an RFC for a dataframe API standard 4. Finalize 2021.0x API standards after community review See data-apis.org and github.com/data-apis for more on the Consortium
  • 6. Goals for and scope of the array API Syntax and semantics of functions and objects in the API Casting rules, broadcasting, indexing, Python operator support Data interchange & device support Execution semantics (e.g. task scheduling, parallelism, lazy eval) Non-standard dtypes, masked arrays, I/O, subclassing array object, C API Error handling & behaviour for invalid inputs to functions and methods Goal 1: enable writing code & packages that support multiple array libraries Goal 2: make it easy for end users to switch between array libraries In Scope Out of Scope
  • 7. Array- and array-consuming libraries Using DLPack, will work for any two libraries if they support device the data resides on x = xp.from_dlpack(x_other) Data interchange between array libs Portable code in array-consuming libs def softmax(x): # grab standard namespace from # the passed-in array xp = get_array_api(x) x_exp = xp.exp(x) partition = xp.sum(x_exp, axis=1, keepdims=True) return x_exp / partition
  • 8. What does the full API surface look like? ● 1 array object with ○ 6 attributes: ndim, shape, size, dtype, device, T ○ dunder methods to support all Python operators ○ __array_api_version__, __array_namespace__, __dlpack__ ● 11 dtype literals: bool, (u)int8/16/32/64, float32/64 ● 1 device object ● 4 constants: inf, nan, pi, e ● ~115 functions: ○ Array creation & manipulation (18) ○ Element-wise math & logic (55) ○ Statistics (7) ○ Linear algebra (22) ○ Search, sort & set (7) ○ Utilities (4)
  • 9.
  • 10. Mutability & copies/views x = ones(4) # y may be a view on data of x y = x[:2] # modifies x if y is a view y += 1 Mutable operations and the concept of views are important for strided in-memory array implementations (NumPy, CuPy, PyTorch, MXNet) They are problematic for libraries based on immutable data structures or delayed evaluation (TensorFlow, JAX, Dask) Decisions in API standard: 1. Support inplace operators 2. Support item and slice assignment 3. Do not support out= keyword 4. Warn users that mixing mutating operations and views may result in implementation-specific behavior
  • 11. Dtype casting rules x = xp.arange(5) # will be integer y = xp.ones(5, dtype=xp.float32) # This may give float32, float64, or raise dtype = (x * y).dtype Casting rules are straightforward to align between libraries when the dtypes are of the same kind Mixed integer and floating-point casting is very inconsistent between libraries, and hard to change: Hence this will remain unspecified.
  • 12. Data-dependent output shape/dtype # Boolean indexing, and even slicing # in some cases, results in shapes # that depend on values in `x` x2 = x[:, x > 3] val = somefunc(x) x3 = x[:val] # Functions for which output shape # depends on value unique(x) nonzero(x) # NumPy does value-based casting x = np.ones(3, dtype=np.float32) x + 1 # float32 output x + 100000 # float64 output Data-dependent output shapes or dtypes are problematic, because of: ● static memory allocation (TensorFlow, JAX) ● graph-based scheduling (Dask) ● JIT compilation (Numba, PyTorch, JAX, Gluon) Value-based dtype results can be avoided. Value-based shapes can be important - the API standard will include but clearly mark such functionality.
  • 13. Where are we now, and what’s next? The array API standard is >90% complete and published for community review. Still work-in-progress are: ● Data interchange with DLPack ● Device support ● Data-dependent shape handling ● A handful of regular functions (linalg, result_type, meshgrid) Important next steps will be: 1. Complete the library-independent test suite 2. First (prototype) implementations in libraries 3. Get sign-off from maintainers of each array library 4. Define process to handle future & optional extensions
  • 14. Thank you Consortium: ● Website & introductory blog posts: data-apis.org ● Array API main repo: github.com/data-apis/array-api ● Latest version of the standard: data-apis.github.io/array-api/latest ● Members: github.com/data-apis/governance Find me at: ralf.gommers@gmail.com, rgommers, ralfgommers Try this at home - installing the latest version of all seven array libraries in one env to experiment: conda create -n many-libs python=3.7 conda activate many-libs conda install cudatoolkit=10.2 pip install numpy torch jax jaxlib tensorflow mxnet cupy-cuda102 dask toolz sparse