SlideShare a Scribd company logo
scikit-cuda
A Python Interface to GPU-Powered Libraries
Lev Givon
https://www.columbia.edu/~lev
Bionet Group
Department of Electrical Engineering
Columbia University
September 24, 2015
Lev Givon scikit-cuda
Motivation
NVIDIA CUDA contains a (growing) range of great GPU-based
libraries (CUBLAS, CUFFT, CURAND, CUSPARSE, CUSOLVER,
NPP) in addition to the core driver and runtime libraries.
Other libraries (both commercial and free) also available:
CULA - http://www.culatools.com/
MAGMA - http://icl.cs.utk.edu/magma/
PyCUDA (http://mathema.tician.de/software/pycuda/)
enables Python programmers to take advantage of CUDA, but
provides few interfaces to the above libraries.
Lev Givon scikit-cuda
Package Description
Exposes contents of various GPU-based libraries via both low-level
and high-level Python functions.
Project origin: using CULA’s GPU-based SVD routines to accelerate
time decoding algorithms implemented in Python. 1
Low-level functions
wrap C functions using ctypes
(https://docs.python.org/2/library/ctypes.html);
replicate C function signatures (with a few slight differences);
catch errors and map error codes to Python exceptions.
High-level functions
provide a similar interface to functions in scipy (http://scipy.org);
use PyCUDA’s GPUArray class to encapsulate matrices/vectors in
GPU memory.
1http://www.bionet.ee.columbia.edu/publications
Lev Givon scikit-cuda
Example
1 import pycuda.gpuarray as gpuarray
2 import pycuda.autoinit
3 import numpy as np
4 import skcuda.linalg as linalg
5
6 linalg.init()
7 a = np.random.randn(9, 6) + 1j*np.random.randn(9, 6)
8 a = np.asarray(a, np.complex64)
9 a_gpu = gpuarray.to_gpu(a)
10 u_gpu, s_gpu, vh_gpu = linalg.svd(a_gpu, ’S’, ’S’)
11 np.allclose(a, np.dot(u_gpu.get(), np.dot(np.diag(s_gpu.get()),
12 vh_gpu.get())), 1e-4)
Lev Givon scikit-cuda
Available High-Level Functions
FFT/IFFT
Numerical integration (trapezoidal 1D and 2D)
Linear algebra (dot, eig, qr, svd, transpose, etc.)
Randomized linear algebra (rdmd, rsvd)
Special element-wise functions (expi, sici, etc.)
Numpy-like helper routines not provided by PyCUDA (cumsum,
zeros, etc.)
Lev Givon scikit-cuda
Areas for Future Improvement
Improve maintainability and coverage of exposed functions in
supported libraries by automatically generating ctypes wrappers by
parsing C header files.
Alternately, use cffi (https://bitbucket.org/cffi/cffi) to
generate wrappers.
Provide full support for functions in MAGMA (possibly combined
with a Python-based MAGMA library installer) to enable folks
without access to CULA to run GPU-based LPAPACK.
Enable selection of library backend for functions that are provided by
more than one library (e.g., CULA, MAGMA).
Less naive automated block/grid selection mechanism for functions
that require it.
Lev Givon scikit-cuda
Resources
Code: https://github.com/lebedov/scikit-cuda
Docs: https://scikit-cuda.rtfd.org
Thanks to our long (and growing) list of contributors!
http://scikit-cuda.readthedocs.org/en/latest/authors.html
Lev Givon scikit-cuda

More Related Content

What's hot

Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010 Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010
Skills Matter
 
Cuda
CudaCuda
Nodester Architecture overview & roadmap
Nodester Architecture overview & roadmapNodester Architecture overview & roadmap
Nodester Architecture overview & roadmapcmatthieu
 
nodester Architecture overview & roadmap
nodester Architecture overview & roadmapnodester Architecture overview & roadmap
nodester Architecture overview & roadmapwearefractal
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDA
prithan
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
Rogue Wave Software
 
Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012 Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012
CESGA Centro de Supercomputación de Galicia
 
Parallel K means clustering using CUDA
Parallel K means clustering using CUDAParallel K means clustering using CUDA
Parallel K means clustering using CUDA
prithan
 
Nvidia in bioinformatics
Nvidia in bioinformaticsNvidia in bioinformatics
Nvidia in bioinformatics
Shanker Trivedi
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cuda
Ferdinand Jamitzky
 
Stack switching for fun and profit
Stack switching for fun and profitStack switching for fun and profit
Stack switching for fun and profit
Saúl Ibarra Corretgé
 
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
PhtRaveller
 
Código para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.pyCódigo para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.py
Chema Alonso
 
Home sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JSHome sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JS
Hyunghun Cho
 
How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1
Naoto MATSUMOTO
 
S1170143 2
S1170143 2S1170143 2
S1170143 2s1170143
 
IoT Chess
IoT ChessIoT Chess
IoT Chess
Lars Gregori
 
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Hakky St
 
Gstreamer Basics
Gstreamer BasicsGstreamer Basics
Gstreamer Basics
idrajeev
 

What's hot (20)

Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010 Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010
 
Cuda
CudaCuda
Cuda
 
Nodester Architecture overview & roadmap
Nodester Architecture overview & roadmapNodester Architecture overview & roadmap
Nodester Architecture overview & roadmap
 
nodester Architecture overview & roadmap
nodester Architecture overview & roadmapnodester Architecture overview & roadmap
nodester Architecture overview & roadmap
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDA
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
 
Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012 Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012
 
Parallel K means clustering using CUDA
Parallel K means clustering using CUDAParallel K means clustering using CUDA
Parallel K means clustering using CUDA
 
Nvidia in bioinformatics
Nvidia in bioinformaticsNvidia in bioinformatics
Nvidia in bioinformatics
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cuda
 
Stack switching for fun and profit
Stack switching for fun and profitStack switching for fun and profit
Stack switching for fun and profit
 
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
 
Código para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.pyCódigo para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.py
 
Home sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JSHome sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JS
 
How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1
 
S1170143 2
S1170143 2S1170143 2
S1170143 2
 
IoT Chess
IoT ChessIoT Chess
IoT Chess
 
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
 
Gstreamer Basics
Gstreamer BasicsGstreamer Basics
Gstreamer Basics
 
Illustrator_Sample
Illustrator_SampleIllustrator_Sample
Illustrator_Sample
 

Viewers also liked

Strategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in PythonStrategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in PythonOlivier Grisel
 
Python for High Performance and Scientific Computing
Python for High Performance and Scientific ComputingPython for High Performance and Scientific Computing
Python for High Performance and Scientific Computing
Andreas Schreiber
 
GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)
智啓 出川
 
Anaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみたAnaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみた
Yosuke Onoue
 
More modern gpu
More modern gpuMore modern gpu
More modern gpu
Preferred Networks
 

Viewers also liked (6)

Strategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in PythonStrategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in Python
 
Python for High Performance and Scientific Computing
Python for High Performance and Scientific ComputingPython for High Performance and Scientific Computing
Python for High Performance and Scientific Computing
 
GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)
 
Anaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみたAnaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみた
 
PyCUDAの紹介
PyCUDAの紹介PyCUDAの紹介
PyCUDAの紹介
 
More modern gpu
More modern gpuMore modern gpu
More modern gpu
 

Similar to scikit-cuda

Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
PeterAndreasEntschev
 
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATORQGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
NVIDIA Japan
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
Kazuaki Ishizaki
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPU
Shohei Hido
 
Overview of Chainer and Its Features
Overview of Chainer and Its FeaturesOverview of Chainer and Its Features
Overview of Chainer and Its Features
Seiya Tokui
 
NVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読み
NVIDIA Japan
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
DLow6
 
Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018
Preferred Networks
 
LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1
Hajime Tazaki
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilers
AnastasiaStulova
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
lcplcp1
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
Kohei KaiGai
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
Kohei KaiGai
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
Kohei KaiGai
 
Gpu Cuda
Gpu CudaGpu Cuda
Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)
DongHyeon Kim
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPass
Peter Vandenabeele
 
Transparent GPU Exploitation for Java
Transparent GPU Exploitation for JavaTransparent GPU Exploitation for Java
Transparent GPU Exploitation for Java
Kazuaki Ishizaki
 
Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?
inside-BigData.com
 

Similar to scikit-cuda (20)

Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
 
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATORQGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPU
 
Arvindsujeeth scaladays12
Arvindsujeeth scaladays12Arvindsujeeth scaladays12
Arvindsujeeth scaladays12
 
Overview of Chainer and Its Features
Overview of Chainer and Its FeaturesOverview of Chainer and Its Features
Overview of Chainer and Its Features
 
NVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読み
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
 
Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018
 
LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilers
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
 
Gpu Cuda
Gpu CudaGpu Cuda
Gpu Cuda
 
Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPass
 
Transparent GPU Exploitation for Java
Transparent GPU Exploitation for JavaTransparent GPU Exploitation for Java
Transparent GPU Exploitation for Java
 
Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?
 

Recently uploaded

Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
yokeleetan1
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
Fundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptxFundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptx
manasideore6
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
Aditya Rajan Patra
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
drwaing
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
obonagu
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Self-Control of Emotions by Slidesgo.pptx
Self-Control of Emotions by Slidesgo.pptxSelf-Control of Emotions by Slidesgo.pptx
Self-Control of Emotions by Slidesgo.pptx
iemerc2024
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
camseq
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
insn4465
 
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
ssuser7dcef0
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
MIGUELANGEL966976
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
symbo111
 
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Soumen Santra
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
aqil azizi
 
DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
gestioneergodomus
 

Recently uploaded (20)

Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
Fundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptxFundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptx
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Self-Control of Emotions by Slidesgo.pptx
Self-Control of Emotions by Slidesgo.pptxSelf-Control of Emotions by Slidesgo.pptx
Self-Control of Emotions by Slidesgo.pptx
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
 
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
 
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
 
DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
 

scikit-cuda

  • 1. scikit-cuda A Python Interface to GPU-Powered Libraries Lev Givon https://www.columbia.edu/~lev Bionet Group Department of Electrical Engineering Columbia University September 24, 2015 Lev Givon scikit-cuda
  • 2. Motivation NVIDIA CUDA contains a (growing) range of great GPU-based libraries (CUBLAS, CUFFT, CURAND, CUSPARSE, CUSOLVER, NPP) in addition to the core driver and runtime libraries. Other libraries (both commercial and free) also available: CULA - http://www.culatools.com/ MAGMA - http://icl.cs.utk.edu/magma/ PyCUDA (http://mathema.tician.de/software/pycuda/) enables Python programmers to take advantage of CUDA, but provides few interfaces to the above libraries. Lev Givon scikit-cuda
  • 3. Package Description Exposes contents of various GPU-based libraries via both low-level and high-level Python functions. Project origin: using CULA’s GPU-based SVD routines to accelerate time decoding algorithms implemented in Python. 1 Low-level functions wrap C functions using ctypes (https://docs.python.org/2/library/ctypes.html); replicate C function signatures (with a few slight differences); catch errors and map error codes to Python exceptions. High-level functions provide a similar interface to functions in scipy (http://scipy.org); use PyCUDA’s GPUArray class to encapsulate matrices/vectors in GPU memory. 1http://www.bionet.ee.columbia.edu/publications Lev Givon scikit-cuda
  • 4. Example 1 import pycuda.gpuarray as gpuarray 2 import pycuda.autoinit 3 import numpy as np 4 import skcuda.linalg as linalg 5 6 linalg.init() 7 a = np.random.randn(9, 6) + 1j*np.random.randn(9, 6) 8 a = np.asarray(a, np.complex64) 9 a_gpu = gpuarray.to_gpu(a) 10 u_gpu, s_gpu, vh_gpu = linalg.svd(a_gpu, ’S’, ’S’) 11 np.allclose(a, np.dot(u_gpu.get(), np.dot(np.diag(s_gpu.get()), 12 vh_gpu.get())), 1e-4) Lev Givon scikit-cuda
  • 5. Available High-Level Functions FFT/IFFT Numerical integration (trapezoidal 1D and 2D) Linear algebra (dot, eig, qr, svd, transpose, etc.) Randomized linear algebra (rdmd, rsvd) Special element-wise functions (expi, sici, etc.) Numpy-like helper routines not provided by PyCUDA (cumsum, zeros, etc.) Lev Givon scikit-cuda
  • 6. Areas for Future Improvement Improve maintainability and coverage of exposed functions in supported libraries by automatically generating ctypes wrappers by parsing C header files. Alternately, use cffi (https://bitbucket.org/cffi/cffi) to generate wrappers. Provide full support for functions in MAGMA (possibly combined with a Python-based MAGMA library installer) to enable folks without access to CULA to run GPU-based LPAPACK. Enable selection of library backend for functions that are provided by more than one library (e.g., CULA, MAGMA). Less naive automated block/grid selection mechanism for functions that require it. Lev Givon scikit-cuda
  • 7. Resources Code: https://github.com/lebedov/scikit-cuda Docs: https://scikit-cuda.rtfd.org Thanks to our long (and growing) list of contributors! http://scikit-cuda.readthedocs.org/en/latest/authors.html Lev Givon scikit-cuda