SlideShare a Scribd company logo
1 of 71
Download to read offline
INTRODUCTION TO
GPU COMPUTING
Ilya Kuzovkin
13 May 2014, Tartu
PART I
“TEAPOT”
SIMPLE OPENGL PROGRAM
Idea of computing on
GPU emerged
because GPUs
became very good at
parallel computations.
SIMPLE OPENGL PROGRAM
Idea of computing on
GPU emerged
because GPUs
became very good at
parallel computations.
!
Let us start from
observing an example
of parallelism in a
simple OpenGL
application.
SIMPLE OPENGL PROGRAM
You will need CodeBlocksWindows, Linux or XCodeMac to run
this example.
• Install CodeBlocks bundled with MinGW compiler from
http://www.codeblocks.org/downloads/26
!
• Download codebase from https://github.com/kuz/
Introduction-to-GPU-Computing
!
• Open the project from the code/Cube!
!
• Compile & run it
SHADER PROGRAM
Program which is executed on GPU.
Has to be written using shading language.
In OpenGL this language is GLSL, which is based on C.
http://www.opengl.org/wiki/Shader
SHADER PROGRAM
Program which is executed on GPU.
Has to be written using shading language.
In OpenGL this language is GLSL, which is based on C.
OpenGL has 5 main shader stages:
• Vertex Shader
• Tessellation Control
• Geometry Shader
• Fragment Shader
• Compute Shader (since 4.3)
http://www.opengl.org/wiki/Shader
SHADER PROGRAM
Program which is executed on GPU.
Has to be written using shading language.
In OpenGL this language is GLSL, which is based on C.
OpenGL has 5 main shader stages:
• Vertex Shader
• Tessellation Control
• Geometry Shader
• Fragment Shader
• Compute Shader (since 4.3)
http://www.opengl.org/wiki/Shader
LIGHTING
Is it a cube or not?
We will find out as soon as
we add lighting to the scene.
LIGHTING
Is it a cube or not?
We will find out as soon as
we add lighting to the scene.
https://github.com/konstantint/ComputerGraphics2013/blob/master/Lectures/07%20-%20Color%20and%20Lighting/slides07_colorandlighting.pdf
LIGHTING
Is it a cube or not?
We will find out as soon as
we add lighting to the scene.
https://github.com/konstantint/ComputerGraphics2013/blob/master/Lectures/07%20-%20Color%20and%20Lighting/slides07_colorandlighting.pdf
Exercise: code that equation into
fragment shader of the Cube program
LIGHTING
• Run the program with lighting enabled and look at
FPS values
COMPARE FPS
• Run the program with lighting enabled and look at
FPS values
!
• In cube.cpp idle() function uncomment dummy
code which simulates approximately same amount of
computations as Phong lighting model requires.
COMPARE FPS
• Run the program with lighting enabled and look at
FPS values
!
• In cube.cpp idle() function uncomment dummy
code which simulates approximately same amount of
computations as Phong lighting model requires.
!
• Note that these computations are performed on CPU
COMPARE FPS
• Run the program with lighting enabled and look at
FPS values
!
• In cube.cpp idle() function uncomment dummy
code which simulates approximately same amount of
computations as Phong lighting model requires.
!
• Note that these computations are performed on CPU
!
• Observe how FPS has changed
COMPARE FPS
• Run the program with lighting enabled and look at
FPS values
!
• In cube.cpp idle() function uncomment dummy
code which simulates approximately same amount of
computations as Phong lighting model requires.
!
• Note that these computations are performed on CPU
!
• Observe how FPS has changed
Parallel computations are fast on GPU.
Lets use it to compute something useful.
COMPARE FPS
PART II
“OLD SCHOOL”
OPENGL PIPELINE + GLSL
http://www.opengl.org/wiki/Framebuffer
Take the input data from
the CPU memory and put
it as an image into the
GPU memory
http://www.opengl.org/wiki/Framebuffer
In the fragment shader
perform a computation on
each of the pixels of that
image
Take the input data from
the CPU memory and put
it as an image into the
GPU memory
OPENGL PIPELINE + GLSL
http://www.opengl.org/wiki/Framebuffer
In the fragment shader
perform a computation on
each of the pixels of that
image
Store the resulting image
to the Render Buffer inside
the GPU memory
Take the input data from
the CPU memory and put
it as an image into the
GPU memory
OPENGL PIPELINE + GLSL
http://www.opengl.org/wiki/Framebuffer
Read output from the GPU
memory back to the CPU
memory
Store the resulting image
to the Render Buffer inside
the GPU memory
In the fragment shader
perform a computation on
each of the pixels of that
image
Take the input data from
the CPU memory and put
it as an image into the
GPU memory
OPENGL PIPELINE + GLSL
• Create texture where will store the input data
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
• Create texture where will store the input data
!
!
!
!
!
• Create FrameBuffer Object (FBO) to “render” to
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
• Run OpenGL pipeline
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
• Run OpenGL pipeline
• Render GL_QUADS of same size as the texture matrix
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
• Run OpenGL pipeline
• Render GL_QUADS of same size as the texture matrix
• Use fragment shader to perform per-fragment
computations using data from the texture
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
• Run OpenGL pipeline
• Render GL_QUADS of same size as the texture matrix
• Use fragment shader to perform per-fragment
computations using data from the texture
• OpenGL will store result in the texture given to the
Render Buffer (within Framebuffer Object)
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
• Run OpenGL pipeline
• Render GL_QUADS of same size as the texture matrix
• Use fragment shader to perform per-fragment
computations using data from the texture
• OpenGL will store result in the texture given to the
Render Buffer (within Framebuffer Object)
!
• Read the data from the Render Buffer
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
• Run OpenGL pipeline
• Render GL_QUADS of same size as the texture matrix
• Use fragment shader to perform per-fragment
computations using data from the texture
• OpenGL will store result in the texture given to the
Render Buffer (within Framebuffer Object)
!
• Read the data from the Render Buffer
!
!
!
!
• Can we use that to properly debug GLSL?
http://www.opengl.org/wiki/Framebuffer
OPENGL PIPELINE + GLSL
Run the project from the code/FBO
DEMO
PART III
“MODERN TIMES”
COMPUTE SHADER
• Since OpenGL 4.3
• Used to compute things not related to rendering directly
COMPUTE SHADER
• Since OpenGL 4.3
• Used to compute things not related to rendering directly
COMPUTE SHADER
http://web.engr.oregonstate.edu/~mjb/cs557/Handouts/compute.shader.1pp.pdf
• Since OpenGL 4.3
• Used to compute things not related to rendering directly
Will not talk
about it
http://wiki.tiker.net/CudaVsOpenCL
Supported only by
nVidia hardware
Supported by nVidia,
AMD, Intel, Qualcomm
https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl
http://wiki.tiker.net/CudaVsOpenCL
Supported only by
nVidia hardware
Supported by nVidia,
AMD, Intel, Qualcomm
https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl
Implementations only
by nVidia
OpenCL
http://wiki.tiker.net/CudaVsOpenCL
Supported only by
nVidia hardware
Supported by nVidia,
AMD, Intel, Qualcomm
https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl
Implementations only
by nVidia
OpenCL
http://wiki.tiker.net/CudaVsOpenCL
~same performance levels
Supported only by
nVidia hardware
Supported by nVidia,
AMD, Intel, Qualcomm
https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl
Implementations only
by nVidia
OpenCL
http://wiki.tiker.net/CudaVsOpenCL
~same performance levels
Developer-friendly OpenCL
PART III
CHAPTER 1
KERNEL
KERNEL
KERNEL
WRITE AND READ DATA ON GPU
WRITE AND READ DATA ON GPU
… run computations here …
WRITE AND READ DATA ON GPU
… run computations here …
THE COMPUTATION
THE COMPUTATION
THE COMPUTATION
THE COMPUTATION
THE COMPUTATION
DEMO
Open, study and run the project from the code/OpenCL
PART III
CHAPTER 2
CUDA PROGRAMMING MODEL
• CPU is called “host”
• Move data CPU <-> GPU memory cudaMemcopy
• Allocate memory cudaMalloc	
  
• Launch kernels on GPU
• GPU is called “device”
CUDA PROGRAMMING MODEL
• CPU is called “host”
• Move data CPU <-> GPU memory cudaMemcopy
• Allocate memory cudaMalloc	
  
• Launch kernels on GPU
• GPU is called “device”
1. CPU allocates memory on GPU
2. CPU copies data to GPU memory
3. CPU launches kernels on GPU (process the data)
4. CPU copies results back to CPU memory
Typical CUDA program
CUDA PROGRAMMING MODEL
• CPU is called “host”
• Move data CPU <-> GPU memory cudaMemcopy
• Allocate memory cudaMalloc	
  
• Launch kernels on GPU
• GPU is called “device”
1. CPU allocates memory on GPU
2. CPU copies data to GPU memory
3. CPU launches kernels on GPU (process the data)
4. CPU copies results back to CPU memory
Typical CUDA program
Very similar to the
logic of OpenCL
EXAMPLE
Introduction to Parallel Programming @ Udacity
https://www.udacity.com/course/cs344
EXAMPLE
Introduction to Parallel Programming @ Udacity
https://www.udacity.com/course/cs344
EXAMPLE
Introduction to Parallel Programming @ Udacity
https://www.udacity.com/course/cs344
EXAMPLE
Introduction to Parallel Programming @ Udacity
https://www.udacity.com/course/cs344
EXAMPLE
Introduction to Parallel Programming @ Udacity
https://www.udacity.com/course/cs344
EXAMPLE
Introduction to Parallel Programming @ Udacity
https://www.udacity.com/course/cs344
EXAMPLE
Introduction to Parallel Programming @ Udacity
https://www.udacity.com/course/cs344
PART IV
“DISCUSSION”
LINKS
• Code repository for this presentation
• https://github.com/kuz/Introduction-to-GPU-Computing
• Feel free to leave feature requests there, ask questions, etc.
!
• OpenCL tutorials & presentations
• http://streamcomputing.eu/knowledge/for-developers/tutorials/
• http://opencl.codeplex.com/wikipage?title=OpenCL%20Tutorials%20-%201&referringTitle=OpenCL%20Tutorials
• http://www.cc.gatech.edu/~vetter/keeneland/tutorial-2011-04-14/06-intro_to_opencl.pdf
!
• Introduction to Parallel Computing @ www.udacity.com
• https://www.udacity.com/course/cs344
!
• Slides about Compute Shader
• http://web.engr.oregonstate.edu/~mjb/cs557/Handouts/compute.shader.1pp.pdf
!
• Introduction to Computer Graphics with codebases
• https://github.com/konstantint/ComputerGraphics2013
!
• GLSL Computing (aka “Old school”)
• http://www.computer-graphics.se/gpu-computing/lab1.html

More Related Content

What's hot

CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentationVishal Singh
 
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...AMD Developer Central
 
Gpu and The Brick Wall
Gpu and The Brick WallGpu and The Brick Wall
Gpu and The Brick Wallugur candan
 
Easy and High Performance GPU Programming for Java Programmers
Easy and High Performance GPU Programming for Java ProgrammersEasy and High Performance GPU Programming for Java Programmers
Easy and High Performance GPU Programming for Java ProgrammersKazuaki Ishizaki
 
GPU power consumption and performance trends
GPU power consumption and performance trendsGPU power consumption and performance trends
GPU power consumption and performance trendsAlessio Villardita
 
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerPL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerAMD Developer Central
 
19564926 graphics-processing-unit
19564926 graphics-processing-unit19564926 graphics-processing-unit
19564926 graphics-processing-unitDayakar Siddula
 
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengHC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengAMD Developer Central
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practicesLior Sidi
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An IntroductionDhan V Sagar
 
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the CoupledCpu-GPU ArchitectureRevisiting Co-Processing for Hash Joins on the CoupledCpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecturemohamedragabslideshare
 
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONSA SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONScseij
 
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...AMD Developer Central
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.J On The Beach
 
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin CoumansGS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin CoumansAMD Developer Central
 
Using GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with JavaUsing GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with JavaTim Ellison
 

What's hot (20)

CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
 
Gpu and The Brick Wall
Gpu and The Brick WallGpu and The Brick Wall
Gpu and The Brick Wall
 
Easy and High Performance GPU Programming for Java Programmers
Easy and High Performance GPU Programming for Java ProgrammersEasy and High Performance GPU Programming for Java Programmers
Easy and High Performance GPU Programming for Java Programmers
 
GPU power consumption and performance trends
GPU power consumption and performance trendsGPU power consumption and performance trends
GPU power consumption and performance trends
 
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerPL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
 
19564926 graphics-processing-unit
19564926 graphics-processing-unit19564926 graphics-processing-unit
19564926 graphics-processing-unit
 
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengHC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practices
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
 
Graphics processing unit
Graphics processing unitGraphics processing unit
Graphics processing unit
 
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the CoupledCpu-GPU ArchitectureRevisiting Co-Processing for Hash Joins on the CoupledCpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecture
 
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONSA SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
 
GPU - Basic Working
GPU - Basic WorkingGPU - Basic Working
GPU - Basic Working
 
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
 
Hadoop + GPU
Hadoop + GPUHadoop + GPU
Hadoop + GPU
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.
 
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin CoumansGS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
 
Using GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with JavaUsing GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with Java
 
ARM and Machine Learning
ARM and Machine LearningARM and Machine Learning
ARM and Machine Learning
 

Viewers also liked

Why Graphics Is Fast, and What It Can Teach Us About Parallel Programming
Why Graphics Is Fast, and What It Can Teach Us About Parallel ProgrammingWhy Graphics Is Fast, and What It Can Teach Us About Parallel Programming
Why Graphics Is Fast, and What It Can Teach Us About Parallel ProgrammingJonathan Ragan-Kelley
 
02 direct3 d_pipeline
02 direct3 d_pipeline02 direct3 d_pipeline
02 direct3 d_pipelineGirish Ghate
 
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...Article overview: Unsupervised Learning of Visual Structure Using Predictive ...
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...Ilya Kuzovkin
 
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄Cheer Chain Enterprise Co., Ltd.
 
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...Ilya Kuzovkin
 
Learning RBM(Restricted Boltzmann Machine in Practice)
Learning RBM(Restricted Boltzmann Machine in Practice)Learning RBM(Restricted Boltzmann Machine in Practice)
Learning RBM(Restricted Boltzmann Machine in Practice)Mad Scientists
 
Prinsip dasar dan peran koperasai
Prinsip dasar dan peran koperasaiPrinsip dasar dan peran koperasai
Prinsip dasar dan peran koperasaiNenengYuyuRohana
 
Presentationv1 Part1
Presentationv1 Part1Presentationv1 Part1
Presentationv1 Part1Abhishek Mago
 
Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...
Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...
Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...Stone Soup Creative
 
Scaling Mobile Development
Scaling Mobile DevelopmentScaling Mobile Development
Scaling Mobile DevelopmentLookout
 

Viewers also liked (19)

Why Graphics Is Fast, and What It Can Teach Us About Parallel Programming
Why Graphics Is Fast, and What It Can Teach Us About Parallel ProgrammingWhy Graphics Is Fast, and What It Can Teach Us About Parallel Programming
Why Graphics Is Fast, and What It Can Teach Us About Parallel Programming
 
02 direct3 d_pipeline
02 direct3 d_pipeline02 direct3 d_pipeline
02 direct3 d_pipeline
 
presentation
presentationpresentation
presentation
 
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...Article overview: Unsupervised Learning of Visual Structure Using Predictive ...
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...
 
GPU Computing
GPU ComputingGPU Computing
GPU Computing
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
 
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...
 
Learning RBM(Restricted Boltzmann Machine in Practice)
Learning RBM(Restricted Boltzmann Machine in Practice)Learning RBM(Restricted Boltzmann Machine in Practice)
Learning RBM(Restricted Boltzmann Machine in Practice)
 
Conférence bpi identité numérique - 24 fév 2012
Conférence bpi   identité numérique - 24 fév 2012Conférence bpi   identité numérique - 24 fév 2012
Conférence bpi identité numérique - 24 fév 2012
 
Prinsip dasar dan peran koperasai
Prinsip dasar dan peran koperasaiPrinsip dasar dan peran koperasai
Prinsip dasar dan peran koperasai
 
Dalton Sample Sheets
Dalton Sample SheetsDalton Sample Sheets
Dalton Sample Sheets
 
Zone IDA Proc
Zone IDA ProcZone IDA Proc
Zone IDA Proc
 
Presentationv1 Part1
Presentationv1 Part1Presentationv1 Part1
Presentationv1 Part1
 
Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...
Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...
Branding Bootcamp: Developing an Authentic Brand That Connects With Your Cust...
 
Scaling Mobile Development
Scaling Mobile DevelopmentScaling Mobile Development
Scaling Mobile Development
 
Neider
NeiderNeider
Neider
 
13 Ways to Spook Your Audience
13 Ways to Spook Your Audience13 Ways to Spook Your Audience
13 Ways to Spook Your Audience
 
Keynote &amp; on stage interview (carbo)
Keynote &amp; on stage interview (carbo)Keynote &amp; on stage interview (carbo)
Keynote &amp; on stage interview (carbo)
 

Similar to Introduction to GPU Computing

TestUpload
TestUploadTestUpload
TestUploadZarksaDS
 
oSC-2023-Cross-Build.pdf
oSC-2023-Cross-Build.pdfoSC-2023-Cross-Build.pdf
oSC-2023-Cross-Build.pdfAdrianSchrter1
 
Java on the GPU: Where are we now?
Java on the GPU: Where are we now?Java on the GPU: Where are we now?
Java on the GPU: Where are we now?Dmitry Alexandrov
 
Tech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbenchTech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbenchAdaCore
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...AMD Developer Central
 
Making your app soar without a container manifest
Making your app soar without a container manifestMaking your app soar without a container manifest
Making your app soar without a container manifestLibbySchulze
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
 
From Zero to Hero - All you need to do serious deep learning stuff in R
From Zero to Hero - All you need to do serious deep learning stuff in R From Zero to Hero - All you need to do serious deep learning stuff in R
From Zero to Hero - All you need to do serious deep learning stuff in R Kai Lichtenberg
 
Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018Holden Karau
 
Evaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerEvaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerGeorge Markomanolis
 
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...Edge AI and Vision Alliance
 
Oh the compilers you'll build
Oh the compilers you'll buildOh the compilers you'll build
Oh the compilers you'll buildMark Stoodley
 
Developing games and graphic visualizations in Pascal
Developing games and graphic visualizations in PascalDeveloping games and graphic visualizations in Pascal
Developing games and graphic visualizations in PascalMichalis Kamburelis
 
workshop_8_c__.pdf
workshop_8_c__.pdfworkshop_8_c__.pdf
workshop_8_c__.pdfAtulAvhad2
 
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides:  Let's build macOS CLI Utilities using SwiftMobileConf 2021 Slides:  Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides: Let's build macOS CLI Utilities using SwiftDiego Freniche Brito
 
Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...
Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...
Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...Embarcados
 

Similar to Introduction to GPU Computing (20)

TestUpload
TestUploadTestUpload
TestUpload
 
Hands on OpenCL
Hands on OpenCLHands on OpenCL
Hands on OpenCL
 
From Web to Mobile with Stage 3D
From Web to Mobile with Stage 3DFrom Web to Mobile with Stage 3D
From Web to Mobile with Stage 3D
 
oSC-2023-Cross-Build.pdf
oSC-2023-Cross-Build.pdfoSC-2023-Cross-Build.pdf
oSC-2023-Cross-Build.pdf
 
Java on the GPU: Where are we now?
Java on the GPU: Where are we now?Java on the GPU: Where are we now?
Java on the GPU: Where are we now?
 
Tech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbenchTech Day 2015: A Gentle Introduction to GPS and GNATbench
Tech Day 2015: A Gentle Introduction to GPS and GNATbench
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
 
Making your app soar without a container manifest
Making your app soar without a container manifestMaking your app soar without a container manifest
Making your app soar without a container manifest
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
From Zero to Hero - All you need to do serious deep learning stuff in R
From Zero to Hero - All you need to do serious deep learning stuff in R From Zero to Hero - All you need to do serious deep learning stuff in R
From Zero to Hero - All you need to do serious deep learning stuff in R
 
Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018Intro - End to end ML with Kubeflow @ SignalConf 2018
Intro - End to end ML with Kubeflow @ SignalConf 2018
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Evaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerEvaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI Supercomputer
 
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
 
Oh the compilers you'll build
Oh the compilers you'll buildOh the compilers you'll build
Oh the compilers you'll build
 
Cheap HPC
Cheap HPCCheap HPC
Cheap HPC
 
Developing games and graphic visualizations in Pascal
Developing games and graphic visualizations in PascalDeveloping games and graphic visualizations in Pascal
Developing games and graphic visualizations in Pascal
 
workshop_8_c__.pdf
workshop_8_c__.pdfworkshop_8_c__.pdf
workshop_8_c__.pdf
 
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides:  Let's build macOS CLI Utilities using SwiftMobileConf 2021 Slides:  Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
 
Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...
Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...
Webinar: Começando seus trabalhos com Machine Learning utilizando ferramentas...
 

More from Ilya Kuzovkin

Understanding Information Processing in Human Brain by Interpreting Machine L...
Understanding Information Processing in Human Brain by Interpreting Machine L...Understanding Information Processing in Human Brain by Interpreting Machine L...
Understanding Information Processing in Human Brain by Interpreting Machine L...Ilya Kuzovkin
 
The Brain and the Modern AI: Drastic Differences and Curious Similarities
The Brain and the Modern AI: Drastic Differences and Curious SimilaritiesThe Brain and the Modern AI: Drastic Differences and Curious Similarities
The Brain and the Modern AI: Drastic Differences and Curious SimilaritiesIlya Kuzovkin
 
The First Day at the Deep learning Zoo
The First Day at the Deep learning ZooThe First Day at the Deep learning Zoo
The First Day at the Deep learning ZooIlya Kuzovkin
 
Intuitive Intro to Gödel's Incompleteness Theorem
Intuitive Intro to Gödel's Incompleteness TheoremIntuitive Intro to Gödel's Incompleteness Theorem
Intuitive Intro to Gödel's Incompleteness TheoremIlya Kuzovkin
 
Introduction to Machine Learning @ Mooncascade ML Camp
Introduction to Machine Learning @ Mooncascade ML CampIntroduction to Machine Learning @ Mooncascade ML Camp
Introduction to Machine Learning @ Mooncascade ML CampIlya Kuzovkin
 
Mastering the game of Go with deep neural networks and tree search (article o...
Mastering the game of Go with deep neural networks and tree search (article o...Mastering the game of Go with deep neural networks and tree search (article o...
Mastering the game of Go with deep neural networks and tree search (article o...Ilya Kuzovkin
 
Paper overview: "Deep Residual Learning for Image Recognition"
Paper overview: "Deep Residual Learning for Image Recognition"Paper overview: "Deep Residual Learning for Image Recognition"
Paper overview: "Deep Residual Learning for Image Recognition"Ilya Kuzovkin
 
Deep Learning: Theory, History, State of the Art & Practical Tools
Deep Learning: Theory, History, State of the Art & Practical ToolsDeep Learning: Theory, History, State of the Art & Practical Tools
Deep Learning: Theory, History, State of the Art & Practical ToolsIlya Kuzovkin
 
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?Ilya Kuzovkin
 
Neural Turing Machines
Neural Turing MachinesNeural Turing Machines
Neural Turing MachinesIlya Kuzovkin
 
Neuroimaging: Intracortical, fMRI, EEG
Neuroimaging: Intracortical, fMRI, EEGNeuroimaging: Intracortical, fMRI, EEG
Neuroimaging: Intracortical, fMRI, EEGIlya Kuzovkin
 
Article Overview "Reach and grasp by people with tetraplegia using a neurally...
Article Overview "Reach and grasp by people with tetraplegia using a neurally...Article Overview "Reach and grasp by people with tetraplegia using a neurally...
Article Overview "Reach and grasp by people with tetraplegia using a neurally...Ilya Kuzovkin
 
Soft Introduction to Brain-Computer Interfaces and Machine Learning
Soft Introduction to Brain-Computer Interfaces and Machine LearningSoft Introduction to Brain-Computer Interfaces and Machine Learning
Soft Introduction to Brain-Computer Interfaces and Machine LearningIlya Kuzovkin
 
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer Interfaces
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer InterfacesIlya Kuzovkin - Adaptive Interactive Learning for Brain-Computer Interfaces
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer InterfacesIlya Kuzovkin
 

More from Ilya Kuzovkin (14)

Understanding Information Processing in Human Brain by Interpreting Machine L...
Understanding Information Processing in Human Brain by Interpreting Machine L...Understanding Information Processing in Human Brain by Interpreting Machine L...
Understanding Information Processing in Human Brain by Interpreting Machine L...
 
The Brain and the Modern AI: Drastic Differences and Curious Similarities
The Brain and the Modern AI: Drastic Differences and Curious SimilaritiesThe Brain and the Modern AI: Drastic Differences and Curious Similarities
The Brain and the Modern AI: Drastic Differences and Curious Similarities
 
The First Day at the Deep learning Zoo
The First Day at the Deep learning ZooThe First Day at the Deep learning Zoo
The First Day at the Deep learning Zoo
 
Intuitive Intro to Gödel's Incompleteness Theorem
Intuitive Intro to Gödel's Incompleteness TheoremIntuitive Intro to Gödel's Incompleteness Theorem
Intuitive Intro to Gödel's Incompleteness Theorem
 
Introduction to Machine Learning @ Mooncascade ML Camp
Introduction to Machine Learning @ Mooncascade ML CampIntroduction to Machine Learning @ Mooncascade ML Camp
Introduction to Machine Learning @ Mooncascade ML Camp
 
Mastering the game of Go with deep neural networks and tree search (article o...
Mastering the game of Go with deep neural networks and tree search (article o...Mastering the game of Go with deep neural networks and tree search (article o...
Mastering the game of Go with deep neural networks and tree search (article o...
 
Paper overview: "Deep Residual Learning for Image Recognition"
Paper overview: "Deep Residual Learning for Image Recognition"Paper overview: "Deep Residual Learning for Image Recognition"
Paper overview: "Deep Residual Learning for Image Recognition"
 
Deep Learning: Theory, History, State of the Art & Practical Tools
Deep Learning: Theory, History, State of the Art & Practical ToolsDeep Learning: Theory, History, State of the Art & Practical Tools
Deep Learning: Theory, History, State of the Art & Practical Tools
 
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?
 
Neural Turing Machines
Neural Turing MachinesNeural Turing Machines
Neural Turing Machines
 
Neuroimaging: Intracortical, fMRI, EEG
Neuroimaging: Intracortical, fMRI, EEGNeuroimaging: Intracortical, fMRI, EEG
Neuroimaging: Intracortical, fMRI, EEG
 
Article Overview "Reach and grasp by people with tetraplegia using a neurally...
Article Overview "Reach and grasp by people with tetraplegia using a neurally...Article Overview "Reach and grasp by people with tetraplegia using a neurally...
Article Overview "Reach and grasp by people with tetraplegia using a neurally...
 
Soft Introduction to Brain-Computer Interfaces and Machine Learning
Soft Introduction to Brain-Computer Interfaces and Machine LearningSoft Introduction to Brain-Computer Interfaces and Machine Learning
Soft Introduction to Brain-Computer Interfaces and Machine Learning
 
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer Interfaces
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer InterfacesIlya Kuzovkin - Adaptive Interactive Learning for Brain-Computer Interfaces
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer Interfaces
 

Recently uploaded

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Recently uploaded (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

Introduction to GPU Computing

  • 1. INTRODUCTION TO GPU COMPUTING Ilya Kuzovkin 13 May 2014, Tartu
  • 3. SIMPLE OPENGL PROGRAM Idea of computing on GPU emerged because GPUs became very good at parallel computations.
  • 4. SIMPLE OPENGL PROGRAM Idea of computing on GPU emerged because GPUs became very good at parallel computations. ! Let us start from observing an example of parallelism in a simple OpenGL application.
  • 5. SIMPLE OPENGL PROGRAM You will need CodeBlocksWindows, Linux or XCodeMac to run this example. • Install CodeBlocks bundled with MinGW compiler from http://www.codeblocks.org/downloads/26 ! • Download codebase from https://github.com/kuz/ Introduction-to-GPU-Computing ! • Open the project from the code/Cube! ! • Compile & run it
  • 6. SHADER PROGRAM Program which is executed on GPU. Has to be written using shading language. In OpenGL this language is GLSL, which is based on C. http://www.opengl.org/wiki/Shader
  • 7. SHADER PROGRAM Program which is executed on GPU. Has to be written using shading language. In OpenGL this language is GLSL, which is based on C. OpenGL has 5 main shader stages: • Vertex Shader • Tessellation Control • Geometry Shader • Fragment Shader • Compute Shader (since 4.3) http://www.opengl.org/wiki/Shader
  • 8. SHADER PROGRAM Program which is executed on GPU. Has to be written using shading language. In OpenGL this language is GLSL, which is based on C. OpenGL has 5 main shader stages: • Vertex Shader • Tessellation Control • Geometry Shader • Fragment Shader • Compute Shader (since 4.3) http://www.opengl.org/wiki/Shader
  • 9. LIGHTING Is it a cube or not? We will find out as soon as we add lighting to the scene.
  • 10. LIGHTING Is it a cube or not? We will find out as soon as we add lighting to the scene. https://github.com/konstantint/ComputerGraphics2013/blob/master/Lectures/07%20-%20Color%20and%20Lighting/slides07_colorandlighting.pdf
  • 11. LIGHTING Is it a cube or not? We will find out as soon as we add lighting to the scene. https://github.com/konstantint/ComputerGraphics2013/blob/master/Lectures/07%20-%20Color%20and%20Lighting/slides07_colorandlighting.pdf Exercise: code that equation into fragment shader of the Cube program
  • 13. • Run the program with lighting enabled and look at FPS values COMPARE FPS
  • 14. • Run the program with lighting enabled and look at FPS values ! • In cube.cpp idle() function uncomment dummy code which simulates approximately same amount of computations as Phong lighting model requires. COMPARE FPS
  • 15. • Run the program with lighting enabled and look at FPS values ! • In cube.cpp idle() function uncomment dummy code which simulates approximately same amount of computations as Phong lighting model requires. ! • Note that these computations are performed on CPU COMPARE FPS
  • 16. • Run the program with lighting enabled and look at FPS values ! • In cube.cpp idle() function uncomment dummy code which simulates approximately same amount of computations as Phong lighting model requires. ! • Note that these computations are performed on CPU ! • Observe how FPS has changed COMPARE FPS
  • 17. • Run the program with lighting enabled and look at FPS values ! • In cube.cpp idle() function uncomment dummy code which simulates approximately same amount of computations as Phong lighting model requires. ! • Note that these computations are performed on CPU ! • Observe how FPS has changed Parallel computations are fast on GPU. Lets use it to compute something useful. COMPARE FPS
  • 19. OPENGL PIPELINE + GLSL http://www.opengl.org/wiki/Framebuffer Take the input data from the CPU memory and put it as an image into the GPU memory
  • 20. http://www.opengl.org/wiki/Framebuffer In the fragment shader perform a computation on each of the pixels of that image Take the input data from the CPU memory and put it as an image into the GPU memory OPENGL PIPELINE + GLSL
  • 21. http://www.opengl.org/wiki/Framebuffer In the fragment shader perform a computation on each of the pixels of that image Store the resulting image to the Render Buffer inside the GPU memory Take the input data from the CPU memory and put it as an image into the GPU memory OPENGL PIPELINE + GLSL
  • 22. http://www.opengl.org/wiki/Framebuffer Read output from the GPU memory back to the CPU memory Store the resulting image to the Render Buffer inside the GPU memory In the fragment shader perform a computation on each of the pixels of that image Take the input data from the CPU memory and put it as an image into the GPU memory OPENGL PIPELINE + GLSL
  • 23. • Create texture where will store the input data http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 24. • Create texture where will store the input data ! ! ! ! ! • Create FrameBuffer Object (FBO) to “render” to http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 25. • Run OpenGL pipeline http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 26. • Run OpenGL pipeline • Render GL_QUADS of same size as the texture matrix http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 27. • Run OpenGL pipeline • Render GL_QUADS of same size as the texture matrix • Use fragment shader to perform per-fragment computations using data from the texture http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 28. • Run OpenGL pipeline • Render GL_QUADS of same size as the texture matrix • Use fragment shader to perform per-fragment computations using data from the texture • OpenGL will store result in the texture given to the Render Buffer (within Framebuffer Object) http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 29. • Run OpenGL pipeline • Render GL_QUADS of same size as the texture matrix • Use fragment shader to perform per-fragment computations using data from the texture • OpenGL will store result in the texture given to the Render Buffer (within Framebuffer Object) ! • Read the data from the Render Buffer http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 30. • Run OpenGL pipeline • Render GL_QUADS of same size as the texture matrix • Use fragment shader to perform per-fragment computations using data from the texture • OpenGL will store result in the texture given to the Render Buffer (within Framebuffer Object) ! • Read the data from the Render Buffer ! ! ! ! • Can we use that to properly debug GLSL? http://www.opengl.org/wiki/Framebuffer OPENGL PIPELINE + GLSL
  • 31. Run the project from the code/FBO DEMO
  • 33. COMPUTE SHADER • Since OpenGL 4.3 • Used to compute things not related to rendering directly
  • 34. COMPUTE SHADER • Since OpenGL 4.3 • Used to compute things not related to rendering directly
  • 35. COMPUTE SHADER http://web.engr.oregonstate.edu/~mjb/cs557/Handouts/compute.shader.1pp.pdf • Since OpenGL 4.3 • Used to compute things not related to rendering directly Will not talk about it
  • 37. Supported only by nVidia hardware Supported by nVidia, AMD, Intel, Qualcomm https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl http://wiki.tiker.net/CudaVsOpenCL
  • 38. Supported only by nVidia hardware Supported by nVidia, AMD, Intel, Qualcomm https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl Implementations only by nVidia OpenCL http://wiki.tiker.net/CudaVsOpenCL
  • 39. Supported only by nVidia hardware Supported by nVidia, AMD, Intel, Qualcomm https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl Implementations only by nVidia OpenCL http://wiki.tiker.net/CudaVsOpenCL ~same performance levels
  • 40. Supported only by nVidia hardware Supported by nVidia, AMD, Intel, Qualcomm https://developer.nvidia.com/cuda-gpus http://www.khronos.org/conformance/adopters/conformant-products#opencl Implementations only by nVidia OpenCL http://wiki.tiker.net/CudaVsOpenCL ~same performance levels Developer-friendly OpenCL
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 50. WRITE AND READ DATA ON GPU
  • 51. WRITE AND READ DATA ON GPU … run computations here …
  • 52. WRITE AND READ DATA ON GPU … run computations here …
  • 58. DEMO Open, study and run the project from the code/OpenCL
  • 60. CUDA PROGRAMMING MODEL • CPU is called “host” • Move data CPU <-> GPU memory cudaMemcopy • Allocate memory cudaMalloc   • Launch kernels on GPU • GPU is called “device”
  • 61. CUDA PROGRAMMING MODEL • CPU is called “host” • Move data CPU <-> GPU memory cudaMemcopy • Allocate memory cudaMalloc   • Launch kernels on GPU • GPU is called “device” 1. CPU allocates memory on GPU 2. CPU copies data to GPU memory 3. CPU launches kernels on GPU (process the data) 4. CPU copies results back to CPU memory Typical CUDA program
  • 62. CUDA PROGRAMMING MODEL • CPU is called “host” • Move data CPU <-> GPU memory cudaMemcopy • Allocate memory cudaMalloc   • Launch kernels on GPU • GPU is called “device” 1. CPU allocates memory on GPU 2. CPU copies data to GPU memory 3. CPU launches kernels on GPU (process the data) 4. CPU copies results back to CPU memory Typical CUDA program Very similar to the logic of OpenCL
  • 63. EXAMPLE Introduction to Parallel Programming @ Udacity https://www.udacity.com/course/cs344
  • 64. EXAMPLE Introduction to Parallel Programming @ Udacity https://www.udacity.com/course/cs344
  • 65. EXAMPLE Introduction to Parallel Programming @ Udacity https://www.udacity.com/course/cs344
  • 66. EXAMPLE Introduction to Parallel Programming @ Udacity https://www.udacity.com/course/cs344
  • 67. EXAMPLE Introduction to Parallel Programming @ Udacity https://www.udacity.com/course/cs344
  • 68. EXAMPLE Introduction to Parallel Programming @ Udacity https://www.udacity.com/course/cs344
  • 69. EXAMPLE Introduction to Parallel Programming @ Udacity https://www.udacity.com/course/cs344
  • 71. LINKS • Code repository for this presentation • https://github.com/kuz/Introduction-to-GPU-Computing • Feel free to leave feature requests there, ask questions, etc. ! • OpenCL tutorials & presentations • http://streamcomputing.eu/knowledge/for-developers/tutorials/ • http://opencl.codeplex.com/wikipage?title=OpenCL%20Tutorials%20-%201&referringTitle=OpenCL%20Tutorials • http://www.cc.gatech.edu/~vetter/keeneland/tutorial-2011-04-14/06-intro_to_opencl.pdf ! • Introduction to Parallel Computing @ www.udacity.com • https://www.udacity.com/course/cs344 ! • Slides about Compute Shader • http://web.engr.oregonstate.edu/~mjb/cs557/Handouts/compute.shader.1pp.pdf ! • Introduction to Computer Graphics with codebases • https://github.com/konstantint/ComputerGraphics2013 ! • GLSL Computing (aka “Old school”) • http://www.computer-graphics.se/gpu-computing/lab1.html