SlideShare a Scribd company logo
1 of 6
Download to read offline
2010/2/25
1
GPGPU
Ked
Result
Computation of Normal Vector
Image resolution: 640 x 480
CPU: 625 clock time
GPU: 125 clock time
Result
Computation of Normal Vector
Image resolution: 1280 x 1024
CPU: 2500 clock time
GPU: 172 clock time
OK, What is GPU
 A graphics accelerator incorporates custom
microchips which contain special mathematical
operations commonly used in graphics rendering.
GPGPU
• General purpose computing on GPU
 GPGPU
 GPGP
 GP2
(boring RD -.-||)‫‏‬
hi, I am R2-D2
Why faster
2010/2/25
2
Why faster Why faster
 CPU GPU
General purpose Specialized hardware
Serial execution Parallel execution
Minimum latency Maximum throughput
Development tools:
Focus on GPGPU
 CUDA:
 Compute Unified Device Architecture
 Developed by NVIDIA
 C like language
 Full developing environment
 Compiler
 Debugger
 Math libraries
Development tools:
Focus on GPGPU
 Advantage:
 Shared memory amongst threads
 16k
 Faster downloads and readbacks to and from GPU
 Full support for integer and bitwise operations
Development tools:
Shader programming
 ARB low-level assembly language
 OpenGL shading language
 Cg programming language
 DirectX high-level shader language
Development tools:
Shader programming
2010/2/25
3
Development tools:
Shader programming
 Developing tools of GLSL:
Pipeline of GPU processing
Shader programming
Vertex shader
Fragment shader
Geometry shader
RenderMan shading language
 Developed by Pixar has uncompromising image
quality as its fundamental goal
 Light shader
 Displacement shader
 Surface shader
 Volume shader
 Imager shader
Vertex shader Fragment shader
2010/2/25
4
Streaming of fragment shader
 Stream processing is a computer programming paradigm,
related to SIMD, that allows some applications to more
easily exploit a limited form of parallel processing. Such
applications can use multiple computational units, such
as the floating point units on a GPU, without explicitly
managing allocation, synchronization, or communication
among those units.
Branch of fragment shader
Conception of GPGPU
 Textures => Computing arrays
 Vertex Coordinates => Computational range
 Fragment programs => Computation
 Read from framebuffer => Get result
Case study:
Computation of normal vector
 Normal(V0) =
[ normal(F401) +
normal(F102) +
normal(F203) +
normal(F304) ] / 4
 Normal(F102) =
cross(v1v0, v2v0)‫‏‬
Prepare:
Choose graphic card
Prepare:
Test the graphic card
need
2010/2/25
5
Use GLSL in BCB environment:
Call GLee library Other choice: GLew
Install shader:
Run-time building
Texture:
Computing array
Vertex coordinate:
Computational range
Fragment program:
Computation
Read from framebuffer:
Get result
FameBuffer Object is a better choice
2010/2/25
6
Trivia:
Ghost in numerical computing
Review the result
Image resolution: 640 x 480
CPU: 625 clock time
GPU: 125 clock time
Image resolution: 1280 x 1024
CPU: 2500 clock time
GPU: 172 clock time
Reference
 GPU Gems 2
 OpenGL Shading Language
 OpenGL Programming Guide
 Dominik Göddeke
-- GPGPU::Basic Math Tutorial
(website)‫‏‬
 GPGPU: SIGGRAPH 2004 course
 Batch, batch, batch:
what does it really means
Thx.

More Related Content

What's hot

Achieving Improved Performance In Multi-threaded Programming With GPU Computing
Achieving Improved Performance In Multi-threaded Programming With GPU ComputingAchieving Improved Performance In Multi-threaded Programming With GPU Computing
Achieving Improved Performance In Multi-threaded Programming With GPU ComputingMesbah Uddin Khan
 
Accelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous PlatformsAccelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous PlatformsIJMER
 
High Performance Pedestrian Detection On TEGRA X1
High Performance Pedestrian Detection On TEGRA X1High Performance Pedestrian Detection On TEGRA X1
High Performance Pedestrian Detection On TEGRA X1NVIDIA
 
Inference accelerators
Inference acceleratorsInference accelerators
Inference acceleratorsDarshanG13
 
Intel optimized tensorflow, distributed deep learning
Intel optimized tensorflow, distributed deep learningIntel optimized tensorflow, distributed deep learning
Intel optimized tensorflow, distributed deep learninggeetachauhan
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And EffectsThomas Goddard
 
Optimal Virtual Machine Placement across Multiple Cloud Providers
Optimal Virtual Machine Placement across Multiple Cloud ProvidersOptimal Virtual Machine Placement across Multiple Cloud Providers
Optimal Virtual Machine Placement across Multiple Cloud ProvidersSivadon Chaisiri
 
PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)Gernot Ziegler
 
Artificial Neural Networks for Storm Surge Prediction in North Carolina
Artificial Neural Networks for Storm Surge Prediction in North CarolinaArtificial Neural Networks for Storm Surge Prediction in North Carolina
Artificial Neural Networks for Storm Surge Prediction in North CarolinaAnton Bezuglov
 
Metrics 2.0 @ Monitorama PDX 2014
Metrics 2.0 @ Monitorama PDX 2014Metrics 2.0 @ Monitorama PDX 2014
Metrics 2.0 @ Monitorama PDX 2014Dieter Plaetinck
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practicesLior Sidi
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUsfcassier
 
"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARM
"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARM"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARM
"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARMEdge AI and Vision Alliance
 
Distributed deep learning optimizations
Distributed deep learning optimizationsDistributed deep learning optimizations
Distributed deep learning optimizationsgeetachauhan
 
Example uses of gpu compute models
Example uses of gpu compute modelsExample uses of gpu compute models
Example uses of gpu compute modelsPedram Mazloom
 
IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...
IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...
IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...VIMALKUMAR KUMARESAN
 
BPF Hardware Offload Deep Dive
BPF Hardware Offload Deep DiveBPF Hardware Offload Deep Dive
BPF Hardware Offload Deep DiveNetronome
 

What's hot (20)

Haskell Accelerate
Haskell  AccelerateHaskell  Accelerate
Haskell Accelerate
 
cnsm2011_slide
cnsm2011_slidecnsm2011_slide
cnsm2011_slide
 
Achieving Improved Performance In Multi-threaded Programming With GPU Computing
Achieving Improved Performance In Multi-threaded Programming With GPU ComputingAchieving Improved Performance In Multi-threaded Programming With GPU Computing
Achieving Improved Performance In Multi-threaded Programming With GPU Computing
 
Accelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous PlatformsAccelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous Platforms
 
High Performance Pedestrian Detection On TEGRA X1
High Performance Pedestrian Detection On TEGRA X1High Performance Pedestrian Detection On TEGRA X1
High Performance Pedestrian Detection On TEGRA X1
 
Inference accelerators
Inference acceleratorsInference accelerators
Inference accelerators
 
Intel optimized tensorflow, distributed deep learning
Intel optimized tensorflow, distributed deep learningIntel optimized tensorflow, distributed deep learning
Intel optimized tensorflow, distributed deep learning
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And Effects
 
Optimal Virtual Machine Placement across Multiple Cloud Providers
Optimal Virtual Machine Placement across Multiple Cloud ProvidersOptimal Virtual Machine Placement across Multiple Cloud Providers
Optimal Virtual Machine Placement across Multiple Cloud Providers
 
PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)
 
Artificial Neural Networks for Storm Surge Prediction in North Carolina
Artificial Neural Networks for Storm Surge Prediction in North CarolinaArtificial Neural Networks for Storm Surge Prediction in North Carolina
Artificial Neural Networks for Storm Surge Prediction in North Carolina
 
Metrics 2.0 @ Monitorama PDX 2014
Metrics 2.0 @ Monitorama PDX 2014Metrics 2.0 @ Monitorama PDX 2014
Metrics 2.0 @ Monitorama PDX 2014
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practices
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUs
 
"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARM
"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARM"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARM
"Using SGEMM and FFTs to Accelerate Deep Learning," a Presentation from ARM
 
An35225228
An35225228An35225228
An35225228
 
Distributed deep learning optimizations
Distributed deep learning optimizationsDistributed deep learning optimizations
Distributed deep learning optimizations
 
Example uses of gpu compute models
Example uses of gpu compute modelsExample uses of gpu compute models
Example uses of gpu compute models
 
IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...
IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...
IITB Poster. Benchmarking GPU-based Acceleration of Spark in ML Workload usin...
 
BPF Hardware Offload Deep Dive
BPF Hardware Offload Deep DiveBPF Hardware Offload Deep Dive
BPF Hardware Offload Deep Dive
 

Viewers also liked

Open CL For Haifa Linux Club
Open CL For Haifa Linux ClubOpen CL For Haifa Linux Club
Open CL For Haifa Linux ClubOfer Rosenberg
 
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...npinto
 
CSTalks - GPGPU - 19 Jan
CSTalks  -  GPGPU - 19 JanCSTalks  -  GPGPU - 19 Jan
CSTalks - GPGPU - 19 Jancstalks
 
General Programming on the GPU - Confoo
General Programming on the GPU - ConfooGeneral Programming on the GPU - Confoo
General Programming on the GPU - ConfooSirKetchup
 
Newbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeNewbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeOfer Rosenberg
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...AMD Developer Central
 
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...Storti Mario
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLinaro
 
GPU Technology Conference 2014 Keynote
GPU Technology Conference 2014 KeynoteGPU Technology Conference 2014 Keynote
GPU Technology Conference 2014 KeynoteNVIDIA
 
Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Rob Gillen
 
E-Learning: Introduction to GPGPU
E-Learning: Introduction to GPGPUE-Learning: Introduction to GPGPU
E-Learning: Introduction to GPGPUNur Ahmadi
 
Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Angela Mendoza M.
 
"The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres...
"The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres..."The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres...
"The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres...Edge AI and Vision Alliance
 
GPUDirect RDMA and Green Multi-GPU Architectures
GPUDirect RDMA and Green Multi-GPU ArchitecturesGPUDirect RDMA and Green Multi-GPU Architectures
GPUDirect RDMA and Green Multi-GPU Architecturesinside-BigData.com
 
Introduction to gpu architecture
Introduction to gpu architectureIntroduction to gpu architecture
Introduction to gpu architectureCHIHTE LU
 
CS 354 GPU Architecture
CS 354 GPU ArchitectureCS 354 GPU Architecture
CS 354 GPU ArchitectureMark Kilgard
 
Introduction to OpenCL, 2010
Introduction to OpenCL, 2010Introduction to OpenCL, 2010
Introduction to OpenCL, 2010Tomasz Bednarz
 

Viewers also liked (20)

Open CL For Haifa Linux Club
Open CL For Haifa Linux ClubOpen CL For Haifa Linux Club
Open CL For Haifa Linux Club
 
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
 
Gpgpu intro
Gpgpu introGpgpu intro
Gpgpu intro
 
CSTalks - GPGPU - 19 Jan
CSTalks  -  GPGPU - 19 JanCSTalks  -  GPGPU - 19 Jan
CSTalks - GPGPU - 19 Jan
 
General Programming on the GPU - Confoo
General Programming on the GPU - ConfooGeneral Programming on the GPU - Confoo
General Programming on the GPU - Confoo
 
Newbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeNewbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universe
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
 
Cliff sugerman
Cliff sugermanCliff sugerman
Cliff sugerman
 
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience Report
 
GPU Technology Conference 2014 Keynote
GPU Technology Conference 2014 KeynoteGPU Technology Conference 2014 Keynote
GPU Technology Conference 2014 Keynote
 
Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)
 
E-Learning: Introduction to GPGPU
E-Learning: Introduction to GPGPUE-Learning: Introduction to GPGPU
E-Learning: Introduction to GPGPU
 
Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08
 
"The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres...
"The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres..."The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres...
"The OpenCV Open Source Computer Vision Library: Latest Developments," a Pres...
 
GPUDirect RDMA and Green Multi-GPU Architectures
GPUDirect RDMA and Green Multi-GPU ArchitecturesGPUDirect RDMA and Green Multi-GPU Architectures
GPUDirect RDMA and Green Multi-GPU Architectures
 
Introduction to gpu architecture
Introduction to gpu architectureIntroduction to gpu architecture
Introduction to gpu architecture
 
GPU Programming with Java
GPU Programming with JavaGPU Programming with Java
GPU Programming with Java
 
CS 354 GPU Architecture
CS 354 GPU ArchitectureCS 354 GPU Architecture
CS 354 GPU Architecture
 
Introduction to OpenCL, 2010
Introduction to OpenCL, 2010Introduction to OpenCL, 2010
Introduction to OpenCL, 2010
 

Similar to Gpgpu

GS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauGS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauAMD Developer Central
 
Introduction to parallel computing using CUDA
Introduction to parallel computing using CUDAIntroduction to parallel computing using CUDA
Introduction to parallel computing using CUDAMartin Peniak
 
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONSA SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONScseij
 
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Intel® Software
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Intro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaIntro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaRob Gillen
 
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationBladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationCliff Kinard
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Griffon Topic2 Presentation (Tia)
Griffon Topic2 Presentation (Tia)Griffon Topic2 Presentation (Tia)
Griffon Topic2 Presentation (Tia)Nat Weerawan
 
JIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdf
JIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdfJIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdf
JIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdfSamiraKids
 

Similar to Gpgpu (20)

GS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauGS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill Bilodeau
 
Introduction to parallel computing using CUDA
Introduction to parallel computing using CUDAIntroduction to parallel computing using CUDA
Introduction to parallel computing using CUDA
 
Deep Learning Edge
Deep Learning Edge Deep Learning Edge
Deep Learning Edge
 
NVIDIA CUDA
NVIDIA CUDANVIDIA CUDA
NVIDIA CUDA
 
Cuda intro
Cuda introCuda intro
Cuda intro
 
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONSA SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
A SURVEY ON GPU SYSTEM CONSIDERING ITS PERFORMANCE ON DIFFERENT APPLICATIONS
 
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Intro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaIntro to GPGPU Programming with Cuda
Intro to GPGPU Programming with Cuda
 
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationBladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Griffon Topic2 Presentation (Tia)
Griffon Topic2 Presentation (Tia)Griffon Topic2 Presentation (Tia)
Griffon Topic2 Presentation (Tia)
 
FIR filter on GPU
FIR filter on GPUFIR filter on GPU
FIR filter on GPU
 
Mod 2 hardware_graphics.pdf
Mod 2 hardware_graphics.pdfMod 2 hardware_graphics.pdf
Mod 2 hardware_graphics.pdf
 
GPU Computing
GPU ComputingGPU Computing
GPU Computing
 
Qt Programming on TI Processors
Qt Programming on TI ProcessorsQt Programming on TI Processors
Qt Programming on TI Processors
 
JIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdf
JIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdfJIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdf
JIT Spraying Never Dies - Bypass CFG By Leveraging WARP Shader JIT Spraying.pdf
 
Introduction to Blackfin BF532 DSP
Introduction to Blackfin BF532 DSPIntroduction to Blackfin BF532 DSP
Introduction to Blackfin BF532 DSP
 

More from Su Yan-Jen

Captain america painting competition -- 13
Captain america painting competition -- 13Captain america painting competition -- 13
Captain america painting competition -- 13Su Yan-Jen
 
Captain america painting competition -- 12
Captain america painting competition -- 12Captain america painting competition -- 12
Captain america painting competition -- 12Su Yan-Jen
 
Captain america painting competition -- 11
Captain america painting competition -- 11Captain america painting competition -- 11
Captain america painting competition -- 11Su Yan-Jen
 
Captain america painting competition 10
Captain america painting competition 10Captain america painting competition 10
Captain america painting competition 10Su Yan-Jen
 
Captain america painting competition 9
Captain america painting competition 9Captain america painting competition 9
Captain america painting competition 9Su Yan-Jen
 
Captain america painting competition 8
 Captain america painting competition 8 Captain america painting competition 8
Captain america painting competition 8Su Yan-Jen
 
Captain america painting competition 7
 Captain america painting competition 7 Captain america painting competition 7
Captain america painting competition 7Su Yan-Jen
 
Captain america painting competition 6
 Captain america painting competition 6 Captain america painting competition 6
Captain america painting competition 6Su Yan-Jen
 
Captain america painting competition 5
Captain america painting competition 5Captain america painting competition 5
Captain america painting competition 5Su Yan-Jen
 
Captain america painting competition 4
Captain america  painting competition 4Captain america  painting competition 4
Captain america painting competition 4Su Yan-Jen
 
Captain america painting competition 3
Captain america painting competition 3Captain america painting competition 3
Captain america painting competition 3Su Yan-Jen
 
Captain america painting competition 2
Captain america painting competition 2Captain america painting competition 2
Captain america painting competition 2Su Yan-Jen
 
Captain America painting competition
Captain America painting competitionCaptain America painting competition
Captain America painting competitionSu Yan-Jen
 
PM2.5 visualization
PM2.5 visualizationPM2.5 visualization
PM2.5 visualizationSu Yan-Jen
 
Stereo matching
Stereo matchingStereo matching
Stereo matchingSu Yan-Jen
 
Face recognition
Face recognitionFace recognition
Face recognitionSu Yan-Jen
 
Data mining of commercial surveillance
Data mining of commercial surveillanceData mining of commercial surveillance
Data mining of commercial surveillanceSu Yan-Jen
 

More from Su Yan-Jen (20)

Captain america painting competition -- 13
Captain america painting competition -- 13Captain america painting competition -- 13
Captain america painting competition -- 13
 
Captain america painting competition -- 12
Captain america painting competition -- 12Captain america painting competition -- 12
Captain america painting competition -- 12
 
Captain america painting competition -- 11
Captain america painting competition -- 11Captain america painting competition -- 11
Captain america painting competition -- 11
 
Captain america painting competition 10
Captain america painting competition 10Captain america painting competition 10
Captain america painting competition 10
 
Captain america painting competition 9
Captain america painting competition 9Captain america painting competition 9
Captain america painting competition 9
 
Captain america painting competition 8
 Captain america painting competition 8 Captain america painting competition 8
Captain america painting competition 8
 
Captain america painting competition 7
 Captain america painting competition 7 Captain america painting competition 7
Captain america painting competition 7
 
Captain america painting competition 6
 Captain america painting competition 6 Captain america painting competition 6
Captain america painting competition 6
 
Captain america painting competition 5
Captain america painting competition 5Captain america painting competition 5
Captain america painting competition 5
 
Captain america painting competition 4
Captain america  painting competition 4Captain america  painting competition 4
Captain america painting competition 4
 
Captain america painting competition 3
Captain america painting competition 3Captain america painting competition 3
Captain america painting competition 3
 
Captain america painting competition 2
Captain america painting competition 2Captain america painting competition 2
Captain america painting competition 2
 
Captain America painting competition
Captain America painting competitionCaptain America painting competition
Captain America painting competition
 
PM2.5 visualization
PM2.5 visualizationPM2.5 visualization
PM2.5 visualization
 
Transformer 3
Transformer 3Transformer 3
Transformer 3
 
Transformer 2
Transformer 2Transformer 2
Transformer 2
 
Transformer
TransformerTransformer
Transformer
 
Stereo matching
Stereo matchingStereo matching
Stereo matching
 
Face recognition
Face recognitionFace recognition
Face recognition
 
Data mining of commercial surveillance
Data mining of commercial surveillanceData mining of commercial surveillance
Data mining of commercial surveillance
 

Recently uploaded

JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuidePixlogix Infotech
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data SciencePaolo Missier
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxFIDO Alliance
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctBrainSell Technologies
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Choreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringChoreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringWSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsLeah Henrickson
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformWSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform EngineeringMarcus Vechiato
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governanceWSO2
 
Quantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingQuantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingWSO2
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfdanishmna97
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...caitlingebhard1
 

Recently uploaded (20)

JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Choreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringChoreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software Engineering
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
Quantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingQuantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation Computing
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 

Gpgpu

  • 1. 2010/2/25 1 GPGPU Ked Result Computation of Normal Vector Image resolution: 640 x 480 CPU: 625 clock time GPU: 125 clock time Result Computation of Normal Vector Image resolution: 1280 x 1024 CPU: 2500 clock time GPU: 172 clock time OK, What is GPU  A graphics accelerator incorporates custom microchips which contain special mathematical operations commonly used in graphics rendering. GPGPU • General purpose computing on GPU  GPGPU  GPGP  GP2 (boring RD -.-||)‫‏‬ hi, I am R2-D2 Why faster
  • 2. 2010/2/25 2 Why faster Why faster  CPU GPU General purpose Specialized hardware Serial execution Parallel execution Minimum latency Maximum throughput Development tools: Focus on GPGPU  CUDA:  Compute Unified Device Architecture  Developed by NVIDIA  C like language  Full developing environment  Compiler  Debugger  Math libraries Development tools: Focus on GPGPU  Advantage:  Shared memory amongst threads  16k  Faster downloads and readbacks to and from GPU  Full support for integer and bitwise operations Development tools: Shader programming  ARB low-level assembly language  OpenGL shading language  Cg programming language  DirectX high-level shader language Development tools: Shader programming
  • 3. 2010/2/25 3 Development tools: Shader programming  Developing tools of GLSL: Pipeline of GPU processing Shader programming Vertex shader Fragment shader Geometry shader RenderMan shading language  Developed by Pixar has uncompromising image quality as its fundamental goal  Light shader  Displacement shader  Surface shader  Volume shader  Imager shader Vertex shader Fragment shader
  • 4. 2010/2/25 4 Streaming of fragment shader  Stream processing is a computer programming paradigm, related to SIMD, that allows some applications to more easily exploit a limited form of parallel processing. Such applications can use multiple computational units, such as the floating point units on a GPU, without explicitly managing allocation, synchronization, or communication among those units. Branch of fragment shader Conception of GPGPU  Textures => Computing arrays  Vertex Coordinates => Computational range  Fragment programs => Computation  Read from framebuffer => Get result Case study: Computation of normal vector  Normal(V0) = [ normal(F401) + normal(F102) + normal(F203) + normal(F304) ] / 4  Normal(F102) = cross(v1v0, v2v0)‫‏‬ Prepare: Choose graphic card Prepare: Test the graphic card need
  • 5. 2010/2/25 5 Use GLSL in BCB environment: Call GLee library Other choice: GLew Install shader: Run-time building Texture: Computing array Vertex coordinate: Computational range Fragment program: Computation Read from framebuffer: Get result FameBuffer Object is a better choice
  • 6. 2010/2/25 6 Trivia: Ghost in numerical computing Review the result Image resolution: 640 x 480 CPU: 625 clock time GPU: 125 clock time Image resolution: 1280 x 1024 CPU: 2500 clock time GPU: 172 clock time Reference  GPU Gems 2  OpenGL Shading Language  OpenGL Programming Guide  Dominik Göddeke -- GPGPU::Basic Math Tutorial (website)‫‏‬  GPGPU: SIGGRAPH 2004 course  Batch, batch, batch: what does it really means Thx.