SlideShare a Scribd company logo
GPU’s and CUDA
CPU Architecture
                                              -  good for serial programs
                                              -  do many different things well
                                              -  many transistors for purposes
                                              other than ALUs (eg. flow control
                                              and caching)
                                              -  memory access is slow (1GB/s)
                                              -  switching threads is slow




Image from Alex Moore, “Introduction to Programming in CUDA”,
http://astro.pas.rochester.edu/~aquillen/gpuworkshop.html
GPU Architecture
                           -  many processors perform
Cache Control ALUs
                           similar operations on a large
                           data set in parallel (single-
                           instruction multiple-data
                           parallelism)
                           -  recent GPUs have around 30
                           multiprocessors, each containing
                           8 stream processors
                           -  GPUs devote most (80%) of
                           their transistors to ALUs
                           -  fast memory (80GB/s)
Thread Hierarchy
   -  a block of threads runs on a
   single stream processor

   -  a grid of blocks makes up
   the entire set

   -  each thread in a block can
   access the same shared
   memory

   -  many more threads than
   processors

   Image from Johan Seland, “CUDA Programming”,
   http://heim.ifi.uio.no/~knutm/geilo2008/seland.pdf
Memory Hierarchy




Image from Johan Seland, “CUDA Programming”, http://heim.ifi.uio.no/~knutm/geilo2008/seland.pdf
CUDA
-  a set of C extensions for running programs on a GPU
-  Windows, Linux, Mac…. Nvidia Cards only
-  single instruction, multiple data
-  gives you direct access to the memory architecture
-  don’t need to know graphics APIs
-  built-in libraries: cudaMalloc, cudaFree, cudaMemcpy, cublas, cufft
Elementwise Matrix Addition
CPU Program
void add_matrix(float* a, float* b, float* c, int N) { 
    int index; 
    for ( int i = 0; i < N; ++i ) {
        for ( int j = 0; j < N; ++j ) { 
            index = i + j*N; 
            c[index] = a[index] + b[index]; 
        }
    }
}

int main() { 
    add_matrix( a, b, c, N ); 
}
Elemtwise Matix Addition
CUDA program
__global__ add_matrix(float* a, float* b, float* c, int N ) { 
    int i = blockIdx.x * blockDim.x + threadIdx.x; 
    int j = blockIdx.y * blockDim.y + threadIdx.y; 
    int index = i + j*N; 

     if ( i < N && j < N ) 
         c[index] = a[index] + b[index]; 
}

int main() { 
    dim3 dimBlock( blocksize, blocksize ); 
    dim3 dimGrid( N/dimBlock.x, N/dimBlock.y ); 
    add_matrix<<<dimGrid, dimBlock>>>( a, b, c, N );
}
Results
                  (for tasks relevant to the MWA Real Time System)

                                  Imaging


                                Gridding *


                     UnpeelTileResponse


                        PeelTileResponse
  Module




                       ReRotateVisibilities


                    MeasureTileResponse


                MeasureIonosphericOffset


           RotateAndAccumulateVisibilities

                                              0   10   20        30           40          50          60      70
                                                                        Speedup




Image from Kevin Dale, “A Graphics Hardware-Accelerated Real-Time Processing Pipeline for Radio Astronomy”,
Presented at AstroGPU, Nov 2007.

More Related Content

What's hot

Resource Management with Systemd and cgroups
Resource Management with Systemd and cgroupsResource Management with Systemd and cgroups
Resource Management with Systemd and cgroups
Tsung-en Hsiao
 
GPU Performance Prediction Using High-level Application Models
GPU Performance Prediction Using High-level Application ModelsGPU Performance Prediction Using High-level Application Models
GPU Performance Prediction Using High-level Application Models
Filipo Mór
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
Vishal Singh
 
Gpu perf-presentation
Gpu perf-presentationGpu perf-presentation
Gpu perf-presentation
GiannisTsagatakis
 
Cuda
CudaCuda
GPU Programming with CUDA
GPU Programming with CUDAGPU Programming with CUDA
GPU Programming with CUDA
Filipo Mór
 
Protecting Real-Time GPU Kernels in Integrated CPU-GPU SoC Platforms
Protecting Real-Time GPU Kernels in Integrated CPU-GPU SoC PlatformsProtecting Real-Time GPU Kernels in Integrated CPU-GPU SoC Platforms
Protecting Real-Time GPU Kernels in Integrated CPU-GPU SoC Platforms
Heechul Yun
 
Designing High Performance Computing Architectures for Reliable Space Applica...
Designing High Performance Computing Architectures for Reliable Space Applica...Designing High Performance Computing Architectures for Reliable Space Applica...
Designing High Performance Computing Architectures for Reliable Space Applica...
Fisnik Kraja
 
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Kohei KaiGai
 
CUDA
CUDACUDA
Preliminary xsx die_fact_finding
Preliminary xsx die_fact_findingPreliminary xsx die_fact_finding
Preliminary xsx die_fact_finding
mistercteam
 
Available HPC resources at CSUC
Available HPC resources at CSUCAvailable HPC resources at CSUC
20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStrom20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStrom
Kohei KaiGai
 
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Tokyo Institute of Technology
 
Google warehouse scale computer
Google warehouse scale computerGoogle warehouse scale computer
Google warehouse scale computer
Tejhaskar Ashok Kumar
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
Kohei KaiGai
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place
Kohei KaiGai
 
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraFlow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
Ryousei Takano
 

What's hot (18)

Resource Management with Systemd and cgroups
Resource Management with Systemd and cgroupsResource Management with Systemd and cgroups
Resource Management with Systemd and cgroups
 
GPU Performance Prediction Using High-level Application Models
GPU Performance Prediction Using High-level Application ModelsGPU Performance Prediction Using High-level Application Models
GPU Performance Prediction Using High-level Application Models
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
Gpu perf-presentation
Gpu perf-presentationGpu perf-presentation
Gpu perf-presentation
 
Cuda
CudaCuda
Cuda
 
GPU Programming with CUDA
GPU Programming with CUDAGPU Programming with CUDA
GPU Programming with CUDA
 
Protecting Real-Time GPU Kernels in Integrated CPU-GPU SoC Platforms
Protecting Real-Time GPU Kernels in Integrated CPU-GPU SoC PlatformsProtecting Real-Time GPU Kernels in Integrated CPU-GPU SoC Platforms
Protecting Real-Time GPU Kernels in Integrated CPU-GPU SoC Platforms
 
Designing High Performance Computing Architectures for Reliable Space Applica...
Designing High Performance Computing Architectures for Reliable Space Applica...Designing High Performance Computing Architectures for Reliable Space Applica...
Designing High Performance Computing Architectures for Reliable Space Applica...
 
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
 
CUDA
CUDACUDA
CUDA
 
Preliminary xsx die_fact_finding
Preliminary xsx die_fact_findingPreliminary xsx die_fact_finding
Preliminary xsx die_fact_finding
 
Available HPC resources at CSUC
Available HPC resources at CSUCAvailable HPC resources at CSUC
Available HPC resources at CSUC
 
20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStrom20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStrom
 
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
 
Google warehouse scale computer
Google warehouse scale computerGoogle warehouse scale computer
Google warehouse scale computer
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place
 
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraFlow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
 

Similar to Gpu Cuda

Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
Arka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
Arka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
Arka Ghosh
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
bakers84
 
Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作
鈵斯 倪
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
Chakkrit (Kla) Tantithamthavorn
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
Kohei KaiGai
 
Introduction to Accelerators
Introduction to AcceleratorsIntroduction to Accelerators
Introduction to Accelerators
Dilum Bandara
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUs
fcassier
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptx
ssuser413a98
 
Exploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC CloudExploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC Cloud
Ryousei Takano
 
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATORQGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
NVIDIA Japan
 
Computing using GPUs
Computing using GPUsComputing using GPUs
Computing using GPUs
Shree Kumar
 
Monte Carlo G P U Jan2010
Monte  Carlo  G P U  Jan2010Monte  Carlo  G P U  Jan2010
Monte Carlo G P U Jan2010
John Holden
 
Kindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 KievKindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 Kiev
Volodymyr Saviak
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
Kohei KaiGai
 
gpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsngpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsn
ARUNACHALAM468781
 
Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08
Angela Mendoza M.
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
PeterAndreasEntschev
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
Dhan V Sagar
 

Similar to Gpu Cuda (20)

Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
Introduction to Accelerators
Introduction to AcceleratorsIntroduction to Accelerators
Introduction to Accelerators
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUs
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptx
 
Exploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC CloudExploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC Cloud
 
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATORQGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
 
Computing using GPUs
Computing using GPUsComputing using GPUs
Computing using GPUs
 
Monte Carlo G P U Jan2010
Monte  Carlo  G P U  Jan2010Monte  Carlo  G P U  Jan2010
Monte Carlo G P U Jan2010
 
Kindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 KievKindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 Kiev
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
gpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsngpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsn
 
Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
 

More from melbournepatterns

An Introduction to
An Introduction to An Introduction to
An Introduction to
melbournepatterns
 
State Pattern from GoF
State Pattern from GoFState Pattern from GoF
State Pattern from GoF
melbournepatterns
 
Iterator Pattern
Iterator PatternIterator Pattern
Iterator Pattern
melbournepatterns
 
Iterator
IteratorIterator
Concurrency Patterns
Concurrency PatternsConcurrency Patterns
Concurrency Patterns
melbournepatterns
 
Continuous Integration, Fast Builds and Flot
Continuous Integration, Fast Builds and FlotContinuous Integration, Fast Builds and Flot
Continuous Integration, Fast Builds and Flot
melbournepatterns
 
Command Pattern
Command PatternCommand Pattern
Command Pattern
melbournepatterns
 
Code Contracts API In .Net
Code Contracts API In .NetCode Contracts API In .Net
Code Contracts API In .Net
melbournepatterns
 
LINQ/PLINQ
LINQ/PLINQLINQ/PLINQ
LINQ/PLINQ
melbournepatterns
 
Facade Pattern
Facade PatternFacade Pattern
Facade Pattern
melbournepatterns
 
Phani Kumar - Decorator Pattern
Phani Kumar - Decorator PatternPhani Kumar - Decorator Pattern
Phani Kumar - Decorator Pattern
melbournepatterns
 
Composite Pattern
Composite PatternComposite Pattern
Composite Pattern
melbournepatterns
 
Adapter Design Pattern
Adapter Design PatternAdapter Design Pattern
Adapter Design Pattern
melbournepatterns
 
Prototype Design Pattern
Prototype Design PatternPrototype Design Pattern
Prototype Design Pattern
melbournepatterns
 
Factory Method Design Pattern
Factory Method Design PatternFactory Method Design Pattern
Factory Method Design Pattern
melbournepatterns
 
Abstract Factory Design Pattern
Abstract Factory Design PatternAbstract Factory Design Pattern
Abstract Factory Design Pattern
melbournepatterns
 
A Little Lisp
A Little LispA Little Lisp
A Little Lisp
melbournepatterns
 
State Pattern in Flex
State Pattern in FlexState Pattern in Flex
State Pattern in Flex
melbournepatterns
 
Active Object
Active ObjectActive Object
Active Object
melbournepatterns
 
Extract Composite Talk Andy
Extract Composite Talk AndyExtract Composite Talk Andy
Extract Composite Talk Andy
melbournepatterns
 

More from melbournepatterns (20)

An Introduction to
An Introduction to An Introduction to
An Introduction to
 
State Pattern from GoF
State Pattern from GoFState Pattern from GoF
State Pattern from GoF
 
Iterator Pattern
Iterator PatternIterator Pattern
Iterator Pattern
 
Iterator
IteratorIterator
Iterator
 
Concurrency Patterns
Concurrency PatternsConcurrency Patterns
Concurrency Patterns
 
Continuous Integration, Fast Builds and Flot
Continuous Integration, Fast Builds and FlotContinuous Integration, Fast Builds and Flot
Continuous Integration, Fast Builds and Flot
 
Command Pattern
Command PatternCommand Pattern
Command Pattern
 
Code Contracts API In .Net
Code Contracts API In .NetCode Contracts API In .Net
Code Contracts API In .Net
 
LINQ/PLINQ
LINQ/PLINQLINQ/PLINQ
LINQ/PLINQ
 
Facade Pattern
Facade PatternFacade Pattern
Facade Pattern
 
Phani Kumar - Decorator Pattern
Phani Kumar - Decorator PatternPhani Kumar - Decorator Pattern
Phani Kumar - Decorator Pattern
 
Composite Pattern
Composite PatternComposite Pattern
Composite Pattern
 
Adapter Design Pattern
Adapter Design PatternAdapter Design Pattern
Adapter Design Pattern
 
Prototype Design Pattern
Prototype Design PatternPrototype Design Pattern
Prototype Design Pattern
 
Factory Method Design Pattern
Factory Method Design PatternFactory Method Design Pattern
Factory Method Design Pattern
 
Abstract Factory Design Pattern
Abstract Factory Design PatternAbstract Factory Design Pattern
Abstract Factory Design Pattern
 
A Little Lisp
A Little LispA Little Lisp
A Little Lisp
 
State Pattern in Flex
State Pattern in FlexState Pattern in Flex
State Pattern in Flex
 
Active Object
Active ObjectActive Object
Active Object
 
Extract Composite Talk Andy
Extract Composite Talk AndyExtract Composite Talk Andy
Extract Composite Talk Andy
 

Recently uploaded

Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...
Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...
Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...
Eticketing.co
 
Sportr pitch deck for our saas based platform
Sportr pitch deck for our saas based platformSportr pitch deck for our saas based platform
Sportr pitch deck for our saas based platform
NathanielMDuncan
 
Euro 2024 Key Tactics and Strategies of the Netherlands.docx
Euro 2024 Key Tactics and Strategies of the Netherlands.docxEuro 2024 Key Tactics and Strategies of the Netherlands.docx
Euro 2024 Key Tactics and Strategies of the Netherlands.docx
Eticketing.co
 
Project 9 - Strategy (24/25 Season)
Project 9 -  Strategy (24/25 Season)Project 9 -  Strategy (24/25 Season)
Project 9 - Strategy (24/25 Season)
PeterMcCormack22
 
This is an exciting platform game called Geometry Dash
This is an exciting platform game called Geometry DashThis is an exciting platform game called Geometry Dash
This is an exciting platform game called Geometry Dash
Prénom Nom de famille
 
Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...
Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...
Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...
Eticketing.co
 
Euro 2024 Predictions - Group Stage Outcomes
Euro 2024 Predictions - Group Stage OutcomesEuro 2024 Predictions - Group Stage Outcomes
Euro 2024 Predictions - Group Stage Outcomes
Select Distinct Limited
 
❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143
❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143
❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143
dpbossdpboss69
 
Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...
Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...
Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...
Eticketing.co
 
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdfJORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
Arturo Pacheco Alvarez
 
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
asabad1
 
Turkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A Preview
Turkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A PreviewTurkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A Preview
Turkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A Preview
Eticketing.co
 
442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING
442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING
442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING
Proximus
 
Italy FIFA World Cup Italy's Ambition for FIFA 2026.docx
Italy FIFA World Cup Italy's Ambition for FIFA 2026.docxItaly FIFA World Cup Italy's Ambition for FIFA 2026.docx
Italy FIFA World Cup Italy's Ambition for FIFA 2026.docx
FIFA World Cup 2026 Tickets
 
Kylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docx
Kylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docxKylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docx
Kylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docx
Euro Cup 2024 Tickets
 
Turkey vs Georgia Prospects and Challenges in Euro Cup Germany.docx
Turkey vs Georgia Prospects and Challenges in Euro Cup Germany.docxTurkey vs Georgia Prospects and Challenges in Euro Cup Germany.docx
Turkey vs Georgia Prospects and Challenges in Euro Cup Germany.docx
Eticketing.co
 
Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...
Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...
Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...
Eticketing.co
 
一比一原版(Curtin毕业证)科廷大学毕业证如何办理
一比一原版(Curtin毕业证)科廷大学毕业证如何办理一比一原版(Curtin毕业证)科廷大学毕业证如何办理
一比一原版(Curtin毕业证)科廷大学毕业证如何办理
apobqx
 
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
8z10jo1w
 
Indian Premier League (IPL) ---2024.pptx
Indian Premier League (IPL) ---2024.pptxIndian Premier League (IPL) ---2024.pptx
Indian Premier League (IPL) ---2024.pptx
rathinikunj60
 

Recently uploaded (20)

Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...
Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...
Olympic 2024 Esha Singh's Journey to the Olympic a Young Shooter's Rise, Chal...
 
Sportr pitch deck for our saas based platform
Sportr pitch deck for our saas based platformSportr pitch deck for our saas based platform
Sportr pitch deck for our saas based platform
 
Euro 2024 Key Tactics and Strategies of the Netherlands.docx
Euro 2024 Key Tactics and Strategies of the Netherlands.docxEuro 2024 Key Tactics and Strategies of the Netherlands.docx
Euro 2024 Key Tactics and Strategies of the Netherlands.docx
 
Project 9 - Strategy (24/25 Season)
Project 9 -  Strategy (24/25 Season)Project 9 -  Strategy (24/25 Season)
Project 9 - Strategy (24/25 Season)
 
This is an exciting platform game called Geometry Dash
This is an exciting platform game called Geometry DashThis is an exciting platform game called Geometry Dash
This is an exciting platform game called Geometry Dash
 
Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...
Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...
Belgium vs Romania A Comprehensive Preview of Euro 2024 Campaigns, Key Player...
 
Euro 2024 Predictions - Group Stage Outcomes
Euro 2024 Predictions - Group Stage OutcomesEuro 2024 Predictions - Group Stage Outcomes
Euro 2024 Predictions - Group Stage Outcomes
 
❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143
❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143
❽❽❻❼❼❻❻❸❾❻ Matka BOSS Result | Satta Matka Tips | Kalyan Matka 143
 
Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...
Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...
Georgia vs Portugal Historic Euro Cup 2024 Journey, Key Players, and Betting ...
 
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdfJORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
 
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
 
Turkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A Preview
Turkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A PreviewTurkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A Preview
Turkey vs Georgia Tickets: Turkey's Redemption Quest in Euro 2024, A Preview
 
442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING
442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING
442 Diamond Formation Ebook pdf ASC ACADEMY SOCCER COACHING
 
Italy FIFA World Cup Italy's Ambition for FIFA 2026.docx
Italy FIFA World Cup Italy's Ambition for FIFA 2026.docxItaly FIFA World Cup Italy's Ambition for FIFA 2026.docx
Italy FIFA World Cup Italy's Ambition for FIFA 2026.docx
 
Kylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docx
Kylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docxKylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docx
Kylian Mbappe Misses Euro 2024 Training Due to Sickness Bug.docx
 
Turkey vs Georgia Prospects and Challenges in Euro Cup Germany.docx
Turkey vs Georgia Prospects and Challenges in Euro Cup Germany.docxTurkey vs Georgia Prospects and Challenges in Euro Cup Germany.docx
Turkey vs Georgia Prospects and Challenges in Euro Cup Germany.docx
 
Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...
Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...
Serbia vs England Tickets: Serbia Gears Up for a Historic Campaign at UEFA Eu...
 
一比一原版(Curtin毕业证)科廷大学毕业证如何办理
一比一原版(Curtin毕业证)科廷大学毕业证如何办理一比一原版(Curtin毕业证)科廷大学毕业证如何办理
一比一原版(Curtin毕业证)科廷大学毕业证如何办理
 
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
 
Indian Premier League (IPL) ---2024.pptx
Indian Premier League (IPL) ---2024.pptxIndian Premier League (IPL) ---2024.pptx
Indian Premier League (IPL) ---2024.pptx
 

Gpu Cuda

  • 2. CPU Architecture -  good for serial programs -  do many different things well -  many transistors for purposes other than ALUs (eg. flow control and caching) -  memory access is slow (1GB/s) -  switching threads is slow Image from Alex Moore, “Introduction to Programming in CUDA”, http://astro.pas.rochester.edu/~aquillen/gpuworkshop.html
  • 3. GPU Architecture -  many processors perform Cache Control ALUs similar operations on a large data set in parallel (single- instruction multiple-data parallelism) -  recent GPUs have around 30 multiprocessors, each containing 8 stream processors -  GPUs devote most (80%) of their transistors to ALUs -  fast memory (80GB/s)
  • 4. Thread Hierarchy -  a block of threads runs on a single stream processor -  a grid of blocks makes up the entire set -  each thread in a block can access the same shared memory -  many more threads than processors Image from Johan Seland, “CUDA Programming”, http://heim.ifi.uio.no/~knutm/geilo2008/seland.pdf
  • 5. Memory Hierarchy Image from Johan Seland, “CUDA Programming”, http://heim.ifi.uio.no/~knutm/geilo2008/seland.pdf
  • 6. CUDA -  a set of C extensions for running programs on a GPU -  Windows, Linux, Mac…. Nvidia Cards only -  single instruction, multiple data -  gives you direct access to the memory architecture -  don’t need to know graphics APIs -  built-in libraries: cudaMalloc, cudaFree, cudaMemcpy, cublas, cufft
  • 7. Elementwise Matrix Addition CPU Program void add_matrix(float* a, float* b, float* c, int N) { int index; for ( int i = 0; i < N; ++i ) { for ( int j = 0; j < N; ++j ) { index = i + j*N; c[index] = a[index] + b[index]; } } } int main() { add_matrix( a, b, c, N ); }
  • 8. Elemtwise Matix Addition CUDA program __global__ add_matrix(float* a, float* b, float* c, int N ) { int i = blockIdx.x * blockDim.x + threadIdx.x; int j = blockIdx.y * blockDim.y + threadIdx.y; int index = i + j*N; if ( i < N && j < N ) c[index] = a[index] + b[index]; } int main() { dim3 dimBlock( blocksize, blocksize ); dim3 dimGrid( N/dimBlock.x, N/dimBlock.y ); add_matrix<<<dimGrid, dimBlock>>>( a, b, c, N ); }
  • 9. Results (for tasks relevant to the MWA Real Time System) Imaging Gridding * UnpeelTileResponse PeelTileResponse Module ReRotateVisibilities MeasureTileResponse MeasureIonosphericOffset RotateAndAccumulateVisibilities 0 10 20 30 40 50 60 70 Speedup Image from Kevin Dale, “A Graphics Hardware-Accelerated Real-Time Processing Pipeline for Radio Astronomy”, Presented at AstroGPU, Nov 2007.