WEEKLY REPORT
Thur., Nov 14, 2013
Pin Yi Tsai
OUTLINE
• Current Work
  • Compute Integral Image – computeByRow
    • Using shared memory
    • Using register
    • Result
• CUDA Memory Architecture
USING SHARED MEMORY

• Scope: block

• Shared memory: stores the values of the previous line
• Computing by row for img[*][y] and img[*][y+1]:
• Time t: calculate img[*][y] + shared memory[*]
• Then store the result back to shared memory[*]
• Time t+1: calculate img[*][y+1] + shared memory[*]
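
The time t / time t+1 steps above can be sketched as a small kernel. This is only an illustration under assumptions that are not in the slides: the names accumulateLinesShared and runAccumulateShared are hypothetical, the image is assumed to be stored row-major so that img[x][y] maps to img[y * width + x], and one thread is assigned to each index x. Error checking is left out for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: thread x walks down the lines y = 0..height-1.
// prev[] lives in shared memory (block scope) and holds the results
// of the previous line for the threads of this block.
__global__ void accumulateLinesShared(float* img, int width, int height)
{
    extern __shared__ float prev[];   // one float per thread in the block
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    if (x >= width) return;

    prev[threadIdx.x] = 0.0f;         // there is no line above line 0
    for (int y = 0; y < height; ++y) {
        // Time t: img[*][y] + shared memory[*]
        float sum = img[y * width + x] + prev[threadIdx.x];
        img[y * width + x] = sum;
        // Store the result back so line y+1 can use it at time t+1
        prev[threadIdx.x] = sum;
    }
}

// Hypothetical host helper: copy in, launch, copy out.
std::vector<float> runAccumulateShared(std::vector<float> h, int width, int height)
{
    float* d = nullptr;
    size_t bytes = h.size() * sizeof(float);
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (width + threads - 1) / threads;
    // Third launch parameter: dynamic shared memory size in bytes
    accumulateLinesShared<<<blocks, threads, threads * sizeof(float)>>>(d, width, height);

    cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;
}
```

In this sketch each thread reads and writes only its own shared-memory slot, so no __syncthreads() call is needed. For an image filled with ones, runAccumulateShared(img, width, height) leaves the value y + 1 in every element of line y.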
USING REGISTER

• Scope: thread
• One line per thread
  • Why not one pixel per thread? Because of the use of __syncthreads()

• Using register: stores the value of the previous pixel
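
A minimal sketch of the one-line-per-thread idea follows. The names prefixSumLinesRegister and runPrefixSumRegister are hypothetical, and the row-major layout img[y * width + x] is an assumption. The running value is a local scalar, which the compiler normally keeps in a register, although that is the compiler's decision and not guaranteed. Error checking is left out for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: thread y owns line y and walks along its pixels.
__global__ void prefixSumLinesRegister(float* img, int width, int height)
{
    int y = blockIdx.x * blockDim.x + threadIdx.x;
    if (y >= height) return;

    float prev = 0.0f;                // result of the previous pixel (thread scope)
    for (int x = 0; x < width; ++x) {
        prev += img[y * width + x];   // current pixel + previous result
        img[y * width + x] = prev;
    }
}

// Hypothetical host helper: copy in, launch, copy out.
std::vector<float> runPrefixSumRegister(std::vector<float> h, int width, int height)
{
    float* d = nullptr;
    size_t bytes = h.size() * sizeof(float);
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (height + threads - 1) / threads;
    prefixSumLinesRegister<<<blocks, threads>>>(d, width, height);

    cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;
}
```

Because every line is handled by exactly one thread, no thread depends on another thread's result and no barrier is required. A one-pixel-per-thread layout would need threads to wait for their neighbours, and __syncthreads() only synchronizes threads within a single block.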
RESULT
• 16x16
• Serial version: 0.006336 ms
• Parallel version: 5.88559e-39 ms
======== Profiling result:
Time(%)      Time  Calls       Avg       Min       Max  Name
  55.69   18.91us      1   18.91us   18.91us   18.91us  computeByRow(float*, int, int)
  25.84    8.78us      1    8.78us    8.78us    8.78us  computeByColumn(float*, int, int)
  12.91    4.38us      2    2.19us    2.18us    2.21us  [CUDA memcpy DtoH]
   5.56    1.89us      2     944ns     928ns     960ns  [CUDA memcpy HtoD]
RESULT (CONT.)
• 640x480
• Serial version: 5.1607 ms
• Parallel version: 4.40496 ms
======== Profiling result:
Time(%)      Time  Calls       Avg       Min       Max  Name
  66.37    2.19ms      1    2.19ms    2.19ms    2.19ms  computeByRow(float*, int, int)
  12.75  419.74us      2  209.87us  209.28us  210.46us  [CUDA memcpy HtoD]
  11.74  386.43us      2  193.22us  191.04us  195.39us  [CUDA memcpy DtoH]
   9.15  301.24us      1  301.24us  301.24us  301.24us  computeByColumn(float*, int, int)
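
The slides do not say how the serial and parallel totals were measured. One common way to time GPU work from the host is with CUDA events, sketched below. The kernel scaleKernel and the helper timeScaleKernelMs are hypothetical stand-ins, not the kernels from the report, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in kernel, used only to have something to time.
__global__ void scaleKernel(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Returns the elapsed GPU time of one kernel launch in milliseconds.
float timeScaleKernelMs(int n)
{
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);                    // enqueue start marker
    scaleKernel<<<(n + 255) / 256, 256>>>(d, n);
    cudaEventRecord(stop);                     // enqueue stop marker
    cudaEventSynchronize(stop);                // wait until the kernel has finished

    float ms = 0.0f;                           // initialized before it is read
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return ms;
}
```

Kernel launches are asynchronous, so a host timer that is stopped without first synchronizing may not cover the kernel's execution. Waiting on the stop event before reading the elapsed time avoids that.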
CUDA MEMORY ARCHITECTURE
The End
