WEEKLY REPORT
Thur., Nov 14, 2013
Pin Yi Tsai
OUTLINE
• Current Work
• Compute Integral Image – computeByRow
  ◦ Using shared memory
  ◦ Using register
  ◦ Result
• CUDA Memory Architecture
USING SHARED MEMORY
• Scope: block
• Shared memory: stores the values of the previous line
• Computing by row for img[*][y] and img[*][y+1]:
  ◦ Time t: calculate img[*][y] + shared memory[*]
  ◦ Then store the result back to shared memory[*]
  ◦ Time t+1: calculate img[*][y+1] + shared memory[*]
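
The steps above could be sketched roughly as follows. This is only an illustration, not the code behind the slides: the kernel name accumulateLinesShared, the host helper runAccumulateShared, the row-major layout img[y * width + x], and the block size of 128 are all assumptions.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: one thread per position x; lines y = 0, 1, ... are
// processed in order. Shared memory keeps the result of the previous line.
__global__ void accumulateLinesShared(float* img, int width, int height)
{
    extern __shared__ float prev[];            // one slot per thread in the block
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    if (x >= width) return;

    prev[threadIdx.x] = 0.0f;                  // nothing accumulated before line 0
    for (int y = 0; y < height; ++y) {
        // Time t: current line + values kept from the previous line
        float v = img[y * width + x] + prev[threadIdx.x];
        img[y * width + x] = v;
        // Store the result back for use at time t+1
        prev[threadIdx.x] = v;
    }
    // Each thread only touches its own slot, so no __syncthreads() is needed here.
}

// Hypothetical host helper (error checking omitted for brevity).
std::vector<float> runAccumulateShared(const std::vector<float>& host,
                                       int width, int height)
{
    float* dev = nullptr;
    size_t bytes = host.size() * sizeof(float);
    cudaMalloc(&dev, bytes);
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 128;                                   // assumed block size
    int blocks = (width + threads - 1) / threads;
    accumulateLinesShared<<<blocks, threads, threads * sizeof(float)>>>(dev, width, height);

    std::vector<float> out(host.size());
    cudaMemcpy(out.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return out;
}
```

For a 3x2 image filled with ones, the second line of the output would hold twos, since each position adds the value stored from the line before it.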
USING REGISTER
• Scope: thread
• One line per thread
  ◦ Why not one pixel per thread? That would require the use of __syncthreads()
• Using a register: stores the value of the previous pixel
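
A minimal sketch of the one-line-per-thread idea, under assumptions: the names prefixSumPerLine and runPrefixPerLine are hypothetical, the image is assumed to be stored row-major, and the running sum is a local variable that the compiler would normally keep in a register.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread walks one whole line and keeps the
// running sum (the previous pixel's result) in a local variable.
__global__ void prefixSumPerLine(float* img, int width, int height)
{
    int y = blockIdx.x * blockDim.x + threadIdx.x;   // one line per thread
    if (y >= height) return;

    float running = 0.0f;                            // thread-private, normally a register
    float* line = img + static_cast<size_t>(y) * width;
    for (int x = 0; x < width; ++x) {
        running += line[x];                          // previous result + current pixel
        line[x] = running;
    }
    // No __syncthreads(): a thread never reads another thread's pixels.
}

// Hypothetical host helper (error checking omitted for brevity).
std::vector<float> runPrefixPerLine(const std::vector<float>& host,
                                    int width, int height)
{
    float* dev = nullptr;
    size_t bytes = host.size() * sizeof(float);
    cudaMalloc(&dev, bytes);
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 128;                               // assumed block size
    int blocks = (height + threads - 1) / threads;
    prefixSumPerLine<<<blocks, threads>>>(dev, width, height);

    std::vector<float> out(host.size());
    cudaMemcpy(out.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return out;
}
```

With one pixel per thread, each pixel would have to wait for its neighbour's result, which is where block-level synchronization would come in; giving a whole line to a single thread sidesteps that at the cost of less parallelism per line.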
RESULT
• 16x16
• Serial version: 0.006336 ms
• Parallel version: 5.88559e-39 ms
======== Profiling result:
Time(%)      Time  Calls       Avg       Min       Max  Name
  55.69   18.91us      1   18.91us   18.91us   18.91us  computeByRow(float*, int, int)
  25.84    8.78us      1    8.78us    8.78us    8.78us  computeByColumn(float*, int, int)
  12.91    4.38us      2    2.19us    2.18us    2.21us  [CUDA memcpy DtoH]
   5.56    1.89us      2     944ns     928ns     960ns  [CUDA memcpy HtoD]
RESULT (CONT.)
• 640x480
• Serial version: 5.1607 ms
• Parallel version: 4.40496 ms
======== Profiling result:
Time(%)      Time  Calls       Avg       Min       Max  Name
  66.37    2.19ms      1    2.19ms    2.19ms    2.19ms  computeByRow(float*, int, int)
  12.75  419.74us      2  209.87us  209.28us  210.46us  [CUDA memcpy HtoD]
  11.74  386.43us      2  193.22us  191.04us  195.39us  [CUDA memcpy DtoH]
   9.15  301.24us      1  301.24us  301.24us  301.24us  computeByColumn(float*, int, int)
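
As a rough reading of the 640x480 numbers (an illustrative calculation from the figures listed here, not a result stated on the slides), the speedup and the share of profiled time spent in memory copies work out to:

```latex
\[
\text{speedup} = \frac{T_{\text{serial}}}{T_{\text{parallel}}}
               = \frac{5.1607\ \text{ms}}{4.40496\ \text{ms}} \approx 1.17
\]
\[
\text{memcpy share} = 12.75\% + 11.74\% = 24.49\%
\]
```

So roughly a quarter of the profiled time goes to host/device transfers, and computeByRow alone accounts for about two thirds.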
CUDA MEMORY ARCHITECTURE
The End
