2. OUTLINE
• Current Work
• Compute Integral Image – computeByRow
Using shared memory
Using register
Result
• CUDA Memory Architecture
3. USING SHARED MEMORY
• Scope: block
• Shared memory: stores the values of the previous line
• Computing by row for img[*][y] and img[*][y+1]:
• Time t: compute img[*][y] + shared_memory[*]
• Then store the result back into shared_memory[*]
• Time t+1: compute img[*][y+1] + shared_memory[*]
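The shared-memory scheme above can be sketched as a kernel in which each thread owns one column and the block's shared array carries the running sum of the previous row into the next iteration. This is a hypothetical illustration, not the original code: the kernel name, launch shape, and row-major indexing are assumptions.

```cuda
// Sketch of the shared-memory variant: one thread per column,
// prevRow[] holds the accumulated value of the row above (time t),
// which is read and then overwritten for time t+1.
__global__ void computeByColumnShared(float *img, int width, int height)
{
    extern __shared__ float prevRow[];   // one slot per column in the block
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    bool active = (x < width);

    if (active) prevRow[threadIdx.x] = 0.0f;   // nothing above row 0
    for (int y = 0; y < height; ++y) {
        if (active) {
            // time t: img[*][y] + shared_memory[*]
            float sum = img[y * width + x] + prevRow[threadIdx.x];
            img[y * width + x] = sum;
            prevRow[threadIdx.x] = sum;        // store back for time t+1
        }
        __syncthreads();   // keep the whole block on the same row
    }
}
```

Note the guard variable `active` instead of an early `return`: every thread in the block must reach `__syncthreads()`, so threads outside the image still participate in the barrier.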
4. USING REGISTER
• Scope: thread
• One line, one thread
Why not one pixel per thread? That would require __syncthreads();
• A register stores the value of the previous pixel
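The register scheme can be sketched as follows. The kernel name `computeByRow` matches the profiling output later in the slides, but the exact signature and launch configuration here are assumptions; the point is that the running sum lives in a per-thread register, so no `__syncthreads()` is needed.

```cuda
// Sketch of the register variant: one thread scans one full row,
// carrying the previous pixel's accumulated value in a register.
__global__ void computeByRow(float *img, int width, int height)
{
    int y = blockIdx.x * blockDim.x + threadIdx.x;
    if (y >= height) return;   // safe: no barrier inside this kernel

    float running = 0.0f;      // register: value of the previous pixel
    for (int x = 0; x < width; ++x) {
        running += img[y * width + x];
        img[y * width + x] = running;   // row-wise prefix sum in place
    }
}
```

Because each thread touches only its own row, threads never share data and the serial dependency along the row is resolved entirely in the register.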
5. RESULT
• 16×16 image
• Serial version: 0.006336 ms
• Parallel version: 5.88559e-39 ms
======== Profiling result:
Time(%)  Time      Calls  Avg       Min       Max       Name
55.69    18.91us   1      18.91us   18.91us   18.91us   computeByRow(float*, int, int)
25.84    8.78us    1      8.78us    8.78us    8.78us    computeByColumn(float*, int, int)
12.91    4.38us    2      2.19us    2.18us    2.21us    [CUDA memcpy DtoH]
5.56     1.89us    2      944ns     928ns     960ns     [CUDA memcpy HtoD]
6. RESULT (CONT.)
• 640×480 image
• Serial version: 5.1607 ms
• Parallel version: 4.40496 ms
======== Profiling result:
Time(%)  Time      Calls  Avg       Min       Max       Name
66.37    2.19ms    1      2.19ms    2.19ms    2.19ms    computeByRow(float*, int, int)
12.75    419.74us  2      209.87us  209.28us  210.46us  [CUDA memcpy HtoD]
11.74    386.43us  2      193.22us  191.04us  195.39us  [CUDA memcpy DtoH]
9.15     301.24us  1      301.24us  301.24us  301.24us  computeByColumn(float*, int, int)