USING PARALLEL PROGRAMMING TO IMPROVE PERFORMANCE OF IMAGE PROCESSING
Chan Le – KAIST ’13
INTRODUCTION   Me:     Chan Le – 3rd year undergraduate student     Double major in Computer Science & Management Scien...
MOTIVATION   Biomedical researches work with images       Really big images           Take long time to process     Ra...
RELATED WORKS: USING PDE IN NOISE-REDUCTION                                               IN                              ...
RELATED WORKS: ANISOTROPIC DIFFUSION   Paper: Scale-space and edge detection using    anisotropic diffusion (Pietro Peron...
RELATED WORKS
NVIDIA CUDA   Serial vs Parallel program      Thread: unit of processing      In the past: CPU has only 1 core -> 1 thr...
MY IMPLEMENTATION   Implement Anisotropic    Diffusion on CUDA    platform 1 thread handle 1 pixel Dividing the image t...
SOME RESULT – SMALL
SOME RESULT – SMALL
SOME RESULT – MEDIUM
SOME RESULT – MEDIUM
BENCHMARK   100 times iteration
CONCLUSION   The result of this project could be use to help    improving quality of images before using. Utilizing GPU ...

  1. USING PARALLEL PROGRAMMING TO IMPROVE PERFORMANCE OF IMAGE PROCESSING
     Chan Le – KAIST ’13
  2. INTRODUCTION
     • Me:
        ◦ Chan Le – 3rd-year undergraduate student
        ◦ Double major in Computer Science & Management Science
        ◦ Vietnamese, KAIST ’13
     • Professor:
        ◦ Won-Ki Jeong
        ◦ GPU-accelerated large-scale biomedical image processing
     • Project:
        ◦ Apply parallel programming to improve the performance of image processing
  3. MOTIVATION
     • Biomedical researchers work with images
        ◦ Really big images that take a long time to process
     • Raw images are hard to analyze and use for research
        ◦ Sometimes really noisy
        ◦ Need preprocessing before use
     • Image preprocessing with serial algorithms is slow
     • Parallel computing is now developing rapidly
        ◦ Thanks to the popularity of multi-core CPUs and GPUs
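  4. RELATED WORKS: USING PDE IN NOISE-REDUCTION
     • Diagram: pixel $I_t$ at $(x,y)$ with its four neighbors $I_N$ at $(x,y+1)$, $I_S$ at $(x,y-1)$, $I_W$ at $(x-1,y)$ and $I_E$ at $(x+1,y)$; each difference is neighbor minus center, e.g. $\Delta_W = I_W - I_t$
     • Heat equation
        ◦ At every pixel $(x,y)$ of the image at time $t$: $I_{t+1} = I_t + \Delta I$, where $\Delta I = (\Delta_W + \Delta_N + \Delta_E + \Delta_S) / 4$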
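The update rule on slide 4 can be sketched serially in plain Python. This is only an illustration of the formula, not the author's code: the function name `heat_step`, the list-of-rows image layout, and the choice to repeat edge pixels at the border are all assumptions.

```python
def heat_step(image):
    """One explicit update: I_{t+1} = I_t + (dW + dN + dE + dS) / 4."""
    height, width = len(image), len(image[0])

    def pixel(y, x):
        # Assumption: coordinates outside the image are clamped to the
        # nearest edge pixel (the slides do not say how borders are handled).
        y = min(max(y, 0), height - 1)
        x = min(max(x, 0), width - 1)
        return image[y][x]

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            center = image[y][x]
            d_w = pixel(y, x - 1) - center  # neighbor minus center
            d_e = pixel(y, x + 1) - center
            d_n = pixel(y + 1, x) - center  # the slide places N at (x, y+1)
            d_s = pixel(y - 1, x) - center
            row.append(center + (d_w + d_n + d_e + d_s) / 4)
        result.append(row)
    return result


# A single bright "noise" pixel is spread evenly over its four neighbors.
noisy = [[0.0, 0.0, 0.0],
         [0.0, 4.0, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = heat_step(noisy)
print(smoothed[1][1])  # 0.0
print(smoothed[0][1])  # 1.0
```

Each output pixel depends only on the previous image, so every pixel can be computed independently of the others.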
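  5. RELATED WORKS: ANISOTROPIC DIFFUSION
     • Paper: "Scale-space and edge detection using anisotropic diffusion" (Pietro Perona & Jitendra Malik, 1990)
     • Basic idea: add a coefficient $c$ to each of $\Delta_W$, $\Delta_N$, $\Delta_S$, $\Delta_E$
        ◦ $\Delta I = (c_W \Delta_W + c_N \Delta_N + c_E \Delta_E + c_S \Delta_S) / 4$
     • How to calculate each $c$? (the slide's two formulas for $c$ are not preserved in this transcript)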
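The transcript does not preserve the slide's formulas for the coefficients. As a hedged sketch, the two edge-stopping functions proposed in the cited Perona and Malik paper are $c = e^{-(|\Delta|/K)^2}$ and $c = 1/(1 + (|\Delta|/K)^2)$; whether the slide used exactly these is an assumption. The names `g_exponential`, `g_rational` and `anisotropic_step`, the parameter `k`, and the clamped border are illustrative choices as well.

```python
import math


def g_exponential(delta, k):
    # c = exp(-(|delta| / K)^2): close to 1 for small differences, near 0 across edges.
    return math.exp(-((delta / k) ** 2))


def g_rational(delta, k):
    # c = 1 / (1 + (|delta| / K)^2): the second function from the paper.
    return 1.0 / (1.0 + (delta / k) ** 2)


def anisotropic_step(image, k, conductance=g_exponential):
    """One update: I_{t+1} = I_t + (cW*dW + cN*dN + cE*dE + cS*dS) / 4."""
    height, width = len(image), len(image[0])

    def pixel(y, x):
        # Assumption: clamp at the border, so the missing neighbor equals the center.
        y = min(max(y, 0), height - 1)
        x = min(max(x, 0), width - 1)
        return image[y][x]

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            center = image[y][x]
            total = 0.0
            for dy, dx in ((0, -1), (1, 0), (0, 1), (-1, 0)):  # W, N, E, S
                delta = pixel(y + dy, x + dx) - center
                total += conductance(delta, k) * delta
            row.append(center + total / 4)
        result.append(row)
    return result


edge = [[0.0, 0.0, 10.0, 10.0]]           # a sharp step between columns 1 and 2
kept = anisotropic_step(edge, k=1.0)      # small K: the step is treated as an edge
blurred = anisotropic_step(edge, k=1e9)   # huge K: behaves like the plain heat equation
print(kept[0][1], blurred[0][1])
```

With a small `k` the coefficient across the step is almost zero, so the edge survives; with a very large `k` every coefficient approaches 1 and the update reduces to the heat-equation average.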
  6. RELATED WORKS
  7. NVIDIA CUDA
     • Serial vs. parallel programs
        ◦ Thread: the unit of processing
        ◦ In the past: a CPU had only 1 core -> 1 thread at a time
        ◦ Nowadays: multiple cores -> multiple threads at a time
     • CUDA™ is a parallel computing platform and programming model invented by NVIDIA. http://www.nvidia.com/object/cuda_home_new.html
     • How could it help?
        ◦ CPU: 1-6 cores
        ◦ GPU: hundreds of cores
        ◦ -> can improve performance by a factor of 10 to 100, depending on the algorithm
  8. MY IMPLEMENTATION
     • Implement Anisotropic Diffusion on the CUDA platform
     • 1 thread handles 1 pixel
     • Divide the image into multiple sub-regions and process them in parallel to exploit multiple cores
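The "1 thread handles 1 pixel" mapping can be illustrated by emulating a CUDA-style 2D launch serially in Python. The slides do not show the actual kernel or how sub-regions are sized, so the 16x16 block size and the name `launch_grid` are assumptions; the index arithmetic mirrors the usual CUDA pattern `blockIdx * blockDim + threadIdx`.

```python
def launch_grid(width, height, block=(16, 16)):
    """Serially emulate a 2D grid of thread blocks; return the grid size and
    the pixel coordinates each emulated thread would process."""
    block_w, block_h = block
    grid_w = (width + block_w - 1) // block_w   # ceiling division: enough
    grid_h = (height + block_h - 1) // block_h  # blocks to cover the image
    visited = []
    for block_y in range(grid_h):
        for block_x in range(grid_w):
            for thread_y in range(block_h):
                for thread_x in range(block_w):
                    # Same arithmetic as blockIdx * blockDim + threadIdx.
                    x = block_x * block_w + thread_x
                    y = block_y * block_h + thread_y
                    if x < width and y < height:  # threads past the border do nothing
                        visited.append((x, y))
    return (grid_w, grid_h), visited


grid, visited = launch_grid(width=40, height=20)
print(grid)          # (3, 2)
print(len(visited))  # 800
```

Every pixel of the 40x20 image is visited exactly once, and each block of threads corresponds to one sub-region of the image. On a GPU the four nested loops disappear: all of those iterations run as concurrent threads.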
  9. SOME RESULT – SMALL
  10. SOME RESULT – SMALL
  11. SOME RESULT – MEDIUM
  12. SOME RESULT – MEDIUM
  13. BENCHMARK
     • 100 iterations
  14. CONCLUSION
     • The result of this project could be used to help improve the quality of images before use.
     • Utilizing GPU computing power could improve the performance of your program by 100-200 times
     • Partial differential equations are good choices when designing parallel algorithms
     • However, the performance is limited by the GPU's memory size
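The memory limit in the last bullet can be made concrete with a rough estimate. Everything in this sketch is a hypothetical assumption rather than a figure from the slides: 4 bytes per pixel (one 32-bit float), two full-size buffers (input and output), a 2 GiB device, and the image dimensions. It also ignores the overlapping border pixels that neighboring pieces would need to share.

```python
import math


def pieces_needed(width, height, gpu_bytes, bytes_per_pixel=4, buffers=2):
    """Estimate the bytes an image needs on the GPU and how many equal
    pieces it must be split into so that one piece fits at a time."""
    needed = width * height * bytes_per_pixel * buffers
    return needed, max(1, math.ceil(needed / gpu_bytes))


GIB = 1024 ** 3
needed, pieces = pieces_needed(32768, 32768, gpu_bytes=2 * GIB)
print(needed / GIB)  # 8.0
print(pieces)        # 4
```

Under these assumptions the image has to be processed in several pieces, with each piece transferred between host and device memory separately.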
