Belief Propagation Algorithm using CUDA

  1. Disparity-Map Generation using GPUs
     Yan Xu
     Tutor: Hui Chen
     School of Information Science and Engineering
     Aug. 1, 2009
  2. Tsukuba Left Image · Tsukuba Right Image · Ground Truth
  3. Overview
     • Disparity-Map in Stereo Vision
     • Parallel Programming
     • Programming on GPUs
     • Belief Propagation
     • BP on CUDA
     • Experiment Results
     • Conclusions and Future Work
  4. Disparity-Map Generation
     Calibration → Rectification → Stereo Match → Disparity-Map
  5. Disparity-Map Generation
     • Local algorithms
     • Belief Propagation
     • Graph Cut
     • Dynamic Programming
  6. Tsukuba Left Image · Tsukuba Right Image · Ground Truth
     Disparity Image by BP (Felzenszwalb)
     Disparity Image by DP
     Disparity Image by GC (Kolmogorov)
  7. Parallel Programming
     Serial Programming vs. Parallel Programming
  8. Parallel Programming
     Traditionally, software has been written for serial computation:
     • It runs on a single computer with a single Central Processing Unit (CPU).
     • A problem is broken into a discrete series of instructions.
     • Instructions are executed one after another.
     • Only one instruction may execute at any moment in time.
  9. Parallel Programming
     In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
     • It runs on multiple CPUs.
     • A problem is broken into discrete parts that can be solved concurrently.
     • Each part is further broken down into a series of instructions.
     • Instructions from each part execute simultaneously on different CPUs.
  10. Serial vs. Parallel
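The serial-vs.-parallel contrast of the last few slides can be made concrete with a minimal sketch: a CPU loop that scales an array element by element, next to a CUDA kernel in which every element is handled by its own thread. The function names are illustrative, not from the slides.

```cuda
// Serial: one CPU thread visits every element in order.
void scale_serial(float *a, int n, float s) {
    for (int i = 0; i < n; ++i)
        a[i] *= s;
}

// Parallel: each CUDA thread computes its own global index and
// scales exactly one element; all elements are processed at once
// across the GPU's many cores.
__global__ void scale_parallel(float *a, int n, float s) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)            // guard against threads past the array end
        a[i] *= s;
}
```

The bounds check is needed because the grid is usually rounded up to a whole number of blocks, so a few threads may fall outside the array.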
  11. Programming on GPUs
      CPU (Host) vs. GPU (Device)
  12. Programming on GPUs
      CUDA thread and memory hierarchy (figure): the host launches kernels (Kernel 1, Kernel 2) on the device, each as a grid of thread blocks. Every thread has its own registers and local memory, threads in the same block share on-chip shared memory, and all threads can access global, constant, and texture memory.
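The grid/block/thread hierarchy on this slide is what lets an image kernel map one thread to one pixel. A minimal sketch, with an illustrative kernel name, of how a thread derives its pixel coordinate from its block and thread indices:

```cuda
// Each thread computes a unique (x, y) pixel coordinate from the
// 2D grid-of-blocks layout, then touches only that pixel.
__global__ void perPixelKernel(float *img, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)          // skip threads outside the image
        img[y * width + x] = 0.0f;        // e.g. clear this pixel
}

// Typical launch: 16x16 threads per block, grid rounded up to cover
// the whole image.
// dim3 block(16, 16);
// dim3 grid((width + 15) / 16, (height + 15) / 16);
// perPixelKernel<<<grid, block>>>(d_img, width, height);
```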
  13. Programming on GPUs
      int main() {
          // Allocate memory on the GPU
          float *Md;
          cudaMalloc((void **)&Md, size);
          // Copy data from CPU to GPU
          cudaMemcpy(Md, M, size, cudaMemcpyHostToDevice);
          // Launch the GPU kernel function
          kernel<<<dimGrid, dimBlock>>>(arguments);
          // Copy data from GPU back to CPU
          cudaMemcpy(M, Md, size, cudaMemcpyDeviceToHost);
          // Free device memory
          cudaFree(Md);
      }
  14. Programming on GPUs
      • CUDA (Compute Unified Device Architecture) is a computing architecture developed by NVIDIA to use the graphics processing unit as a general-purpose parallel processor.
      NVIDIA GeForce 8800
  15. Belief Propagation Algorithm
      m labels, s sites
      data costs + discontinuity costs
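In the standard formulation behind this slide (as in Felzenszwalb and Huttenlocher's efficient BP for stereo), the sites are pixels, the labels are the m candidate disparities, and the energy to minimize is the sum of the data costs and the discontinuity costs:

```latex
E(f) = \sum_{p} D_p(f_p) \;+\; \sum_{(p,q)} V(f_p, f_q)
```

where $D_p(f_p)$ is the data cost of assigning disparity $f_p$ to pixel $p$ and $V(f_p, f_q)$ is the discontinuity cost between neighboring pixels $p$ and $q$. Min-sum belief propagation minimizes $E$ approximately by iterating messages between neighbors:

```latex
m^{t}_{p \to q}(f_q) = \min_{f_p}\Big( D_p(f_p) + V(f_p, f_q)
    + \sum_{s \in N(p)\setminus\{q\}} m^{t-1}_{s \to p}(f_p) \Big)
```

After T iterations, each pixel's belief is $b_q(f_q) = D_q(f_q) + \sum_{p \in N(q)} m^{T}_{p \to q}(f_q)$, and the output disparity is the label minimizing $b_q$.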
  16. Belief Propagation Algorithm
  17. Belief Propagation on CUDA
      1. Allocate GPU global memory.
      2. Load the original left and right images into GPU global memory.
      3. (For real-world images) Pre-process the images with a Sobel or residual filter.
      4. Calculate the data cost.
      5. Build the (Gaussian) data pyramid.
      6. Pass messages over the pyramid, coarse to fine.
      7. Compute the disparity map from the messages and the data cost.
      8. Copy the disparity map back to local (host) memory.
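The eight steps above can be sketched as a host-side driver. This is a hedged outline, not the authors' implementation: the kernel names (sobelKernel, dataCostKernel, computeDisparityKernel) are hypothetical, and the pyramid/message-passing loop (steps 5-6) is left as comments because its layout depends on the chosen BP variant.

```cuda
// Hypothetical host-side pipeline following the eight steps above.
void disparityBP(const unsigned char *left, const unsigned char *right,
                 unsigned char *disparity, int w, int h, int labels) {
    unsigned char *dLeft, *dRight, *dDisp;
    float *dCost;
    size_t imgBytes = (size_t)w * h;

    // 1. Allocate GPU global memory.
    cudaMalloc(&dLeft,  imgBytes);
    cudaMalloc(&dRight, imgBytes);
    cudaMalloc(&dDisp,  imgBytes);
    cudaMalloc(&dCost,  imgBytes * labels * sizeof(float));

    // 2. Upload the stereo pair.
    cudaMemcpy(dLeft,  left,  imgBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dRight, right, imgBytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((w + 15) / 16, (h + 15) / 16);

    // 3. Optional pre-processing for real-world images.
    sobelKernel<<<grid, block>>>(dLeft,  w, h);
    sobelKernel<<<grid, block>>>(dRight, w, h);

    // 4. Per-pixel, per-disparity matching cost.
    dataCostKernel<<<grid, block>>>(dLeft, dRight, dCost, w, h, labels);

    // 5-6. Build the coarse-to-fine data pyramid and run message
    //      passing at each level, initializing each level's messages
    //      from the coarser one (omitted here).

    // 7. Pick, per pixel, the label with the smallest final belief.
    computeDisparityKernel<<<grid, block>>>(dCost, dDisp, w, h, labels);

    // 8. Download the result and release device memory.
    cudaMemcpy(disparity, dDisp, imgBytes, cudaMemcpyDeviceToHost);
    cudaFree(dLeft); cudaFree(dRight); cudaFree(dDisp); cudaFree(dCost);
}
```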
  18. Experiment Results (video)
  19. Experiment Results: Original
  20. Experiment Results: Sobel
  21. Experiment Results: Residual (video)
  22. Conclusions and Future Work
      • Improve Belief Propagation (faster and better results)
      • Implement other stereo algorithms in parallel (such as DP, GC, etc.)
      • Apply the algorithm to stereo images captured by Truck
  23. Thank you for your attention!
      Questions?
