Advanced Real-time Post-Processing using GPGPU techniques


Master thesis about GPGPU post-processing by Per Lönroth and Mattias Unger from Linköping University, Department of Science and Technology.

  1. Advanced Real-time Post-Processing using GPGPU techniques
  2. Presentation overview
      • Problem description and objectives
      • Depth of field
      • Methods
      • GPGPU programming
      • Results
      • Conclusion
      • Questions
  3. Problem description and objectives
      • Post-processing filters
          • Different depth of field algorithms
          • Visual quality
      • Implement using HLSL and CUDA
          • Performance
          • Usability
  4. Depth of field
      • Depth cue
      • Focus plane
          • Objects in an area in front of and beyond the plane stay in focus
          • Blurriness varies outside that area
  5. Depth of field
      • Thin lens camera model
          • Circle of confusion
  6. Depth of field
      • Calculate the circle of confusion (COC)
          • From the depth value and lens parameters
      • [Figure: depth map and the resulting COC map]
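The COC calculation on slide 6 can be sketched on the CPU as follows. This is a minimal Python sketch of the standard thin-lens formula, not the HLSL or CUDA code from the thesis; the names `circle_of_confusion`, `coc_map`, `focus_distance`, `focal_length` and `aperture_diameter` are assumptions made for illustration, and all lengths are assumed to share one unit.

```python
def circle_of_confusion(depth, focus_distance, focal_length, aperture_diameter):
    """Thin-lens COC diameter for a point at `depth` (same unit as the inputs)."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if focus_distance <= focal_length:
        raise ValueError("focus_distance must exceed focal_length")
    # Magnification of the focus plane onto the sensor.
    magnification = focal_length / (focus_distance - focal_length)
    # The COC grows with the relative distance from the focus plane.
    return aperture_diameter * magnification * abs(depth - focus_distance) / depth


def coc_map(depth_map, focus_distance, focal_length, aperture_diameter):
    """Apply the per-pixel formula to a depth map given as a list of rows."""
    return [
        [circle_of_confusion(d, focus_distance, focal_length, aperture_diameter)
         for d in row]
        for row in depth_map
    ]
```

A point exactly on the focus plane gets a COC of zero, and the value grows as the depth moves away from the plane in either direction, which is what the COC map visualises.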
  7. Methods
      • Poisson disc blur
      • Multi-passed diffusion
      • Separable diffusion
      • Summed-area table
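The slides list the summed-area table method without describing it, so here is a minimal Python sketch of the general technique: build the table once, then average any rectangle in constant time. Using the COC as a per-pixel box radius (`radius_map`) is an assumption for illustration, not necessarily how the thesis applies it; all function names are hypothetical.

```python
def build_sat(image):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(image), len(image[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += image[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_mean(sat, x0, y0, x1, y1):
    """Mean over the inclusive rectangle (x0, y0)-(x1, y1) using four lookups."""
    total = sat[y1 + 1][x1 + 1] - sat[y0][x1 + 1] - sat[y1 + 1][x0] + sat[y0][x0]
    return total / ((x1 - x0 + 1) * (y1 - y0 + 1))


def sat_blur(image, radius_map):
    """Box-blur each pixel with its own integer radius (assumed to come from the COC)."""
    h, w = len(image), len(image[0])
    sat = build_sat(image)
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            r = radius_map[y][x]
            out_row.append(box_mean(sat,
                                    max(0, x - r), max(0, y - r),
                                    min(w - 1, x + r), min(h - 1, y + r)))
        out.append(out_row)
    return out
```

The cost per pixel is independent of the blur radius, which is the usual reason for considering this method for large COC values.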
  8. Methods – Poisson disc blur
      • Distribution function
      • COC defines scale
      • Downscaled image
  9. Methods – Poisson disc blur
      • Calculate values and interpolate depending on COC
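The two Poisson disc slides can be illustrated with a small CPU sketch in Python. It makes several assumptions that the slides do not confirm: the sample offsets are generated by simple dart throwing, the image is a grayscale list of rows, samples use nearest-neighbour lookup, and the interpolation against a downscaled image is left out. All names are hypothetical.

```python
import math
import random


def poisson_disc_offsets(count, min_dist, seed=0, max_tries=10000):
    """Dart throwing: points in the unit disc that are at least min_dist apart."""
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < count and tries < max_tries:
        tries += 1
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y > 1.0:
            continue  # outside the unit disc
        if all(math.hypot(x - px, y - py) >= min_dist for px, py in points):
            points.append((x, y))
    return points


def sample_clamped(image, x, y):
    """Nearest-neighbour lookup with coordinates clamped to the image."""
    h, w = len(image), len(image[0])
    xi = min(max(int(round(x)), 0), w - 1)
    yi = min(max(int(round(y)), 0), h - 1)
    return image[yi][xi]


def poisson_disc_blur(image, coc_radius, offsets):
    """Average the centre pixel and the disc samples, scaled by the per-pixel COC."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            r = coc_radius[y][x]  # the COC defines the scale of the sample disc
            total = image[y][x]
            for ox, oy in offsets:
                total += sample_clamped(image, x + ox * r, y + oy * r)
            out[y][x] = total / (len(offsets) + 1)
    return out
```

With a COC radius of zero every sample falls on the centre pixel, so in-focus pixels pass through unchanged.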
  10. Methods – Multi-passed diffusion
      • Every pixel gets a new value depending on the COC gradient
      • [Figure: results over successive iterations]
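One way to read the multi-passed diffusion slide is as repeated explicit heat-diffusion steps whose strength is controlled by the COC. The Python sketch below works under that assumption. The conductivity rule (the smaller normalised COC of two neighbouring pixels), the step size `dt` and all function names are illustrative choices, not taken from the thesis.

```python
def diffusion_pass(image, conductivity, dt=0.2):
    """One explicit diffusion step over the four direct neighbours."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    # Assumed rule: a link conducts only as well as its
                    # sharper endpoint, so in-focus pixels do not leak.
                    g = min(conductivity[y][x], conductivity[ny][nx])
                    acc += g * (image[ny][nx] - image[y][x])
            out[y][x] = image[y][x] + dt * acc
    return out


def multi_pass_diffusion(image, coc, iterations, dt=0.2):
    """Run several passes; conductivity is the COC normalised to [0, 1]."""
    max_coc = max(max(row) for row in coc) or 1.0
    conductivity = [[c / max_coc for c in row] for row in coc]
    for _ in range(iterations):
        image = diffusion_pass(image, conductivity, dt)
    return image
```

Keeping `dt` at or below 0.25 keeps this explicit scheme stable with four neighbours, and more iterations spread each pixel further, which matches the slide's emphasis on iterations.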
  11. Methods – Separable diffusion
      • Use a tridiagonal system to represent the heat conductivity
      • Cyclic reduction can solve the system for each row
  12. Methods – Separable diffusion
      • Each row is solved independently
      • In each step a reduced tridiagonal matrix (and output value) is calculated until the system is solved
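The row solve described on the two separable diffusion slides can be sketched as a recursive cyclic reduction in Python. Each level eliminates the even-indexed unknowns, solves the reduced tridiagonal system in the odd-indexed ones, then back-substitutes. How the system is built from the COC in `diffuse_row` (link conductivity equal to the squared smaller COC of two neighbours) is an assumption for illustration, as are all names.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    a[0] and c[-1] are ignored. Assumes a diagonally dominant system.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):  # keep the odd unknowns, eliminate the even ones
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))
    odd = cyclic_reduction(ra, rb, rc, rd)  # the reduced tridiagonal system
    x = [0.0] * n
    for k, i in enumerate(range(1, n, 2)):
        x[i] = odd[k]
    for i in range(0, n, 2):  # back-substitute the eliminated unknowns
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


def diffuse_row(row, coc_row):
    """One implicit diffusion step along a single image row."""
    n = len(row)
    # Assumed conductivity between pixel i and i+1.
    link = [min(coc_row[i], coc_row[i + 1]) ** 2 for i in range(n - 1)]
    a = [0.0] + [-g for g in link]
    c = [-g for g in link] + [0.0]
    b = [1.0 - a[i] - c[i] for i in range(n)]
    return cyclic_reduction(a, b, c, list(row))
```

Because every row is an independent system, a GPU version can assign one row per thread block and run the eliminations of one level in parallel; the recursion here only mirrors that level structure on the CPU.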
  13. GPGPU programming
      • General
          • Better flexibility
          • Potential advantages
      • CUDA
          • Extension of C
          • Large community
  14. GPGPU programming
      • Executes in chunks of threads
          • User specified blocks
      • Several memory types
          • Global
          • Texture
          • Shared
          • Constant
      • More choices and possibilities
          • Hardware specific limits
          • Great potential
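The "chunks of threads" and "user specified blocks" points can be illustrated by emulating the usual CUDA index arithmetic on the CPU. This Python sketch only mirrors the common pattern `x = blockIdx.x * blockDim.x + threadIdx.x` with a bounds guard; it is not CUDA code, and the 16 by 16 block size and the function names are assumptions.

```python
def launch_grid(width, height, block=(16, 16)):
    """Blocks needed per axis so that every pixel gets at least one thread."""
    bx, by = block
    return ((width + bx - 1) // bx, (height + by - 1) // by)


def covered_pixels(width, height, block=(16, 16)):
    """Enumerate the pixels touched by an emulated launch, in launch order."""
    bx, by = block
    gx, gy = launch_grid(width, height, block)
    pixels = []
    for block_y in range(gy):
        for block_x in range(gx):
            for ty in range(by):
                for tx in range(bx):
                    x = block_x * bx + tx  # blockIdx.x * blockDim.x + threadIdx.x
                    y = block_y * by + ty
                    if x < width and y < height:  # guard for partial edge blocks
                        pixels.append((x, y))
    return pixels
```

When the image size is not a multiple of the block size, the last blocks contain idle threads, which is one of the block-size considerations behind the "rules" mentioned later in the deck.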
  15. GPGPU programming
      • Gaussian blur timings
  16. GPGPU programming
      • Implementation impact using CUDA
          • Pros
              • Easy to get started (C)
              • Memory indexing (no more floating point texture indices)
              • Good support for timing on the GPU
              • Good control over computations (threads and memory)
          • Cons
              • A lot of "rules" (number of threads, occupancy, etc.)
              • Hard to optimize
              • Beta problems (lack of interop, slow operations)
  17. Results
      • HLSL and CUDA for most methods
          • Exceptions
              • Poisson disc (HLSL only)
              • Summed-area table (CUDA only)
          • Timings in runs of 100 on recent hardware
  18. Results
      • Poisson disc timings
      • Separable simulated diffusion timings
      • Multi-passed diffusion timings
  19. Results
      • Artifacts
          • Color leaking
          • Sharp edges
  20. Results
      • Input data
  21. Results
      • Poisson disc
      • Multi-passed diffusion
      • Separable simulated diffusion
  22. Results
      • Poisson disc
      • Multi-passed diffusion
      • Separable simulated diffusion
  23. Results
      • Lens parameter settings
  24. Conclusions
      • Current depth of field filters are good enough
          • Not really, but better is too expensive
          • Cut scenes do allow time for more computation
      • GPGPU techniques have great potential
          • Not mature enough (hardware support etc.)
          • Maybe better for other things than image processing
      • Future work
          • Diffusion-based approach offers the best visual quality
          • Compute shaders, anyone?
  25. Videos
  26. End