Practical Occlusion Culling in Killzone 3
Michal Valient, Lead Tech, Guerrilla B.V.
Talk takeawayOcclusion culling system used in Killzone 3The reasons why to use software rasterization(Some) technical d...
Talk takeawayOcclusion culling system used in Killzone 3You’ll love software rasterization(Some) technical detailsHow ...
Talk takeawayOcclusion culling system used in Killzone 3You’ll love software rasterization(Some) technical detailsHow ...
Talk takeawayOcclusion culling system used in Killzone 3You’ll love software rasterization(Some) technical detailsHow ...
Talk takeawayOcclusion culling system used in Killzone 3You’ll love software rasterization(Some) technical detailsHow ...
Killzone 3 visibility solutionSoftware rasterization running on SPUsRender occluders into depth buffer –Use simplified v...
Why software rasterizationPrevious solution did not scale well –Manually placed portals
Why software rasterizationPrevious solution did not scale well –Manually placed portalsWorks automatically –Can be enabl...
Why software rasterizationPrevious solution did not scale well –Manually placed portalsWorks automatically –Can be enabl...
Why software rasterizationPrevious solution did not scale well –Manually placed portalsWorks automatically –Can be enabl...
Implementation overview
Occluder setupFirst stage, not parallelOutputs clipped + projected triangles –One list of triangle data –One “index” DMA...
RasterizationSplit 640x360p depth buffer into 16 pixel-high strips –Rasterize in parallel, one SPU job per strip –Load tr...
RasterizationCompress depth buffer –One output pixel is maximum depth of 16x16 block.  –But patch single pixel holes firs...
Occlusion testsTests happen in parallelEach object consists of one or more partsFirst test object bounding box –Skip fo...
Occlusion testsAccurate tests –Bounding box rasterization and depth test –Working on small depth buffer, be conservative...
Generating good occluders
Where to get occludersAiming for automated solutionOriginally wanted to use scene geometry –Reduced polygon count –Too m...
Where to get occluders
How to select good occludersSimple heuristics to identify good occluders –Discard anything which is small –Discard by met...
Artist generated occluders
Artist generated occluders
ConclusionSoftware rasterization is great –Fast on SPUs –Easy to integrate –Very accurate (if occluders are accurate)Cre...
ConclusionSpecial thanks to Will Vale (Second Intention Ltd) forimplementing this system for us.
Statistics100 occluders, 1500 trianglesTest 1000 objects, 2700 partsTimings –Setup job: 0.5ms –Rasterize job: 2.0ms (on...
Practical Occlusion Culling in Killzone 3
Practical Occlusion Culling in Killzone 3
Practical Occlusion Culling in Killzone 3
Practical Occlusion Culling in Killzone 3
Practical Occlusion Culling in Killzone 3
Upcoming SlideShare
Loading in …5
×

Practical Occlusion Culling in Killzone 3

12,492 views

Published on

Killzone 3 features complex occluded environments. To cull non-visible geometry early in the frame, the game uses PlayStation 3 SPUs to rasterize a conservative depth buffer and perform fast synchronous occlusion queries against it. This talk presents an overview of the approach and key lessons learned during its development.

  1. Practical Occlusion Culling in Killzone 3 (Michal Valient, Lead Tech, Guerrilla B.V.)
  2. Talk takeaway
     • Occlusion culling system used in Killzone 3
     • Reasons to use software rasterization
     • (Some) technical details
     • How to pick good occluders
     • Q&A
  3-6. Talk takeaway
     • Occlusion culling system used in Killzone 3
     • You’ll love software rasterization
     • (Some) technical details
     • How to pick good occluders
     • Q&A
     • Huge environments
     • Armed JetPacks
     • Giant Spider Robots
  7. Killzone 3 visibility solution
     • Software rasterization running on SPUs
     • Render occluders into depth buffer
       – Use simplified version of the scene geometry
     • Conservatively scale down
       – To make it fit into SPU memory
       – To make it faster to test against
     • Test all objects against small depth buffer
       – Test bounding boxes
  8-11. Why software rasterization
     • Previous solution did not scale well
       – Manually placed portals
     • Works automatically
       – Can be enabled early in production
     • Completely dynamic solution
       – Any object can become an occluder
     • Maps well to SPUs
       – No sync issues
       – No GPU costs related to visibility testing
  12. Implementation overview
  13. Occluder setup
     • First stage, not parallel
     • Outputs clipped + projected triangles
       – One list of triangle data
       – One “index” DMA list per rasterizer job
     • Caches are important at this stage
       – 2KB vertex array cache (90% hit rate)
       – 32-entry post-transform cache (60% hit rate)
       – Various double-buffered output caches
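
The slide does not say how the per-job "index" lists are built. One plausible reading, sketched below in Python for readability (the real system runs on SPUs), is that each projected triangle is binned into every 16-pixel-high strip its vertical extent touches. The function name `bin_triangles` and the vertex layout are illustrative assumptions, not taken from the talk.

```python
import math

def bin_triangles(triangles, height=360, strip_height=16):
    """Assumed binning scheme: return one list of triangle indices per strip.

    triangles: list of triangles, each three (x, y, z) screen-space vertices.
    """
    strip_count = math.ceil(height / strip_height)  # 360 / 16 rounds up to 23
    bins = [[] for _ in range(strip_count)]
    for index, tri in enumerate(triangles):
        ys = [vertex[1] for vertex in tri]
        y_min, y_max = min(ys), max(ys)
        if y_max < 0 or y_min >= height:
            continue  # entirely above or below the screen
        first = max(0, int(y_min) // strip_height)
        last = min(strip_count - 1, int(y_max) // strip_height)
        for strip in range(first, last + 1):
            bins[strip].append(index)
    return bins
```

With a 640x360 buffer this yields 23 strips (the last one only 8 rows high), which matches the 23 rows of the occlusion buffer mentioned later in the talk.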
  14. Rasterization
     • Split 640x360p depth buffer into 16-pixel-high strips
       – Rasterize in parallel, one SPU job per strip
       – Load triangles using the prepared DMA list
     • Traditional scanline rasterizer
       – Fill internal 640x16 floating point depth buffer
     • Vectorization is the key
       – Set up three or four edges at once
       – Generate 4x1 pixels at once
       – Optimize in assembly
  15. Rasterization
     • Compress depth buffer
       – One output pixel is maximum depth of 16x16 block
       – But patch single pixel holes first
       – Encode as uint16, reserve 0xffffu for infinity
     • Output single scanline of 40x23 occlusion buffer
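
The compression step can be sketched roughly as follows. This is a minimal Python illustration, not the SPU implementation. Several details are assumptions: depth is taken to be a float in [0, 1] with `float("inf")` marking uncovered pixels, holes are patched only along a row using the farther of the two horizontal neighbours, and the encoding rounds up so the stored value never claims an occluder is closer than it is.

```python
import math

INFINITE = 0xFFFF  # reserved code: nothing was rasterized into this block

def encode_depth(depth):
    """Encode a [0, 1] depth as uint16, rounding up to stay conservative."""
    if math.isinf(depth):
        return INFINITE
    clamped = min(max(depth, 0.0), 1.0)
    return min(math.ceil(clamped * 0xFFFE), 0xFFFE)

def patch_single_pixel_holes(row):
    """Fill an empty pixel whose left and right neighbours are both covered."""
    patched = list(row)
    for x in range(1, len(row) - 1):
        if (math.isinf(row[x]) and not math.isinf(row[x - 1])
                and not math.isinf(row[x + 1])):
            patched[x] = max(row[x - 1], row[x + 1])
    return patched

def compress_strip(strip, block=16):
    """Reduce one strip (list of rows of floats) to one scanline of codes."""
    rows = [patch_single_pixel_holes(row) for row in strip]
    width = len(rows[0])
    scanline = []
    for bx in range(0, width, block):
        farthest = max(max(row[bx:bx + block]) for row in rows)
        scanline.append(encode_depth(farthest))
    return scanline
```

Taking the maximum (farthest) depth per block is what keeps the small buffer conservative: a block only occludes as much as its weakest pixel does, so a single empty pixel would otherwise open the whole block, which is presumably why holes are patched first.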
  16. Occlusion tests
     • Tests happen in parallel
     • Each object consists of one or more parts
     • First test object bounding box
       – Skip for objects visible last frame
     • Then test individual parts
     • Continue with submesh culling
       – Small Spatial Kd-Tree inside most meshes
       – Allows for culling arbitrarily small mesh chunks
  17. Occlusion tests
     • Accurate tests
       – Bounding box rasterization and depth test
       – Working on small depth buffer, be conservative
     • Fast bounding sphere tests
       – Precomputed hierarchical reject data
       – Constant time test for small spheres
       – Only used for fast reject
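
To make "be conservative" concrete, here is a simplified Python sketch of a query against the small occlusion buffer. It is not the method from the talk, which rasterizes the bounding box itself: this version tests only the box's screen-space rectangle and its nearest depth. The function name, the rectangle input, and the uint16 depth convention (larger is farther, 0xFFFF is empty) are assumptions for illustration.

```python
def is_box_visible(occlusion, rect, nearest_depth, cell=16):
    """Return True unless every covered cell is closer than the box.

    occlusion: rows of uint16 cell depths (for example 23 rows of 40 cells).
    rect: (x0, y0, x1, y1) in full-resolution pixels.
    nearest_depth: nearest depth of the box as a float in [0, 1].
    """
    rows, cols = len(occlusion), len(occlusion[0])
    x0, y0, x1, y1 = rect
    # Round the rectangle outward to whole cells, then clamp to the buffer.
    cx0 = max(0, int(x0) // cell)
    cy0 = max(0, int(y0) // cell)
    cx1 = min(cols - 1, int(x1) // cell)
    cy1 = min(rows - 1, int(y1) // cell)
    if cx0 > cx1 or cy0 > cy1:
        return False  # rectangle lies entirely off-screen
    # Round the box depth down so ties count as visible.
    clamped = min(max(nearest_depth, 0.0), 1.0)
    box_code = int(clamped * 0xFFFE)
    for cy in range(cy0, cy1 + 1):
        for cx in range(cx0, cx1 + 1):
            if box_code <= occlusion[cy][cx]:
                return True  # occluders here are no closer than the box
    return False
```

Every rounding decision leans toward "visible": the rectangle grows to cell boundaries, the box depth rounds toward the camera, and an empty cell always passes. A false "visible" only costs some rendering time, while a false "occluded" makes geometry disappear.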
  18. Generating good occluders
  19. Where to get occluders
     • Aiming for automated solution
     • Originally wanted to use scene geometry
       – Reduced polygon count
       – Too many errors, in general does not work
     • Now using physics mesh
       – Closed, low polygon meshes
       – Not always conservative in the right sense
       – Visual mesh can be inside physics mesh causing drops
  20. Where to get occluders
  21. How to select good occluders
     • Simple heuristics to identify good occluders
       – Discard anything which is small
       – Discard by meta data: clutter, set dressing, foliage, railings…
       – Discard if surface area is significantly smaller than bounding box surface area
     • Artists can override the process
       – Still creating the best occluders by hand
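
The three discard rules could be combined into a filter along these lines. This is a hedged Python sketch: the `Candidate` record, the `min_extent` and `min_area_ratio` thresholds, and the tag spellings are hypothetical, since the talk gives no concrete values.

```python
from dataclasses import dataclass, field

# Tags taken from the slide's examples; the exact strings are assumed.
EXCLUDED_TAGS = {"clutter", "set dressing", "foliage", "railings"}

@dataclass
class Candidate:
    name: str
    size: tuple            # bounding box extents (x, y, z), assumed metres
    surface_area: float    # total surface area of the mesh
    tags: set = field(default_factory=set)

def box_surface_area(size):
    """Surface area of an axis-aligned box with the given extents."""
    x, y, z = size
    return 2.0 * (x * y + y * z + x * z)

def is_good_occluder(candidate, min_extent=2.0, min_area_ratio=0.5):
    """Apply the three discard rules; both thresholds are made-up values."""
    if max(candidate.size) < min_extent:
        return False  # too small to hide anything useful
    if candidate.tags & EXCLUDED_TAGS:
        return False  # excluded by meta data
    # A sparse mesh (scaffolding, lattice) fills little of its bounding box.
    ratio_floor = min_area_ratio * box_surface_area(candidate.size)
    return candidate.surface_area >= ratio_floor
```

An artist override, as the slide mentions, would sit on top of this: a per-asset flag that forces the result either way regardless of the heuristics.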
  22-23. Artist generated occluders
  24. Conclusion
     • Software rasterization is great
       – Fast on SPUs
       – Easy to integrate
       – Very accurate (if occluders are accurate)
     • Creating occluders is hard
       – Automatic system was not enough
       – Plan for this in content creation
       – Define workflow for finding and fixing leaks
     • Voxelization anyone?
  25. Conclusion
     • Special thanks to Will Vale (Second Intention Ltd) for implementing this system for us.
  26. Statistics
     • 100 occluders, 1500 triangles
     • Test 1000 objects, 2700 parts
     • Timings
       – Setup job: 0.5ms
       – Rasterize job: 2.0ms (on 5 SPUs)
       – Query job: 4.5ms (on 5 SPUs)
       – Overall latency: ~2ms