Embarrassingly Parallel Computation for Occlusion Culling

One of the key challenges for modern 3D game rendering engines powering the next generation of console games is to minimize resources spent on assets that do not actually contribute to the user experience. More specifically, determining which surfaces are hidden behind (occluded by) other surfaces can be a very hard problem to solve in real time, but solving it will typically yield significant performance gains.

Real-time occlusion culling typically requires either a vast amount of manual labor or a computationally intensive pre-processing step. In this talk, I will show how the occluder generation step can actually be considered embarrassingly parallel, and distributed across multiple nodes accordingly. I will also discuss how this model can be further improved.

Transcript

  • 1. Embarrassingly Parallel Computation for Visibility
       Jasin Bushnaief, Umbra Software
  • 2. Who are we?
      • The only occlusion culling middleware company in the world
      • Founded in 2006
      • Based in Helsinki
      • 12 people
      • Customers: Bungie (Halo), Guerrilla (Killzone), Remedy (Alan Wake), BioWare (Mass Effect), CD Projekt (Witcher), ArenaNet (Guild Wars) and many more
  • 3. We’re going to talk about
      • The past
          – Brief introduction to occlusion culling
          – Traditional methods of visibility computation
      • The present
          – Umbra’s visibility computation algorithm
          – How it can be distributed
      • The future
          – Challenges of modern games and engines
  • 4. The Past: SO, WHAT’S OCCLUSION CULLING ANYWAY?
  • 5. Graphics in games
      • Game development process:
          – Artists create content
          – Engine runtime renders it
      • Rendering
          – Content consists of objects
          – Which consist of triangles
          – Which get rendered by the GPU
      • Our business: rendering optimization
  • 6. Occlusion culling explained
      • “Culling is the process of removing breeding animals from a group based on specific criteria.” (Wikipedia)
      • Hidden surface removal: “Which surfaces do not contribute to the final rendered image on the screen?”
      • Some popular HSR methods:
          – Frustum culling
          – Backface culling
          – Occlusion culling
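The first two HSR methods listed on this slide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration (the function names, counter-clockwise triangle winding, and frustum planes stored as inward-facing `(normal, d)` pairs); it is not taken from the talk or from any particular engine.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def is_backfacing(v0, v1, v2, camera_pos):
    """Backface culling: a counter-clockwise triangle whose normal points
    away from the camera cannot contribute to the image."""
    normal = cross(sub(v1, v0), sub(v2, v0))
    return dot(normal, sub(camera_pos, v0)) <= 0.0

def sphere_in_frustum(center, radius, planes):
    """Frustum culling: keep an object's bounding sphere unless it lies
    entirely behind one of the frustum planes.
    planes: list of (unit_normal, d), normals pointing into the frustum."""
    return all(dot(n, center) + d >= -radius for n, d in planes)
```

Neither test knows anything about other geometry in the scene, which is why occlusion culling is listed as a separate method.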
  • 7. Occlusion culling explained
      • Occlusion culling: “Which surfaces are blocked (occluded) by other surfaces?”
      • Depth buffering is one way to do OC
          – Very accurate (i.e. pixel level)
          – Ubiquitous on hardware, easy problem to solve
          – Occurs very late in the pipeline
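A toy software depth buffer shows why depth buffering is pixel-accurate but late. This is only a sketch: the fragment tuple layout `(x, y, depth, color)` and the function name are assumptions, and real GPUs do this in hardware.

```python
def resolve_depth(fragments, width, height):
    """Keep, per pixel, the colour of the fragment nearest to the camera."""
    depth = [[float("inf")] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    for x, y, z, c in fragments:
        # Every fragment reaches this test, even ones that end up hidden.
        if z < depth[y][x]:
            depth[y][x] = z
            color[y][x] = c
    return color
```

The occluded fragments are rejected correctly, but only after they have already been generated, which is the cost that higher-level culling tries to avoid.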
  • 8. Occlusion culling explained
      • Higher-level methods complement depth buffering nicely
      • These cull entire objects, groups of objects or entire sections of the scene
          – Not easy!
      • The earlier, the better
  • 9. Occlusion culling
       Only the objects visible to the camera are rendered
  • 10. “Traditional” way to do OC
      • Preprocess:
          – Divide scene into cells
          – Compute visibility between cells
              • Results in a visibility matrix (PVS)
      • Runtime:
          – Locate the camera
          – Do a lookup into the PVS matrix
  • 11. Simple example
  • 12. Split scene into cells
       [Figure: scene divided into a 2 x 2 grid of cells, A B on the top row and C D on the bottom row]
  • 13. Compute visibility (sampling)
       [Figure: cells A B / C D next to the visibility matrix; row A filled in]
            A B C D
          A 1 1 1 0
          B
          C
          D
  • 14. Compute visibility
       [Row B filled in]
            A B C D
          A 1 1 1 0
          B 1 1 0 1
          C
          D
  • 15. Compute visibility
       [Row C filled in]
            A B C D
          A 1 1 1 0
          B 1 1 0 1
          C 1 0 1 1
          D
  • 16. Compute visibility
       [Row D filled in]
            A B C D
          A 1 1 1 0
          B 1 1 0 1
          C 1 0 1 1
          D 0 1 1 1
  • 17. Runtime PVS culling
       [Figure: cells A B / C D next to the completed visibility matrix]
            A B C D
          A 1 1 1 0
          B 1 1 0 1
          C 1 0 1 1
          D 0 1 1 1
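The runtime half of the PVS approach (locate the camera, look up a matrix row) can be sketched as follows. The matrix values are the ones from the slides; the coordinate convention (each cell is a unit square, A at the top left, y growing downward) and the function names are assumptions made for illustration.

```python
CELLS = ["A", "B", "C", "D"]

# Precomputed visibility matrix from the slides: PVS[i][j] == 1 means
# cell j is potentially visible from cell i.
PVS = [
    [1, 1, 1, 0],  # from A
    [1, 1, 0, 1],  # from B
    [1, 0, 1, 1],  # from C
    [0, 1, 1, 1],  # from D
]

def locate_cell(x, y):
    """Map a camera position in the 2 x 2 scene to a cell index."""
    col = 0 if x < 1.0 else 1
    row = 0 if y < 1.0 else 1
    return row * 2 + col

def potentially_visible(x, y):
    """Runtime query: a single row lookup, no geometry involved."""
    row = PVS[locate_cell(x, y)]
    return [name for name, flag in zip(CELLS, row) if flag]
```

The query is trivially cheap; all of the cost sits in producing the matrix beforehand.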
  • 18. Problem?
      • Solving visibility between cells is very difficult
          – E.g. solving analytically is actually O(n^4)
      • Global operation by nature
      • Doesn’t play well with dynamic scenes
          – Worst case: a change in one cell requires recomputation of the entire matrix
  • 19. The Present: UMBRA DOES IT BETTER
  • 20. Welcome to the 2010s
      • Modern game worlds are huge
      • So it’d be cool if you didn’t need the entire scene in memory, ever
      • It’d be even cooler if the heavy lifting could be distributed. Or sent to the Cloud™
      • Buildings collapse. Things change.
  • 21. The Umbra approach
      • Don’t actually compute visibility for the entire scene
      • Instead, process geometry to create a data structure to solve visibility in the runtime
      • Portal culling in the runtime
  • 22. Data generation
      • Data = portal graph
      • Generate local graphs individually for reasonably-sized geometry chunks (tiles), in parallel
      • Combine the results into a global portal graph that can be quickly traversed
      • Solve visibility quickly in the runtime using this graph
  • 23. Will this work?
      • Portal generation
          – Is very hard, but possible to do automatically
          – Only local geometry needed
          → Pretty much an embarrassingly parallel problem
      • Runtime
          – Not as simple as a PVS lookup, but still quite fast
  • 24. Simple example revisited
  • 25. Split geometry into tiles
  • 26. Dispatch tiles to worker nodes
       [Figure: Tile 0, Tile 1, Tile 2, Tile 3]
  • 27. Generate portals
       [Figure: Tile 0, Tile 1, Tile 2, Tile 3]
  • 28. Combine portal graph
  • 29. Runtime query: traverse portals
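A runtime portal query can be sketched as a graph traversal that narrows the view as it passes through each portal. This is a deliberately simplified 2D model, not Umbra's actual algorithm: the view and every portal are reduced to angular intervals in degrees as seen from a fixed camera, and the graph layout and function name are assumptions.

```python
def visible_cells(graph, start, view):
    """Traverse the portal graph from the camera's cell.

    graph: cell -> list of (neighbour, (lo, hi)), where (lo, hi) is the
           angular extent of the connecting portal as seen from the camera.
    view:  (lo, hi) angular extent of the camera's field of view.
    """
    visible = set()
    stack = [(start, view, (start,))]
    while stack:
        cell, (lo, hi), path = stack.pop()
        visible.add(cell)
        for neighbour, (p_lo, p_hi) in graph.get(cell, []):
            if neighbour in path:
                continue  # do not walk back through cells already on this path
            # Clip the current view against the portal opening.
            n_lo, n_hi = max(lo, p_lo), min(hi, p_hi)
            if n_lo < n_hi:  # something is still visible through the portal
                stack.append((neighbour, (n_lo, n_hi), path + (neighbour,)))
    return visible
```

Cells that are never reached are culled together with everything they contain, which is what makes this a higher-level method than depth buffering.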
  • 30. What did we do here?
      • Essentially a map-reduce
          – Split scene into distributable tiles
          – Generate local portal graph for each tile
          – Combine results, link global portal graph
      [Diagram: Scene → Map → Tile 0 ... Tile n → Portals 0 ... Portals n → Reduce → Global portal graph → Query (runtime) → Visible objects]
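The split, per-tile generation, and combine steps above can be sketched on a toy scene. Everything here is a stand-in chosen for illustration: the scene is a 2D occupancy grid (0 = open, 1 = wall), a "portal" is just a pair of open cells facing each other across a tile border, each tile is treated as a single graph node, and a thread pool plays the role of the worker nodes. Real portal generation is far more involved.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_tiles(grid, size):
    """Split the scene into size x size tiles keyed by (tile_row, tile_col)."""
    tiles = {}
    for r in range(0, len(grid), size):
        for c in range(0, len(grid[0]), size):
            tiles[(r // size, c // size)] = [row[c:c + size] for row in grid[r:r + size]]
    return tiles

def generate_local_portals(item):
    """Map step: needs only one tile's geometry, so tiles are independent.
    Returns the offsets of open cells along each border of the tile."""
    key, tile = item
    return key, {
        "N": {i for i, v in enumerate(tile[0]) if v == 0},
        "S": {i for i, v in enumerate(tile[-1]) if v == 0},
        "W": {i for i, row in enumerate(tile) if row[0] == 0},
        "E": {i for i, row in enumerate(tile) if row[-1] == 0},
    }

def combine(local):
    """Reduce step: link neighbouring tiles whose border openings line up."""
    graph = {key: set() for key in local}
    for (r, c), borders in local.items():
        for dr, dc, side, opposite in ((0, 1, "E", "W"), (1, 0, "S", "N")):
            other = (r + dr, c + dc)
            if other in local and borders[side] & local[other][opposite]:
                graph[(r, c)].add(other)
                graph[other].add((r, c))
    return graph

def build_portal_graph(grid, size):
    tiles = split_into_tiles(grid, size)
    with ThreadPoolExecutor() as pool:  # stand-in for remote worker nodes
        local = dict(pool.map(generate_local_portals, tiles.items()))
    return combine(local)
```

Because `generate_local_portals` reads nothing outside its own tile, the map step could be dispatched to any number of workers without coordination; only `combine` needs to see more than one tile.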
  • 31. The Future: THE NEXT GENERATION
  • 32. Turns out...
      • Even the initial “map” is too much for large game worlds
      • A global graph of a vast world is too expensive in the runtime
      • You need to support multiple versions of some chunks for dynamic content
          – Quite a combinatorial problem
      → Next-gen games require an even better solution!
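One way to read the "combinatorial problem" is as a counting argument; this is an interpretation, not something the slides state. If every combination of chunk versions needed its own precombined global graph, the count would be the product of the per-chunk version counts, whereas keeping per-chunk graphs separate needs only their sum. The function names are hypothetical.

```python
import math

def precombined_graph_count(versions_per_tile):
    """Global graphs needed if every version combination is combined offline."""
    return math.prod(versions_per_tile)

def per_tile_graph_count(versions_per_tile):
    """Local graphs needed if tiles are stored separately and combined later."""
    return sum(versions_per_tile)

# Example: 10 dynamic tiles, each with an intact and a destroyed version.
versions = [2] * 10
print(precombined_graph_count(versions), per_tile_graph_count(versions))
```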
  • 33. So we did something like this
       [Diagram: Tile 0 ... Tile n → Portals 0 ... Portals n; in the runtime, per-tile portal data is combined into separate graphs (Graph A, Graph B), and each graph is queried for visible objects]
  • 34. Got rid of “map”
       [Same diagram]
  • 35. Split up “reduce”, moved to runtime
       [Same diagram]
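Moving the combine step into the runtime might look roughly like the sketch below: per-tile portal data is kept separate, and only the tiles relevant to the current query are linked into a small graph. The selection rule (tiles within a fixed radius of the camera's tile), the border-set data layout, and the function names are all assumptions; the slides do not say how tiles are chosen for Graph A or Graph B.

```python
def tiles_near(camera_tile, radius, available):
    """Pick the tiles within `radius` tiles of the camera's tile."""
    r0, c0 = camera_tile
    return {(r, c) for (r, c) in available
            if abs(r - r0) <= radius and abs(c - c0) <= radius}

def combine_subset(local, wanted):
    """Runtime combine: link only the requested tiles.

    local: tile -> {"N", "S", "W", "E"} sets of open border offsets,
           as produced offline, one entry per tile.
    """
    graph = {key: set() for key in wanted}
    for (r, c) in wanted:
        for dr, dc, side, opposite in ((0, 1, "E", "W"), (1, 0, "S", "N")):
            other = (r + dr, c + dc)
            if other in wanted and local[(r, c)][side] & local[other][opposite]:
                graph[(r, c)].add(other)
                graph[other].add((r, c))
    return graph
```

With this split, replacing one tile's data (for example after a building collapses) only affects the small graphs that include that tile, and no global graph ever has to be held in memory.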
  • 36. Questions? jasin@umbrasoftware.com