GPU-based methods can be summarized like this: after we have rendered the frame we can very efficiently figure out which objects were actually visible, or we can make guesses based on the previous frame. However, these guesses can fail, making the approach unreliable. What we want is to figure out visibility independently of the rendering pipeline. This also greatly expands the use cases of the visibility data.
These are the requirements we had; we need to create a data structure and algorithm within these bounds. For any input 3D scene, memory usage and processing time of the visibility system must be bounded and independent of the complexity of the input. The whole process should be automatic. It should allow fast content iteration. And it should set no limits on the size of the game worlds.
So what is visibility? It's the set of all the lines in space which do not intersect occluding geometry. Creating a data structure for an exact visibility representation from a polygon soup is theoretically possible, but in practice the data structure would be so large and so slow to compute that it's unusable. Instead we need to use conservative visibility, where we have bounded time and space and within these bounds give the best results we can. Visibility is ultimately reasoning about space. An interesting thing to note here is that once we have a topological data structure like this, it can be used for other things as well, not just visibility.
Here you're looking at a top-down view of an example scene. The requirement was that there should be no manual markup or other requirements on the input geometry, so what we take as input is all the level geometry as it is. We really don't have any other information besides the long list of polygons, which are possibly grouped into objects. Doing geometric operations with polygons directly has all kinds of difficulties related to floating-point accuracy. Also, almost all real-life game levels contain some small modeling errors such as t-vertices and cracks between objects. We need to be able to work with this kind of input as well.
What we do next is voxelize all the geometry. The great thing about voxelization is that it removes all the nasty problems with floating-point accuracy and automatically removes common modeling errors such as cracks and t-vertices in the geometry. Voxelization also discretizes the input, making the following processing independent of polygon count. In effect we can choose the resolution of the input data, which is important for the goal of creating a bounded-size data structure. The input could have billions of triangles, but after this step we can throw all the original geometry away and work on the voxels instead. The bad thing about voxelization is that it requires quite a lot of memory. In fact, since we need accurate visibility data we have to make the voxels quite small, and the number of them might be measured in billions or even hundreds of billions for larger levels. Even compressed, this data can take gigabytes of memory. The memory requirements alone indicate we need to refine this voxel representation further into something else to make it usable in practice.
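To make this step a bit more concrete, here is a minimal, hypothetical sketch of conservatively marking the voxels a single triangle may touch. It only checks the triangle's bounding box and plane, whereas a real voxelizer would use an exact triangle-box overlap test and a compressed sparse representation rather than a dense bit array; all names and types here are made up for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration only.
struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3  cross(Vec3 a, Vec3 b) { return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x}; }
static float dot(Vec3 a, Vec3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Dense bit grid standing in for a real compressed sparse voxel structure.
struct VoxelGrid {
    Vec3 origin; float voxelSize; int nx, ny, nz;
    std::vector<bool> solid;
    VoxelGrid(Vec3 o, float s, int x, int y, int z)
        : origin(o), voxelSize(s), nx(x), ny(y), nz(z), solid(std::size_t(x) * y * z, false) {}
    void set(int x, int y, int z) { solid[(std::size_t(z) * ny + y) * nx + x] = true; }
};

// Conservatively mark voxels a triangle may intersect: every voxel inside the
// triangle's bounding box whose cube straddles the triangle plane. This is a
// superset of the exact answer; an exact test would add triangle-box SAT checks.
void voxelizeTriangle(VoxelGrid& g, Vec3 a, Vec3 b, Vec3 c)
{
    Vec3  n = cross(sub(b, a), sub(c, a));   // triangle plane normal
    float d = dot(n, a);                     // plane offset

    auto toVoxel = [&](float v, float o) { return int(std::floor((v - o) / g.voxelSize)); };
    int x0 = std::max(0, toVoxel(std::min({a.x, b.x, c.x}), g.origin.x));
    int y0 = std::max(0, toVoxel(std::min({a.y, b.y, c.y}), g.origin.y));
    int z0 = std::max(0, toVoxel(std::min({a.z, b.z, c.z}), g.origin.z));
    int x1 = std::min(g.nx - 1, toVoxel(std::max({a.x, b.x, c.x}), g.origin.x));
    int y1 = std::min(g.ny - 1, toVoxel(std::max({a.y, b.y, c.y}), g.origin.y));
    int z1 = std::min(g.nz - 1, toVoxel(std::max({a.z, b.z, c.z}), g.origin.z));

    for (int z = z0; z <= z1; ++z)
    for (int y = y0; y <= y1; ++y)
    for (int x = x0; x <= x1; ++x) {
        // Signed distances of the voxel cube's corners to the triangle plane.
        float mn = 1e30f, mx = -1e30f;
        for (int corner = 0; corner < 8; ++corner) {
            Vec3 p = { g.origin.x + (x + ( corner       & 1)) * g.voxelSize,
                       g.origin.y + (y + ((corner >> 1) & 1)) * g.voxelSize,
                       g.origin.z + (z + ((corner >> 2) & 1)) * g.voxelSize };
            float s = dot(n, p) - d;
            mn = std::min(mn, s); mx = std::max(mx, s);
        }
        if (mn <= 0.0f && mx >= 0.0f)  // the cube straddles the plane
            g.set(x, y, z);
    }
}
```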
The approach we chose is to create a cell-and-portal graph by grouping the voxels together based on proximity and connectivity. Cells are created from groups of voxels, and portals are then created on the boundaries of these groups. We chose to create portals because in the past they have been proven to be an efficient way to represent visibility, and we solve the issues of manually placed portals by generating them automatically. In contrast to manually placed portals, we might generate thousands of portals, which allows us to have accurate visibility in outdoor spaces as well. By controlling the number of output cells and portals we can choose the output resolution of the visibility data so that it meets the memory and performance requirements.
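Conceptually, the output of this step can be pictured with a layout roughly like the following; this is an illustrative, uncompressed sketch, not the actual packed runtime format.

```cpp
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// A portal is an opening on the boundary between two groups of voxels.
// It connects one cell to another and has a bounded extent in space.
struct Portal {
    std::uint32_t targetCell;   // index of the cell on the other side
    Vec3          boundsMin;    // axis-aligned extent of the opening
    Vec3          boundsMax;
};

// A cell is a region of space built from a group of connected voxels.
struct Cell {
    std::vector<Portal>        portals;        // openings leading to neighboring cells
    std::vector<std::uint32_t> staticObjects;  // static objects contained in this cell
};

struct CellPortalGraph {
    std::vector<Cell> cells;
};
```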
In addition to the cell-and-portal graph, we need to be able to figure out in which cell the viewpoint is located when making a visibility query. We create a KD-tree from the voxels, which can be packed very efficiently with only a few bits per node. The leaf nodes of the tree contain indices to cells in the graph.
The final data structure contains the cell-and-portal graph, the view tree, and a list of static objects contained in each cell. We call the output data structure a tome .. Tome of Visibility .. It contains all the information about the topological structure needed for visibility determination in a 3D scene.
Next we look at how this data is used to perform the visibility queries.
The basic operation is the visibility query, which uses the tome data to figure out what is visible from a specific view point inside the scene. The goal is to create a depth buffer, or occlusion buffer, from the tome, which can then be used to very efficiently test the visibility of objects.
The first step is to figure out which cell the view point is located in. This is done by traversing the view tree based on the query origin. This is a simple tree traversal and it's a fast operation.
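A simplified sketch of that lookup might look like this, assuming an explicit node struct for readability; the actual view tree packs each node into only a few bits.

```cpp
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Illustrative KD-tree node. The real view tree stores this information
// in a few bits per node; here it is expanded for readability.
struct ViewTreeNode {
    bool          isLeaf;
    std::uint8_t  splitAxis;    // 0 = x, 1 = y, 2 = z (inner nodes)
    float         splitPos;     // split plane position (inner nodes)
    std::uint32_t left, right;  // child node indices (inner nodes)
    std::uint32_t cellIndex;    // cell covering this region (leaf nodes)
};

// Locate the cell containing the query point by walking down the tree.
// This is only a handful of memory reads, so it is very cheap per query.
std::uint32_t locateCell(const std::vector<ViewTreeNode>& nodes, Vec3 viewPoint)
{
    std::uint32_t nodeIndex = 0; // root
    for (;;) {
        const ViewTreeNode& node = nodes[nodeIndex];
        if (node.isLeaf)
            return node.cellIndex;
        float coord = node.splitAxis == 0 ? viewPoint.x
                    : node.splitAxis == 1 ? viewPoint.y
                    :                       viewPoint.z;
        nodeIndex = (coord < node.splitPos) ? node.left : node.right;
    }
}
```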
There are several options for how we could use the portal data. We could do a traditional recursive portal traversal, clipping the view frustum as we go through each portal, or we could do ray tracing or cone tracing in the graph. The approach we chose is to rasterize the portals using a custom software rasterizer optimized for this purpose. With a rasterizer we need to touch each portal only once, as opposed to recursive portal traversal, which can suffer from exponential blowup if there are a lot of portal intersections in screen space. (We can also run other kinds of queries on the graph, such as connectivity.)
Another really useful property of the rasterizer is that it produces a depth buffer as output, which is an almost optimal data structure for doing further visibility tests on the query output. Also, with rasterization we can choose the output resolution based on the platform and accuracy requirements. Since we're rasterizing portals instead of occluders, it's trivial to implement conservative rasterization, which is a requirement for getting correct results at lower resolutions.
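As a heavily simplified illustration of the idea, the sketch below writes one portal into a small occlusion buffer, assuming the portal has already been projected and reduced to a conservative screen-space rectangle plus a single "how far can we see through it" depth, and assuming the buffer stores the farthest visible depth per pixel. The real rasterizer works on the actual portal polygons during the graph traversal and is optimized well beyond this.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Toy occlusion buffer: each pixel stores the farthest depth that is still
// visible from the query point through the portals rasterized so far.
struct OcclusionBuffer {
    int width, height;
    std::vector<float> farthestVisible;
    OcclusionBuffer(int w, int h) : width(w), height(h), farthestVisible(w * h, 0.0f) {}
};

// Conservatively rasterize one portal already projected to screen space:
// minX..maxX, minY..maxY is its extent in pixels and depthThroughPortal is the
// farthest depth reachable through it. Rounding the rectangle outward to whole
// pixels makes sure an open pixel is never marked closed at low resolutions.
void rasterizePortalRect(OcclusionBuffer& buf,
                         float minX, float minY, float maxX, float maxY,
                         float depthThroughPortal)
{
    int x0 = std::max(0, int(std::floor(minX)));
    int y0 = std::max(0, int(std::floor(minY)));
    int x1 = std::min(buf.width  - 1, int(std::ceil(maxX)));
    int y1 = std::min(buf.height - 1, int(std::ceil(maxY)));

    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x) {
            float& d = buf.farthestVisible[y * buf.width + x];
            d = std::max(d, depthThroughPortal); // visibility extends at least this far here
        }
}
```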
After we have generated the depth buffer, testing objects for visibility is a quite straightforward bounding box vs. depth buffer visibility test. If we keep the depth buffer resolution small enough to fit into the CPU cache, this test is super fast, and also trivial to parallelize.
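Under the same assumptions as the previous sketch (the buffer stores the farthest visible depth per pixel, and the object's bounding box has already been projected to a conservative screen-space rectangle plus a nearest depth), the test might look roughly like this:

```cpp
#include <algorithm>
#include <cmath>

// Test an object's bounds against the occlusion buffer. The object is
// potentially visible if any covered pixel lets us see at least as far as the
// object's nearest depth. With a small buffer this loop stays in the CPU cache,
// and independent objects can be tested on different threads.
bool isPotentiallyVisible(const float* farthestVisible, int width, int height,
                          float minX, float minY, float maxX, float maxY,
                          float nearestObjectDepth)
{
    int x0 = std::max(0, int(std::floor(minX)));
    int y0 = std::max(0, int(std::floor(minY)));
    int x1 = std::min(width  - 1, int(std::ceil(maxX)));
    int y1 = std::min(height - 1, int(std::ceil(maxY)));
    if (x0 > x1 || y0 > y1)
        return false; // completely off screen (frustum culling handles this case anyway)

    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
            if (nearestObjectDepth <= farthestVisible[y * width + x])
                return true; // this pixel can see at least as far as the object
    return false;
}
```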
So how about dynamic objects and occluders? Again, the depth buffer is great for combining occlusion from different sources. Testing the visibility of dynamic objects is already handled by the depth buffer, and we can also support dynamic occluders by rasterizing them into the same depth buffer. We have implemented another software rasterizer optimized for depth-only rasterization of triangle meshes, so we have two rasterizers, one for portals and one for occluder triangle meshes. Usually only a small portion of dynamic objects are actually good occluders, so the triangle counts here are small enough for software triangle rasterization.
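A toy sketch of rasterizing one dynamic occluder triangle into the same kind of buffer could look like the following; it uses the triangle's farthest vertex depth and pixel-center coverage so that occlusion is never over-estimated. This is an illustration only, not the actual occluder rasterizer.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// One occluder triangle already projected to screen space (x, y in pixels, z = view depth).
struct ScreenVertex { float x, y, z; };

static float edge(const ScreenVertex& a, const ScreenVertex& b, float px, float py)
{
    return (px - a.x) * (b.y - a.y) - (py - a.y) * (b.x - a.x);
}

// Writing min() into the buffer says "nothing behind this surface is visible here".
void rasterizeOccluderTriangle(std::vector<float>& farthestVisible, int width, int height,
                               ScreenVertex v0, ScreenVertex v1, ScreenVertex v2)
{
    float occluderDepth = std::max({v0.z, v1.z, v2.z}); // conservative: farthest point of the triangle

    int x0 = std::max(0, int(std::floor(std::min({v0.x, v1.x, v2.x}))));
    int y0 = std::max(0, int(std::floor(std::min({v0.y, v1.y, v2.y}))));
    int x1 = std::min(width  - 1, int(std::ceil(std::max({v0.x, v1.x, v2.x}))));
    int y1 = std::min(height - 1, int(std::ceil(std::max({v0.y, v1.y, v2.y}))));

    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x) {
            float px = x + 0.5f, py = y + 0.5f;
            // Pixel-center inside test (accepts either winding); under-estimates
            // coverage at edges, which is the safe direction for occluders.
            float e0 = edge(v0, v1, px, py), e1 = edge(v1, v2, px, py), e2 = edge(v2, v0, px, py);
            bool inside = (e0 >= 0 && e1 >= 0 && e2 >= 0) || (e0 <= 0 && e1 <= 0 && e2 <= 0);
            if (!inside)
                continue;
            float& d = farthestVisible[y * width + x];
            d = std::min(d, occluderDepth); // the occluder blocks everything behind it
        }
}
```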
One unique property of our chosen solution is that it's not restricted to from-point visibility. We can also do visibility queries from regions, such as a radius around the camera location. If the maximum camera movement speed is known, this can be used to conservatively predict visibility even over multiple frames.
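For example, a hypothetical helper could bound how many frames a from-region query result stays valid:

```cpp
// Hypothetical helper, for illustration only: the camera cannot leave the query
// region before it has moved the query radius, so the result is safe to reuse
// for at least this many frames.
int framesResultStaysValid(float queryRadiusMeters, float maxCameraSpeedMetersPerSec, float frameTimeSec)
{
    return int(queryRadiusMeters / (maxCameraSpeedMetersPerSec * frameTimeSec));
}
```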
We have taken the best parts of portal culling and rasterization: we rasterize automatically placed portals. The depth buffer can be used to combine occlusion from different sources.
Next we look at some demo videos. Please note that the following content is from our internal testing and not from Destiny. Hao will present real-life results from Destiny later in the talk.
To meet the requirements, we needed to come up with a data structure for spatial reasoning in a polygon soup. We used voxelization and automatic portal generation to create it.
With this data structure we can also solve other problems: shadow caster culling for faster shadow map rendering, connectivity queries and ray casts for game logic and AI, audio occlusion, speeding up lighting computation, and visibility-based data streaming.
The goal is to make the data generation fast enough that it can be done at runtime. Automated level analysis could be used to automatically find hotspots in the game level, for example to figure out what is the worst-case camera position for triangle count.
Automatic Software Occlusion Culling for Massive Streaming Worlds