Vertex Shader Tricks
New Ways to Use the Vertex Shader to Improve
Performance
Bill Bilodeau
Developer Technology Engineer, AMD
Topics Covered
● Overview of the DX11 front-end pipeline
● Common bottlenecks
● Advanced Vertex Shader Features
● Vertex Shader Techniques
● Samples and Results
Graphics Hardware
DX11 Front-End Pipeline
● VS – vertex data
● HS – control points
● Tessellator
● DS – generated vertices
● GS – primitives
● Write to UAV at all stages
● Starting with DX11.1
[Diagram: the pipeline stages (Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Stream Out) alongside a resource box (CB, SRV, or UAV) and a column of identical hardware blocks, each listing:
● Vector GPRs (256 2048-bit registers)
● Vector ALU (1 64-way single precision operation every 4 clocks)
● Scalar ALU (1 operation every 4 clocks)
● Scalar GPRs (256 64-bit registers)
● Vector/Scalar cross communication bus]
Bottlenecks - VS
● VS Attributes
● Limit outputs to 4 attributes (AMD)
●This applies to all shader stages (except PS)
● VS Texture Fetches
● Too many texture fetches can add latency
●Especially dependent texture fetches
●Group fetches together for better performance
●Hide latency with ALU instructions
Bottlenecks - VS
● Use the caches wisely
● Avoid large vertex formats
that waste pre-VS cache
space
● DrawIndexed() allows for
reuse of processed vertices
saved in the post-VS cache
●Vertices with the same index
only need to get processed once
Vertex Shader
Pre-VS Cache
(Hides Latency)
Input Assembler
Post-VS Cache
(Vertex Reuse)
Bottlenecks - GS
● GS
● Can add or remove primitives
● Adding new primitives requires storing new
vertices
●Going off chip to store data can be a bandwidth issue
● Using the GS means another shader stage
●This means more competition for shader resources
●Better if you can do everything in the VS
Advanced Vertex Shader Features
● SV_VertexID, SV_InstanceID
● UAV output (DX11.1)
● NULL vertex buffer
● VS can create its own vertex data
SV_VertexID
● Can use the vertex id to decide what
vertex data to fetch
● Fetch from SRV, or procedurally create a
vertex
VSOut VertexShader(uint id : SV_VertexID)
{
// g_VertexBuffer is an SRV bound to the VS, indexed by the vertex id
float3 vertex = g_VertexBuffer[id];
…
}
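A CPU-side sketch of which vertex IDs the shader receives, following the editor's note for this slide: with a non-indexed Draw() the ID counts up from 0, and with DrawIndexed() the ID is the value read from the index buffer. The helper names are hypothetical and only emulate that behaviour; they are not D3D calls, and start/base offsets are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper: IDs seen by the VS for a non-indexed Draw(vertexCount, 0).
std::vector<uint32_t> vertexIdsForDraw(uint32_t vertexCount) {
    std::vector<uint32_t> ids(vertexCount);
    for (uint32_t i = 0; i < vertexCount; ++i) ids[i] = i;
    return ids;
}

// Hypothetical helper: IDs seen by the VS for DrawIndexed(indexCount, 0, 0).
// The ID is the index value itself, so repeated indices give repeated IDs.
std::vector<uint32_t> vertexIdsForDrawIndexed(const std::vector<uint32_t>& indexBuffer,
                                              std::size_t indexCount) {
    assert(indexCount <= indexBuffer.size());
    return std::vector<uint32_t>(
        indexBuffer.begin(),
        indexBuffer.begin() + static_cast<std::ptrdiff_t>(indexCount));
}

// Number of distinct IDs: the best-case number of VS invocations if every
// repeated index is served from the post-VS cache.
std::size_t distinctIdCount(const std::vector<uint32_t>& ids) {
    return std::set<uint32_t>(ids.begin(), ids.end()).size();
}
```

For a quad drawn as two triangles with indices 0, 1, 2, 2, 1, 3, the shader sees six IDs but only four distinct ones, which is where the vertex reuse from DrawIndexed() comes from.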
UAV buffers
● Write to UAVs from a Vertex Shader
● New feature in DX11.1 (UAV at any stage)
● Can be used instead of stream-out for
writing vertex data
● Triangle output not limited to strips
●You can use whatever format you want
● Can output anything useful to a UAV
NULL Vertex Buffer
● DX10 and DX11 allow this
● Just set the number of vertices in Draw()
● VS will execute without a vertex buffer bound
● Can be used for instancing
● Call Draw() with the total number of vertices
● Bind mesh and instance data as SRVs
Vertex Shader Techniques
● Full Screen Triangle
● Vertex Shader Instancing
● Merge Instancing
● Vertex Shader UAVs
Full Screen Triangle
● For post-processing effects
● Triangle has better performance
than quad
● Fast and easy with VS
generated coordinates
● No IB or VB is necessary
● Something you should be
using for full screen effects
Clip Space Coordinates: (-1, -1, 0), (-1, 3, 0), (3, -1, 0)
Full Screen Triangle: C++ code
// Null VB, IB
pd3dImmediateContext->IASetVertexBuffers( 0, 0, NULL, NULL, NULL );
pd3dImmediateContext->IASetIndexBuffer( NULL, (DXGI_FORMAT)0, 0 );
pd3dImmediateContext->IASetInputLayout( NULL );
// Set Shaders
pd3dImmediateContext->VSSetShader( g_pFullScreenVS, NULL, 0 );
pd3dImmediateContext->PSSetShader( … );
pd3dImmediateContext->PSSetShaderResources( … );
pd3dImmediateContext->IASetPrimitiveTopology( D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST );
// Render 3 vertices for the triangle
pd3dImmediateContext->Draw(3, 0);
Full Screen Triangle: HLSL Code
VSOutput VSFullScreenTest(uint id:SV_VERTEXID)
{
VSOutput output;
// generate clip space position
output.pos.x = (float)(id / 2) * 4.0 - 1.0;
output.pos.y = (float)(id % 2) * 4.0 - 1.0;
output.pos.z = 0.0;
output.pos.w = 1.0;
// texture coordinates
output.tex.x = (float)(id / 2) * 2.0;
output.tex.y = 1.0 - (float)(id % 2) * 2.0;
// color
output.color = float4(1, 1, 1, 1);
return output;
}
Clip Space Coordinates: (-1, -1, 0), (-1, 3, 0), (3, -1, 0)
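To check the arithmetic of the shader above, the same expressions can be evaluated on the CPU. This is only a sketch: the struct and function names are made up for illustration, and texCoordAt() is derived from the three generated vertices rather than taken from the slides.

```cpp
#include <cassert>
#include <cstdint>

struct FullScreenVertex {
    float x, y, z, w;  // clip-space position
    float u, v;        // texture coordinate
};

// Mirrors the arithmetic of VSFullScreenTest for id = 0, 1, 2.
FullScreenVertex fullScreenVertex(uint32_t id) {
    assert(id < 3);
    FullScreenVertex out;
    out.x = static_cast<float>(id / 2) * 4.0f - 1.0f;
    out.y = static_cast<float>(id % 2) * 4.0f - 1.0f;
    out.z = 0.0f;
    out.w = 1.0f;
    out.u = static_cast<float>(id / 2) * 2.0f;
    out.v = 1.0f - static_cast<float>(id % 2) * 2.0f;
    return out;
}

// Texture coordinate that linear interpolation over the triangle yields at a
// clip-space point: u = (x + 1) / 2, v = (1 - y) / 2.
void texCoordAt(float x, float y, float& u, float& v) {
    u = (x + 1.0f) * 0.5f;
    v = (1.0f - y) * 0.5f;
}
```

The triangle overshoots the screen, so its texture coordinates run to 2 and -1 at the off-screen vertices, but inside the visible square from (-1, -1) to (1, 1) they stay in the usual 0 to 1 range.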
VS Instancing: Point Sprites
● Often done on GS, but can be faster on VS
● Create an SRV point buffer and bind to VS
● Call Draw or DrawIndexed to render the full
triangle list.
● Read the location from the point buffer and
expand to vertex location in quad
● Can be used for particles or Bokeh DOF sprites
● Don’t use DrawInstanced for a small mesh
Point Sprites: C++ Code
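// Index buffer with 6 indices (two triangles) per particle
pd3dImmediateContext->IASetIndexBuffer( g_pParticleIndexBuffer, DXGI_FORMAT_R32_UINT, 0 );
pd3dImmediateContext->IASetPrimitiveTopology( D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST );
// One draw call renders every sprite
pd3dImmediateContext->DrawIndexed( g_particleCount * 6, 0, 0);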
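The slides do not show how the particle index buffer is filled. One plausible layout, sketched here under that assumption, stores six indices per particle whose values are 4 * particle + corner, so that a shader can recover the particle with (id / 4) and the corner with (id % 4). The function name and the corner order are illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a 32-bit index buffer with six indices (two triangles) per particle.
// Index values are 4 * particle + corner. The corner order 0,1,2 / 2,1,3 is an
// assumption; adjust it to the winding your rasterizer state expects.
std::vector<uint32_t> buildParticleIndices(uint32_t particleCount) {
    assert(particleCount <= 0xFFFFFFFFu / 6u);  // index count must fit in a UINT
    static const uint32_t corners[6] = {0, 1, 2, 2, 1, 3};
    std::vector<uint32_t> indices;
    indices.reserve(static_cast<std::size_t>(particleCount) * 6);
    for (uint32_t p = 0; p < particleCount; ++p)
        for (uint32_t c : corners)
            indices.push_back(p * 4 + c);
    return indices;
}
```

The resulting array would be uploaded once as the DXGI_FORMAT_R32_UINT index buffer; only four distinct vertex IDs exist per particle, so two of the six vertices in each quad can be reused instead of shaded again.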
Point Sprites: HLSL Code
VSInstancedParticleDrawOut VSIndexBuffer(uint id:SV_VERTEXID)
{
VSInstancedParticleDrawOut output;
uint particleIndex = id / 4;
uint vertexInQuad = id % 4;
// calculate the position of the vertex
float3 position;
position.x = (vertexInQuad % 2) ? 1.0 : -1.0;
position.y = (vertexInQuad & 2) ? -1.0 : 1.0;
position.z = 0.0;
position.xy *= PARTICLE_RADIUS;
position = mul( position, (float3x3)g_mInvView ) +
g_bufPosColor[particleIndex].pos.xyz;
output.pos = mul( float4(position,1.0), g_mWorldViewProj );
output.color = g_bufPosColor[particleIndex].color;
// texture coordinate
output.tex.x = (vertexInQuad % 2) ? 1.0 : 0.0;
output.tex.y = (vertexInQuad & 2) ? 1.0 : 0.0;
return output;
}
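The corner selection in the shader above can be checked in isolation. The sketch below copies the four conditional expressions into a plain C++ function (the Corner struct and function name are hypothetical) so the mapping from vertexInQuad to offset and texture coordinate is easy to tabulate.

```cpp
#include <cassert>
#include <cstdint>

struct Corner {
    float x, y;  // offset from the particle centre, before scaling by the radius
    float u, v;  // texture coordinate
};

// Same conditionals as VSIndexBuffer: bit 0 selects left/right, bit 1 top/bottom.
Corner quadCorner(uint32_t vertexInQuad) {
    assert(vertexInQuad < 4);
    Corner c;
    c.x = (vertexInQuad % 2) ? 1.0f : -1.0f;
    c.y = (vertexInQuad & 2) ? -1.0f : 1.0f;
    c.u = (vertexInQuad % 2) ? 1.0f : 0.0f;
    c.v = (vertexInQuad & 2) ? 1.0f : 0.0f;
    return c;
}
```

Corner 0 is the top-left of the sprite with texture coordinate (0, 0) and corner 3 the bottom-right with (1, 1), so the texture is not flipped.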
Point Sprite Performance
Time in ms:
Method          Sprites   AMD Radeon R9 290x   Nvidia Titan
Indexed         500K      0.52                 0.52
Non-Indexed     500K      0.77                 0.87
GS              500K      1.38                 0.83
DrawInstanced   500K      1.77                 5.1
Indexed         1M        1.02                 1.5
Non-Indexed     1M        1.53                 1.92
GS              1M        2.7                  1.6
DrawInstanced   1M        3.54                 10.3
Point Sprite Performance
● DrawIndexed() is the fastest method
● Draw() is slower but doesn’t need an IB
● Don’t use DrawInstanced() for creating
sprites on either AMD or Nvidia hardware
● DrawInstanced() is not recommended for
meshes with a small number of vertices
Merge Instancing
● Combine multiple meshes that can be
instanced many times
● Better than normal instancing which renders
only one mesh
● Instance nearby meshes for smaller bounding box
● Each mesh is a page in the vertex data
● Fixed vertex count for each mesh
●Meshes smaller than page size use degenerate triangles
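A minimal sketch of the paging scheme described above, assuming a triangle-list layout with three vertices per triangle. The function and type names are hypothetical; the point is only that every mesh occupies exactly one fixed-length page and that the unused tail is filled with zero vertices, which form degenerate triangles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float3 { float x, y, z; };

// Appends one mesh to the shared vertex data as a fixed-length page. Unused
// slots are filled with zero vertices, which form zero-area (degenerate)
// triangles. Returns the page index of the mesh.
std::size_t appendMeshPage(std::vector<Float3>& vertexData,
                           const std::vector<Float3>& mesh,
                           std::size_t pageSize) {
    assert(pageSize > 0 && pageSize % 3 == 0);  // whole triangles only
    assert(mesh.size() % 3 == 0);
    assert(mesh.size() <= pageSize);            // larger meshes would need splitting
    assert(vertexData.size() % pageSize == 0);
    std::size_t page = vertexData.size() / pageSize;
    vertexData.insert(vertexData.end(), mesh.begin(), mesh.end());
    vertexData.resize((page + 1) * pageSize, Float3{0.0f, 0.0f, 0.0f});
    return page;
}
```

The padding costs some vertex work per instance, so the page size is a trade-off between wasted vertices on small meshes and having to split large ones.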
Merge Instancing
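[Diagram: Mesh Vertex Data is a sequence of fixed-length pages (Mesh Data 0, Mesh Data 1, Mesh Data 2, ...). Each page holds Vertex 0, Vertex 1, Vertex 2, Vertex 3, ... and is padded to the page length with zeros (degenerate triangles). Mesh Instance Data maps each instance to a mesh index: Instance 0 uses Mesh Index 2, Instance 1 uses Mesh Index 0, ...]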
Merge Instancing using VS
● Use the vertex ID to look up the mesh to
instance
● All meshes are the same size, so (id / SIZE)
gives the instance, which selects the mesh, and
(id % SIZE) gives the vertex within that mesh
● Faster than using DrawInstanced()
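The lookup can be emulated on the CPU as follows. This is a sketch under the assumption that the instance data is a flat array of mesh indices and that pages are stored back to back; the names are hypothetical and the real shader would read both arrays from SRVs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct VertexAddress {
    uint32_t instance;     // which instance this vertex belongs to
    uint32_t vertexIndex;  // where to fetch the vertex in the mesh vertex data
};

// Emulates the VS lookup for a single Draw(instanceCount * pageSize, 0):
// (id / pageSize) picks the instance, the instance data supplies the mesh index
// (the page), and (id % pageSize) is the vertex within that page.
VertexAddress resolveVertex(uint32_t id, uint32_t pageSize,
                            const std::vector<uint32_t>& instanceMeshIndex) {
    assert(pageSize > 0);
    uint32_t instance = id / pageSize;
    assert(instance < instanceMeshIndex.size());
    uint32_t mesh = instanceMeshIndex[instance];
    return VertexAddress{instance, mesh * pageSize + id % pageSize};
}
```

With a page size of 6 and instance data {2, 0}, vertex ID 0 resolves to vertex 12 (the first vertex of mesh 2) and vertex ID 7 resolves to vertex 1 of mesh 0.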
Merge Instancing Performance
[Bar chart: time in ms (axis 0 to 30) for DrawInstanced and Soft Instancing on the AMD Radeon R9 290X and the Nvidia GTX 780]
● Instancing performance test by
Cloud Imperium Games for Star
Citizen
● Renders 13.5M triangles (~40M
verts)
● DrawInstanced version calls
DrawInstanced() and uses instance
data in a vertex buffer
● Soft Instancing version uses
vertex instancing with Draw() calls
and fetches instance data from
SRV
Vertex Shader UAVs
● Random access Read/Write in a VS
● Can be used to store transformed vertex
data for use in multi-pass algorithms
● Can be used for passing constant
attributes between any shader stage (not
just from VS)
Skinning to UAV
● Skin vertex data then output to UAV
● Instance the skinned UAV data multiple times
● Can also be used for non-instanced data
● Multiple passes can reuse the transformed
vertex data – Shadow map rendering
● Performance is about the same as
stream-out, but you can do more …
Bounding Box to UAV
● Can calculate and store Bbox in the VS
● Use a UAV to store the min/max values (6)
● InterlockedMin/InterlockedMax determine min
and max of the bbox
●Need to use integer values with atomics
● Use the stored bbox in later passes
● GPU physics (collision)
● Tile based processing
Bounding Box: HLSL Code
void UAVBBoxSkinVS(VSSkinnedIn input, uint id:SV_VERTEXID )
{
// skin the vertex
. . .
// output the max and min for the bounding box
int x = (int) (vSkinned.Pos.x * FLOAT_SCALE); // convert to integer
int y = (int) (vSkinned.Pos.y * FLOAT_SCALE);
int z = (int) (vSkinned.Pos.z * FLOAT_SCALE);
InterlockedMin(g_BBoxUAV[0], x);
InterlockedMin(g_BBoxUAV[1], y);
InterlockedMin(g_BBoxUAV[2], z);
InterlockedMax(g_BBoxUAV[3], x);
InterlockedMax(g_BBoxUAV[4], y);
InterlockedMax(g_BBoxUAV[5], z);
. . .
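The fixed-point conversion and the min/max accumulation can be tried on the CPU. In this sketch std::atomic stands in for the UAV, a compare-exchange loop stands in for InterlockedMin/InterlockedMax, and the scale value is an assumption because the slides do not give FLOAT_SCALE. The reset step is also an assumption about how the buffer is prepared before the pass.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <cmath>

// Assumed value: 1024 keeps about three decimal digits and leaves room for
// coordinates up to roughly +/-2 million before the int conversion overflows.
constexpr float kFloatScale = 1024.0f;

// Slots 0..2 hold the minimum x, y, z and slots 3..5 the maximum, as on the slide.
struct BBox { std::atomic<int> v[6]; };

// The buffer is reset before the pass so that the first vertex always wins.
void resetBBox(BBox& b) {
    for (int i = 0; i < 3; ++i) b.v[i].store(INT_MAX);
    for (int i = 3; i < 6; ++i) b.v[i].store(INT_MIN);
}

// std::atomic has no fetch_min/fetch_max in C++11 through C++23, so the
// interlocked operations are emulated with a compare-exchange loop.
void atomicMin(std::atomic<int>& a, int x) {
    int cur = a.load();
    while (x < cur && !a.compare_exchange_weak(cur, x)) {}
}
void atomicMax(std::atomic<int>& a, int x) {
    int cur = a.load();
    while (x > cur && !a.compare_exchange_weak(cur, x)) {}
}

// Same per-vertex work as the shader: scale to integer, then six atomics.
void addPoint(BBox& b, float x, float y, float z) {
    const float p[3] = {x, y, z};
    for (int i = 0; i < 3; ++i) {
        assert(std::fabs(p[i] * kFloatScale) < 2147483648.0f);
        int q = static_cast<int>(p[i] * kFloatScale);  // truncates toward zero
        atomicMin(b.v[i], q);
        atomicMax(b.v[i + 3], q);
    }
}

// Converts a stored integer back to a float coordinate for later passes.
float decode(int q) { return static_cast<float>(q) / kFloatScale; }
```

Signed integers are used so that negative coordinates compare correctly; the scale sets the precision of the resulting box.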
Particle System UAV
● Single pass GPU-only particle system
● In the VS:
● Generate sprites for rendering
● Do Euler integration and update the particle
system state to a UAV
Particle System: HLSL Code
uint particleIndex = id / 4;
uint vertexInQuad = id % 4;
// calculate the new position of the vertex
float3 oldPosition = g_bufPosColor[particleIndex].pos.xyz;
float3 oldVelocity = g_bufPosColor[particleIndex].velocity.xyz;
// Euler integration to find new position and velocity
float3 acceleration = normalize(oldVelocity) * ACCELLERATION;
float3 newVelocity = acceleration * g_deltaT + oldVelocity;
float3 newPosition = newVelocity * g_deltaT + oldPosition;
g_particleUAV[particleIndex].pos = float4(newPosition, 1.0);
g_particleUAV[particleIndex].velocity = float4(newVelocity, 0.0);
// Generate sprite vertices
. . .
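The integration step can be reproduced on the CPU to sanity-check values. The sketch below follows the same update order as the shader code above (velocity first, then position from the new velocity, which is the semi-implicit form of Euler integration). The type and parameter names are hypothetical, and the guard for a zero velocity is an addition of this sketch, not something shown on the slide.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

struct Particle { Vec3 pos; Vec3 velocity; };

// One integration step. accelMagnitude stands in for the slide's acceleration
// constant and dt for g_deltaT.
Particle integrate(const Particle& p, float accelMagnitude, float dt) {
    assert(dt >= 0.0f);
    float speed = std::sqrt(p.velocity.x * p.velocity.x +
                            p.velocity.y * p.velocity.y +
                            p.velocity.z * p.velocity.z);
    // Guard added in this sketch: normalizing a zero vector is not meaningful,
    // so a particle at rest receives no acceleration.
    float k = (speed > 0.0f) ? accelMagnitude / speed : 0.0f;
    Vec3 a{p.velocity.x * k, p.velocity.y * k, p.velocity.z * k};

    Particle out;
    out.velocity = Vec3{a.x * dt + p.velocity.x,
                        a.y * dt + p.velocity.y,
                        a.z * dt + p.velocity.z};
    out.pos = Vec3{out.velocity.x * dt + p.pos.x,
                   out.velocity.y * dt + p.pos.y,
                   out.velocity.z * dt + p.pos.z};
    return out;
}
```

In the shader every vertex of a sprite computes and writes the same new state for its particle, so the result does not depend on which of those writes lands last, as long as the old state is read from a separate input buffer.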
Conclusion
● Vertex shader “tricks” can be more
efficient than more commonly used methods
● Use SV_VertexID for smarter instancing
●Sprites
●Merge Instancing
● UAVs add lots of freedom to vertex shaders
●Bounding box calculation
●Single pass VS particle system
Demos
● Particle System
● UAV Skinning
● Bbox
Acknowledgements
● Merge Instancing
● Emil Persson, “Graphics Gems for Games”
SIGGRAPH 2011
● Brendan Jackson, Cloud Imperium
● Thanks to
● Nick Thibieroz, AMD
● Raul Aguaviva (particle system UAV), AMD
● Alex Kharlamov, AMD
Questions
● bill.bilodeau@amd.com

Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14

Editor's Notes

  • #9 The value of SV_VertexID depends on the draw call. For non-indexed Draw, the vertex ID starts with 0 and increments by 1 for every vertex processed by the shader. For DrawIndexed(), the vertexID is the value of the index in the index buffer for that vertex.
  • #17 For indexed Draw calls, create an index buffer which contains (index location + index number). That way you can calculate (vertexID/vertsPerMesh) to get the instance index, and (vertexID % vertsPerMesh) to get the index value which you can use to look up the vertex.
  • #27 If the mesh is being reused many times, then calculating the bounding box has little overhead. The bounding box can be used for collision detection.
  • #30 Could read and write from the UAV instead of binding an input SRV