Alex_Vlachos_Advanced_VR_Rendering_GDC2015

Alex Vlachos
Alex VlachosSenior Graphics Programmer at Valve
Advanced VR Rendering
Alex Vlachos, Valve
Alex@ValveSoftware.com
Outline
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics
2
VR at Valve
● Began VR research 3+ years ago
● Both hardware and software engineers
● Custom optics designed for VR
● Display technology – low persistence, global display
● Tracking systems
● Fiducial-based positional tracking
● Desktop dot-based tracking and controllers
● Laser-tracked headset and controllers
● SteamVR API – Cross-platform, OpenVR
3
HTC Vive Developer Edition Specs
● Refresh rate: 90 Hz (11.11 ms per frame)
● Low persistence, global display
● Framebuffer: 2160x1200 (1080x1200 per-eye)
● Off-screen rendering ~1.4x in each dimension:
● 1512x1680 per-eye = 2,540,160 shaded pixels per-eye (brute-force)
● FOV is about 110 degrees
● 360⁰ room-scale tracking
● Multiple tracked controllers and other input devices
4
Room-Scale Tracking
5
Optics & Distortion (Pre-Warp)
Warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion
(Visualizing 1.4x render target scalar)
6
Optics & Distortion (Post-Warp)
Warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion
(Visualizing 1.4x render target scalar)
7
Shaded Visible Pixels per Second
● 720p @ 30 Hz: 27 million pixels/sec
● 1080p @ 60 Hz: 124 million pixels/sec
● 30” Monitor 2560x1600 @ 60 Hz: 245 million pixels/sec
● 4k Monitor 4096x2160 @ 30 Hz: 265 million pixels/sec
● VR 1512x1680x2 @ 90 Hz: 457 million pixels/sec
● We can reduce this to 378 million pixels/sec (later in the talk)
● Equivalent to 30” Monitor @ 100 Hz for a non-VR renderer
8
There Are No “Small” Effects
● Tracking allows users to get up close to anything in the
tracked volume
● Can’t implement a super expensive effect and claim “it’s
just this small little thing in the corner”
● Even your floors need to be higher fidelity than we have
traditionally authored
● If it’s in your tracked volume, it must be high fidelity
9
VR Rendering Goals
10
● Lowest GPU min spec possible
● We want VR to succeed, but we need customers
● The lower the min spec, the more customers we have
● Aliasing should not be noticeable to customers
● Customers refer to aliasing as “sparkling”
● Algorithms should scale up to multi-GPU installations
● Ask yourself, “Will ‘X’ scale efficiently to a 4-GPU machine?”
Outline
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics
11
Stereo Rendering (Single-GPU)
● Brute-force run your CPU code twice (BAD)
● Use geometry shader to amplify geometry (BAD)
● Resubmit command buffers (GOOD, our current solution)
● Use instancing to double geo (BETTER. Half the API calls, improved
cache coherency for VB/IB/texture reads)
● “High Performance Stereo Rendering For VR”, Timothy Wilson, San Diego
Virtual Reality Meetup
12
Stereo Rendering (Multi-GPU)
● AMD and NVIDIA both provide DX11 extensions to accelerate stereo
rendering across multiple GPUs
● We have already tested the AMD implementation and it nearly doubles our
framerate – have yet to test the NVIDIA implementation but will soon
● Great for developers
● Everyone on your team can have a multi-GPU solution in their dev box
● This allows you to break framerate without uncomfortable low-framerate VR
● But lie to your team about framerate and report single-GPU fps :)
13
Outline
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics
14
Prediction
● We aim to keep prediction times (render to photons) for the HMD and controller
transforms as short as possible (accuracy is more important than total time)
● Low persistence global displays: panel is lit for only ~2 ms of the 11.11 ms frame
NOTE: Image above is not optimal VR rendering, but helps describe prediction (See later slides)
15
Pipelined Architectures
● Simulating next frame while rendering the current frame
● We re-predict transforms and update our global cbuffer right before submit
● VR practically requires this due to prediction constraints
● You must conservatively cull on the CPU by about 5 degrees
16
Waiting for VSync
● Simplest VR implementation, predict right after VSync
● Pattern #1: Present(), clear back buffer, read a pixel
● Pattern #2: Present(), clear back buffer, spin on a query
● Great for initial implementation, but please DO NOT DO
THIS. GPUs are not designed for this.
● See John McDonald’s talk:
● “Avoiding Catastrophic Performance Loss: Detecting CPU-GPU
Sync Points”, John McDonald, NVIDIA, GDC 2014
17
GPU Bubbles
● If you start submitting draw calls after VSync:
● Ideally, your capture should look like this:
(Images are screen captures of NVIDIA Nsight)
18
“Running Start”
● If you start to submit D3D calls after VSync:
● Instead, start submitting D3D calls 2 ms before VSync. (2 ms is a magic number
based on the 1.5-2.0ms GPU bubbles we measured on current GPUs):
● But, you end up predicting another 2 ms (24.22 ms total)
19
“Running Start” VSync
20
● Question: How do you know how far you are from VSync?
● Answer: It’s tricky. Rendering APIs don’t directly provide this.
● The SteamVR/OpenVR API on Windows in a separate process spins
on calls to IDXGIOutput::WaitForVBlank() and notes the time and
increments a frame counter. The application can then call
GetTimeSinceLastVSync() that also returns a frame ID.
● GPU vendors, HMD devices, and rendering APIs should provide this
“Running Start” Details
21
● To deal with a bad frame, you need to partially synchronize with the GPU
● We inject a query after clearing the back buffer, submit our entire frame, spin on
that query, then call Present()
● This ensures we are on the correct side of VSync for the current frame, and we can
now spin until our running start time
Why the Query Is Critical
22
● If a frame is late, the query will keep you on the right side of VSync
for the following frame ensuring your prediction remains accurate
Running Start Summary
23
● This is a solid 1.5-2.0ms GPU perf gain!
● You want to see this in NVIDIA Nsight:
● You want to see this in Microsoft’s GPUView:
Outline
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics
24
Aliasing Is Your Enemy
● The camera (your head) never stops moving. Aliasing is
amplified because of this.
● While there are more pixels to render, each pixel fills a
larger angle than anything we’ve done before. Here are
some averages:
● 2560x1600 30” monitor: ~50 pixels/degree (50 degree H fov)
● 720p 30” monitor: ~25 pixels/degree (50 degree H fov)
● VR: ~15.3 pixels/degree (110 degree fov w/ 1.4x)
● We must increase the quality of our pixels
25
4xMSAA Minimum Quality
● Forward renderers win for antialiasing because MSAA just works
● We use 8xMSAA if perf allows
● Image-space antialiasing algorithms must be compared side-by-side
with 4xMSAA and 8xMSAA to know how your renderer will compare
to others in the industry
● Jittered SSAA is obviously the best using the HLSL ‘sample’ modifier,
but only if you can spare the perf
26
Normal Maps Are Not Dead
● Most normal maps work great in VR...mostly.
● What doesn’t work:
● Feature detail larger than a few cm inside tracked volume is bad
● Surface shape inside a tracked volume can’t be in a normal map
● What does work:
● Distant objects outside the tracked volume you can’t inspect up close
● Surface “texture” and fine details:
27
Normal Map Mipping Error
28
Blinn-Phong Specular
Zoomed out
normal map
box filtered mips
Zoomed out
super-sampled
36 samples
Expected
glossiness
Incorrect
glossiness
Normal Map Mipping Problems
● Any mip filter that just generates an averaged normal loses
important roughness information
29
Normal Map Visualization
30
4096x4096 Normal Map
Fire Alarm
4x4 Mip Visualization
2x2 Mip
1x1 Mip
Normal Map Visualization
31
4096x4096 Normal Map
Fire Alarm
8x8 Mip Visualization16x16 Mip Visualization
Normal Map Visualization
32
4096x4096 Normal Map
Dota 2 Mirana Body
4x4 Mip Visualization
2x2 Mip
1x1 Mip
Normal Map Visualization
33
4096x4096 Normal Map
Dota 2 Juggernaut Sword Handle
4x4 Mip Visualization
2x2 Mip
1x1 Mip
Normal Map Visualization
34
4096x4096 Normal Map
Shoulder Armor
4x4 Mip Visualization
2x2 Mip
1x1 Mip
Normal Map Visualization
35
4096x4096 Normal Map
Metal Siding
4x4 Mip Visualization
2x2 Mip
1x1 Mip
Roughness Encoded in Mips
● We can store a single isotropic value (visualized
as the radius of a circle) that is the standard
deviation of all 2D tangent normals from the
highest mip that contributed to this texel
● We can also store a 2D anisotropic value
(visualized as the dimensions of an ellipse) for the
standard deviation in X and Y separately that can
be used to compute tangent-space axis-aligned
anisotropic lighting!
36
Final Mip Chain
37
Add Artist-Authored Roughness
● We author 2D gloss = 1.0 – roughness
● Mip with a simple box filter
● Add/sum it with the normal map roughness at each mip level
● Because we have anisotropic gloss maps anyway, storing the generated normal
map roughness is FREE
38
Isotropic Gloss Anisotropic Gloss
Tangent-Space Axis-Aligned Anisotropic Lighting
● Standard isotropic lighting is
represented along the diagonal
● Anisotropy is aligned with either of
the tangent-space axes
● Requires only 2 additional values
paired with a 2D tangent normal =
Fits into an RGBA texture (DXT5
>95% of the time)
39
Roughness to Exponent Conversion
40
void RoughnessEllipseToScaleAndExp( float2 vRoughness,
out float o_flDiffuseExponentOut, out float2 o_vSpecularExponentOut, out float2 o_vSpecularScaleOut )
{
o_flDiffuseExponentOut = ( ( 1.0 - ( vRoughness.x + vRoughness.y ) * 0.5 ) * 0.8 ) + 0.6; // Outputs 0.6-1.4
o_vSpecularExponentOut.xy = exp2( pow( 1.0 - vRoughness.xy, 1.5 ) * 14.0 ); // Outputs 1-16384
o_vSpecularScaleOut.xy = 1.0 - saturate( vRoughness.xy * 0.5 ); // This is a pseudo energy conserving scalar for the roughness exponent
}
● Diffuse lighting is Lambert raised to exponent
(N.Lk) where k is in the range 0.6-1.4
● Experimented with anisotropic diffuse
lighting, but not worth the instructions
● Specular exponent range is 1-16,384 and is a
modified Blinn-Phong with anisotropy (more
on this later)
How Anisotropy Is Computed
41
Tangent U Lighting
* =
Tangent V Lighting Final Lighting
* =
Shader Code
42
Anisotropic Specular Lighting:
float3 vHalfAngleDirWs = normalize( vPositionToLightDirWs.xyz + vPositionToCameraDirWs.xyz );
float3 vSpecularNormalX = vHalfAngleDirWs.xyz - ( vTangentUWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentUWs.xyz ) );
float3 vSpecularNormalY = vHalfAngleDirWs.xyz - ( vTangentVWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentVWs.xyz ) );
float flNDotHX = max( 0.0, dot( vSpecularNormalX.xyz, vHalfAngleDirWs.xyz ) );
float flNDotHkX = pow( flNDotHX, vSpecularExponent.x * 0.5 );
flNDotHkX *= vSpecularScale.x;
float flNDotHY = max( 0.0, dot( vSpecularNormalY.xyz, vHalfAngleDirWs.xyz ) );
float flNDotHkY = pow( flNDotHY, vSpecularExponent.y * 0.5 );
flNDotHkY *= vSpecularScale.y;
float flSpecularTerm = flNDotHkX * flNDotHkY;
Isotropic Diffuse Lighting:
float flDiffuseTerm = pow( flNDotL, flDiffuseExponent ) * ( ( flDiffuseExponent + 1.0 ) * 0.5 );
Isotropic Specular Lighting:
float flNDotH = saturate( dot( vNormalWs.xyz, vHalfAngleDirWs.xyz ) );
float flNDotHk = pow( flNDotH, dot( vSpecularExponent.xy, float2( 0.5, 0.5 ) ) );
flNDotHk *= dot( vSpecularScale.xy, float2( 0.33333, 0.33333 ) ); // 0.33333 is to match the spec intensity of the aniso algorithm above
float flSpecularTerm = flNDotHk;
void RoughnessEllipseToScaleAndExp( float2 vRoughness,
out float o_flDiffuseExponentOut, out float2 o_vSpecularExponentOut, out float2 o_vSpecularScaleOut )
{
o_flDiffuseExponentOut = ( ( 1.0 - ( vRoughness.x + vRoughness.y ) * 0.5 ) * 0.8 ) + 0.6; // Outputs 0.6-1.4
o_vSpecularExponentOut.xy = exp2( pow( 1.0 - vRoughness.xy, 1.5 ) * 14.0 ); // Outputs 1-16384
o_vSpecularScaleOut.xy = 1.0 - saturate( vRoughness.xy * 0.5 ); // This is a pseudo energy conserving scalar for the roughness exponent
}
Geometric Specular Aliasing
● Dense meshes without normal maps also alias, and roughness mips can’t help you!
● We use partial derivatives of interpolated vertex normals to generate a geometric
roughness term that approximates curvature. Here is the hacky math:
float3 vNormalWsDdx = ddx( vGeometricNormalWs.xyz );
float3 vNormalWsDdy = ddy( vGeometricNormalWs.xyz );
float flGeometricRoughnessFactor = pow( saturate( max( dot( vNormalWsDdx.xyz, vNormalWsDdx.xyz ), dot( vNormalWsDdy.xyz, vNormalWsDdy.xyz ) ) ), 0.333 );
vRoughness.xy = max( vRoughness.xy, flGeometricRoughnessFactor.xx ); // Ensure we don’t double-count roughness if normal map encodes geometric roughness
43
Visualization of flGeometricRoughnessFactor
Geometric Specular Aliasing Part 2
● MSAA center vs centroid interpolation: It’s not perfect
● Normal interpolation can cause specular sparkling at silhouettes due
to over-interpolated vertex normals
● Here’s a trick we are using:
● Interpolate normal twice: once with centroid, once without
float3 vNormalWs : TEXCOORD0;
centroid float3 vCentroidNormalWs : TEXCOORD1;
● In the pixel shader, choose the centroid normal if normal length squared is
greater than 1.01
if ( dot( i.vNormalWs.xyz, i.vNormalWs.xyz ) >= 1.01 )
{
i.vNormalWs.xyz = i.vCentroidNormalWs.xyz;
}
44
Outline
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics
45
Normal Map Encoding
● Projecting tangent normals onto Z plane only uses 78.5% of
the range of a 2D texel
● Hemi-octahedron encoding uses the full range of a 2D texel
● “A Survey of Efficient Representations for Independent Unit Vectors”,
Cigolle et al., Journal of Computer Graphics Techniques Vol. 3, No. 2,
2014
(Image modified from above paper)
46
Scale Render Target Resolution
● Turns out, 1.4x is just a recommendation for the HTC Vive (Each
HMD design has a different recommended scalar based on optics
and panels)
● On slower GPUs, scale the recommended render target scalar down
● On faster GPUs, scale the recommended render target scalar up
● If you’ve got GPU cycles to burn, BURN THEM
47
Anisotropic Texture Filtering
● Increases the perceived resolution of the panels (don’t
forget, we only have fewer pixels per degree)
● Force this on for color and normal maps
● We use 8x by default
● Disable for everything else. Trilinear only, but measure
perf. Anisotropic filtering may be “free” if you are
bottlenecked elsewhere.
48
Noise Is Your Friend
● Gradients are horrible in VR. Banding is more obvious than LCD TVs.
● We add noise on the way into the framebuffer when we have
floating-point precision in the pixel shader
float3 ScreenSpaceDither( float2 vScreenPos )
{
// Iestyn's RGB dither (7 asm instructions) from Portal 2 X360, slightly modified for VR
float3 vDither = dot( float2( 171.0, 231.0 ), vScreenPos.xy + g_flTime ).xxx;
vDither.rgb = frac( vDither.rgb / float3( 103.0, 71.0, 97.0 ) ) - float3( 0.5, 0.5, 0.5 );
return ( vDither.rgb / 255.0 ) * 0.375;
}
49
Environment Maps
● Standard implementation at infinity = only works for sky
● Need to use some type of distance remapping for environment maps
● Sphere is cheap
● Box is more expensive
● Both are useful in different situations
● Read this online article:
● “Image-based Lighting approaches and parallax-corrected cubemaps”, Sébastien
Lagarde, 2012
50
Stencil Mesh (Hidden Area Mesh)
● Stencil out the pixels you can’t actually see through the
lenses. GPUs are fast at early stencil-rejection.
● Alternatively you can render to the depth buffer at near z
so everything early z-rejects instead
● Lenses produce radially symmetric distortion which means
you effectively see a circular area projected on the panels
51
Stencil Mesh (Warped View)
52
Stencil Mesh (Ideal Warped View)
53
Stencil Mesh (Wasted Pixels)
54
Stencil Mesh (Unwarped View)
55
Stencil Mesh (Unwarped View)
56
Stencil Mesh (Final Unwarped View)
57
Stencil Mesh (Final Warped View)
58
Stencil Mesh (Hidden Area Mesh)
● SteamVR/OpenVR API will provide this mesh to you
● Results in a 17% fill rate reduction!
● No stencil mesh: VR 1512x1680x2 @ 90Hz: 457 million pixels/sec
● 2,540,160 pixels per eye (5,080,320 pixels total)
● With stencil mesh: VR 1512x1680x2 @ 90Hz: 378 million pixels/sec
● About 2,100,000 pixels per eye (4,200,000 pixels total)
59
Warp Mesh (Lens Distortion Mesh)
60
Warp Mesh (Brute-Force)
61
Warp Mesh (Cull UV’s Outside 0-1)
62
Warp Mesh (Cull Stencil Mesh)
63
Warp Mesh (Shrink Wrap)
15% of pixels culled from the warp mesh
64
Performance Queries Required!
● You are always VSync’d
● Disabling VSync to see framerate will make you dizzy
● Need to use performance queries to report GPU workload
● Simplest implementation is to measure first to last draw call
● Ideally measure these things:
● Idle time from Present() to first draw call
● First draw call to last draw call
● Idle time from last draw call to Present()
65
Summary
● Stereo Rendering
● Prediction
● “Running Start” (Saves 1.5-2.0 ms/frame)
● Anisotropic Lighting & Mipping Normal Maps
● Geometric Specular Antialiasing
● Stencil Mesh (Saves 17% pixels rendered)
● Optimized Warp Mesh (Reduces cost by 15%)
● Etc.
66
Thank You!
Alex Vlachos, Valve
Alex@ValveSoftware.com
67
1 of 67

Recommended

Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016 by
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016Alex Vlachos
1.2K views52 slides
Mobile VR, Programming, Rendering by
Mobile VR, Programming, RenderingMobile VR, Programming, Rendering
Mobile VR, Programming, RenderingUnity Technologies
1.4K views35 slides
Avoiding 19 Common OpenGL Pitfalls by
Avoiding 19 Common OpenGL PitfallsAvoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL PitfallsMark Kilgard
2.7K views24 slides
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour... by
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...AMD Developer Central
2.1K views36 slides
Siggraph 2016 - Vulkan and nvidia : the essentials by
Siggraph 2016 - Vulkan and nvidia : the essentialsSiggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsTristan Lorach
8.2K views73 slides
Porting the Source Engine to Linux: Valve's Lessons Learned by
Porting the Source Engine to Linux: Valve's Lessons LearnedPorting the Source Engine to Linux: Valve's Lessons Learned
Porting the Source Engine to Linux: Valve's Lessons Learnedbasisspace
3K views90 slides

More Related Content

What's hot

OpenGL 4.5 Update for NVIDIA GPUs by
OpenGL 4.5 Update for NVIDIA GPUsOpenGL 4.5 Update for NVIDIA GPUs
OpenGL 4.5 Update for NVIDIA GPUsMark Kilgard
36.1K views81 slides
Efficient Buffer Management by
Efficient Buffer ManagementEfficient Buffer Management
Efficient Buffer Managementbasisspace
1.4K views55 slides
Avoiding Catastrophic Performance Loss by
Avoiding Catastrophic Performance LossAvoiding Catastrophic Performance Loss
Avoiding Catastrophic Performance Lossbasisspace
2.9K views38 slides
Customizing a production pipeline by
Customizing a production pipelineCustomizing a production pipeline
Customizing a production pipelineFelipe Lira
1.9K views44 slides
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow by
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering WorkflowTakahiro Harada
3.4K views44 slides
vkFX: Effect(ive) approach for Vulkan API by
vkFX: Effect(ive) approach for Vulkan APIvkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan APITristan Lorach
2.7K views37 slides

What's hot(10)

OpenGL 4.5 Update for NVIDIA GPUs by Mark Kilgard
OpenGL 4.5 Update for NVIDIA GPUsOpenGL 4.5 Update for NVIDIA GPUs
OpenGL 4.5 Update for NVIDIA GPUs
Mark Kilgard36.1K views
Efficient Buffer Management by basisspace
Efficient Buffer ManagementEfficient Buffer Management
Efficient Buffer Management
basisspace1.4K views
Avoiding Catastrophic Performance Loss by basisspace
Avoiding Catastrophic Performance LossAvoiding Catastrophic Performance Loss
Avoiding Catastrophic Performance Loss
basisspace2.9K views
Customizing a production pipeline by Felipe Lira
Customizing a production pipelineCustomizing a production pipeline
Customizing a production pipeline
Felipe Lira1.9K views
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow by Takahiro Harada
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
Takahiro Harada3.4K views
vkFX: Effect(ive) approach for Vulkan API by Tristan Lorach
vkFX: Effect(ive) approach for Vulkan APIvkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan API
Tristan Lorach2.7K views
[GDC 2012] Enhancing Graphics in Unreal Engine 3 Titles Using AMD Code Submis... by Owen Wu
[GDC 2012] Enhancing Graphics in Unreal Engine 3 Titles Using AMD Code Submis...[GDC 2012] Enhancing Graphics in Unreal Engine 3 Titles Using AMD Code Submis...
[GDC 2012] Enhancing Graphics in Unreal Engine 3 Titles Using AMD Code Submis...
Owen Wu1.9K views
Unity mobile game performance profiling – using arm mobile studio by Owen Wu
Unity mobile game performance profiling – using arm mobile studioUnity mobile game performance profiling – using arm mobile studio
Unity mobile game performance profiling – using arm mobile studio
Owen Wu714 views
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio by Owen Wu
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
Owen Wu901 views
NVIDIA OpenGL and Vulkan Support for 2017 by Mark Kilgard
NVIDIA OpenGL and Vulkan Support for 2017NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017
Mark Kilgard5.2K views

Similar to Alex_Vlachos_Advanced_VR_Rendering_GDC2015

Oculus insight building the best vr aaron davies by
Oculus insight building the best vr   aaron daviesOculus insight building the best vr   aaron davies
Oculus insight building the best vr aaron daviesMary Chan
1.5K views49 slides
Killzone Shadow Fall Demo Postmortem by
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemGuerrilla
42.8K views103 slides
How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019 by
How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019
How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019Unity Technologies
8.6K views46 slides
Smedberg niklas bringing_aaa_graphics by
Smedberg niklas bringing_aaa_graphicsSmedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphicschangehee lee
966 views69 slides
High performance graphics and computation - OpenGL ES and RenderScript by
High performance graphics and computation - OpenGL ES and RenderScript High performance graphics and computation - OpenGL ES and RenderScript
High performance graphics and computation - OpenGL ES and RenderScript BlrDroid
1.9K views28 slides
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons... by
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...Unity Technologies
2.4K views60 slides

Similar to Alex_Vlachos_Advanced_VR_Rendering_GDC2015(20)

Oculus insight building the best vr aaron davies by Mary Chan
Oculus insight building the best vr   aaron daviesOculus insight building the best vr   aaron davies
Oculus insight building the best vr aaron davies
Mary Chan1.5K views
Killzone Shadow Fall Demo Postmortem by Guerrilla
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo Postmortem
Guerrilla42.8K views
How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019 by Unity Technologies
How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019
How the Universal Render Pipeline unlocks games for you - Unite Copenhagen 2019
Unity Technologies8.6K views
Smedberg niklas bringing_aaa_graphics by changehee lee
Smedberg niklas bringing_aaa_graphicsSmedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphics
changehee lee966 views
High performance graphics and computation - OpenGL ES and RenderScript by BlrDroid
High performance graphics and computation - OpenGL ES and RenderScript High performance graphics and computation - OpenGL ES and RenderScript
High performance graphics and computation - OpenGL ES and RenderScript
BlrDroid 1.9K views
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons... by Unity Technologies
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
Unity Technologies2.4K views
Design of vga based pong game using fpga by Ramanan Rajaraman
Design  of vga based pong game using fpgaDesign  of vga based pong game using fpga
Design of vga based pong game using fpga
Ramanan Rajaraman1.2K views
Dissecting and fixing Vulkan rendering issues in drivers with RenderDoc by Igalia
Dissecting and fixing Vulkan rendering issues in drivers with RenderDocDissecting and fixing Vulkan rendering issues in drivers with RenderDoc
Dissecting and fixing Vulkan rendering issues in drivers with RenderDoc
Igalia118 views
Dynamic Resolution and Interlaced Rendering by MartinMueller34
Dynamic Resolution and Interlaced RenderingDynamic Resolution and Interlaced Rendering
Dynamic Resolution and Interlaced Rendering
MartinMueller34106 views
"Computational Photography: Understanding and Expanding the Capabilities of S... by Edge AI and Vision Alliance
"Computational Photography: Understanding and Expanding the Capabilities of S..."Computational Photography: Understanding and Expanding the Capabilities of S...
"Computational Photography: Understanding and Expanding the Capabilities of S...
Tessellation on any_budget-gdc2011 by basisspace
Tessellation on any_budget-gdc2011Tessellation on any_budget-gdc2011
Tessellation on any_budget-gdc2011
basisspace1.1K views
“CMOS Image Sensors: A Guide to Building the Eyes of a Vision System,” a Pres... by Edge AI and Vision Alliance
“CMOS Image Sensors: A Guide to Building the Eyes of a Vision System,” a Pres...“CMOS Image Sensors: A Guide to Building the Eyes of a Vision System,” a Pres...
“CMOS Image Sensors: A Guide to Building the Eyes of a Vision System,” a Pres...
XR graphics in Unity: delivering the best AR/VR experiences – Unite Copenhage... by Unity Technologies
XR graphics in Unity: delivering the best AR/VR experiences – Unite Copenhage...XR graphics in Unity: delivering the best AR/VR experiences – Unite Copenhage...
XR graphics in Unity: delivering the best AR/VR experiences – Unite Copenhage...
Unity Technologies4.3K views
Luis cataldi-ue4-vr-best-practices2 by Luis Cataldi
Luis cataldi-ue4-vr-best-practices2Luis cataldi-ue4-vr-best-practices2
Luis cataldi-ue4-vr-best-practices2
Luis Cataldi3.9K views
【Unite 2017 Tokyo】NVIDIA Gameworks アップデートおよびAnselとVRWorksの紹介 by Unity Technologies Japan K.K.
【Unite 2017 Tokyo】NVIDIA Gameworks アップデートおよびAnselとVRWorksの紹介【Unite 2017 Tokyo】NVIDIA Gameworks アップデートおよびAnselとVRWorksの紹介
【Unite 2017 Tokyo】NVIDIA Gameworks アップデートおよびAnselとVRWorksの紹介
[1C7] Developing with Oculus by NAVER D2
[1C7] Developing with Oculus[1C7] Developing with Oculus
[1C7] Developing with Oculus
NAVER D256.8K views
Crysis 2-key-rendering-features by Raimundo Renato
Crysis 2-key-rendering-featuresCrysis 2-key-rendering-features
Crysis 2-key-rendering-features
Raimundo Renato1.3K views
Performance Evaluation and Comparison of Service-based Image Processing based... by Matthias Trapp
Performance Evaluation and Comparison of Service-based Image Processing based...Performance Evaluation and Comparison of Service-based Image Processing based...
Performance Evaluation and Comparison of Service-based Image Processing based...
Matthias Trapp3 views
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren... by Takahiro Harada
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
Takahiro Harada2.1K views

Recently uploaded

Web Dev - 1 PPT.pdf by
Web Dev - 1 PPT.pdfWeb Dev - 1 PPT.pdf
Web Dev - 1 PPT.pdfgdsczhcet
55 views45 slides
Attacking IoT Devices from a Web Perspective - Linux Day by
Attacking IoT Devices from a Web Perspective - Linux Day Attacking IoT Devices from a Web Perspective - Linux Day
Attacking IoT Devices from a Web Perspective - Linux Day Simone Onofri
15 views68 slides
AMAZON PRODUCT RESEARCH.pdf by
AMAZON PRODUCT RESEARCH.pdfAMAZON PRODUCT RESEARCH.pdf
AMAZON PRODUCT RESEARCH.pdfJerikkLaureta
15 views13 slides
AI: mind, matter, meaning, metaphors, being, becoming, life values by
AI: mind, matter, meaning, metaphors, being, becoming, life valuesAI: mind, matter, meaning, metaphors, being, becoming, life values
AI: mind, matter, meaning, metaphors, being, becoming, life valuesTwain Liu 刘秋艳
35 views16 slides
ChatGPT and AI for Web Developers by
ChatGPT and AI for Web DevelopersChatGPT and AI for Web Developers
ChatGPT and AI for Web DevelopersMaximiliano Firtman
181 views82 slides

Recently uploaded(20)

Web Dev - 1 PPT.pdf by gdsczhcet
Web Dev - 1 PPT.pdfWeb Dev - 1 PPT.pdf
Web Dev - 1 PPT.pdf
gdsczhcet55 views
Attacking IoT Devices from a Web Perspective - Linux Day by Simone Onofri
Attacking IoT Devices from a Web Perspective - Linux Day Attacking IoT Devices from a Web Perspective - Linux Day
Attacking IoT Devices from a Web Perspective - Linux Day
Simone Onofri15 views
AMAZON PRODUCT RESEARCH.pdf by JerikkLaureta
AMAZON PRODUCT RESEARCH.pdfAMAZON PRODUCT RESEARCH.pdf
AMAZON PRODUCT RESEARCH.pdf
JerikkLaureta15 views
AI: mind, matter, meaning, metaphors, being, becoming, life values by Twain Liu 刘秋艳
AI: mind, matter, meaning, metaphors, being, becoming, life valuesAI: mind, matter, meaning, metaphors, being, becoming, life values
AI: mind, matter, meaning, metaphors, being, becoming, life values
[2023] Putting the R! in R&D.pdf by Eleanor McHugh
[2023] Putting the R! in R&D.pdf[2023] Putting the R! in R&D.pdf
[2023] Putting the R! in R&D.pdf
Eleanor McHugh38 views
Transcript: The Details of Description Techniques tips and tangents on altern... by BookNet Canada
Transcript: The Details of Description Techniques tips and tangents on altern...Transcript: The Details of Description Techniques tips and tangents on altern...
Transcript: The Details of Description Techniques tips and tangents on altern...
BookNet Canada130 views
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen... by NUS-ISS
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...
NUS-ISS28 views
.conf Go 2023 - Data analysis as a routine by Splunk
.conf Go 2023 - Data analysis as a routine.conf Go 2023 - Data analysis as a routine
.conf Go 2023 - Data analysis as a routine
Splunk93 views
Voice Logger - Telephony Integration Solution at Aegis by Nirmal Sharma
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at Aegis
Nirmal Sharma17 views
Black and White Modern Science Presentation.pptx by maryamkhalid2916
Black and White Modern Science Presentation.pptxBlack and White Modern Science Presentation.pptx
Black and White Modern Science Presentation.pptx
maryamkhalid291614 views
Special_edition_innovator_2023.pdf by WillDavies22
Special_edition_innovator_2023.pdfSpecial_edition_innovator_2023.pdf
Special_edition_innovator_2023.pdf
WillDavies2216 views
The Importance of Cybersecurity for Digital Transformation by NUS-ISS
The Importance of Cybersecurity for Digital TransformationThe Importance of Cybersecurity for Digital Transformation
The Importance of Cybersecurity for Digital Transformation
NUS-ISS27 views
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV by Splunk
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
Splunk88 views
Empathic Computing: Delivering the Potential of the Metaverse by Mark Billinghurst
Empathic Computing: Delivering  the Potential of the MetaverseEmpathic Computing: Delivering  the Potential of the Metaverse
Empathic Computing: Delivering the Potential of the Metaverse
Mark Billinghurst470 views
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze by NUS-ISS
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng TszeDigital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze
NUS-ISS19 views

Alex_Vlachos_Advanced_VR_Rendering_GDC2015

  • 1. Advanced VR Rendering Alex Vlachos, Valve Alex@ValveSoftware.com
  • 2. Outline ● VR at Valve ● Methods for Stereo Rendering ● Timing: Scheduling, Prediction, VSync, GPU Bubbles ● Specular Aliasing & Anisotropic Lighting ● Miscellaneous VR Rendering Topics 2
  • 3. VR at Valve ● Began VR research 3+ years ago ● Both hardware and software engineers ● Custom optics designed for VR ● Display technology – low persistence, global display ● Tracking systems ● Fiducial-based positional tracking ● Desktop dot-based tracking and controllers ● Laser-tracked headset and controllers ● SteamVR API – Cross-platform, OpenVR 3
  • 4. HTC Vive Developer Edition Specs ● Refresh rate: 90 Hz (11.11 ms per frame) ● Low persistence, global display ● Framebuffer: 2160x1200 (1080x1200 per-eye) ● Off-screen rendering ~1.4x in each dimension: ● 1512x1680 per-eye = 2,540,160 shaded pixels per-eye (brute-force) ● FOV is about 110 degrees ● 360⁰ room-scale tracking ● Multiple tracked controllers and other input devices 4
  • 6. Optics & Distortion (Pre-Warp) Warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion (Visualizing 1.4x render target scalar) 6
  • 7. Optics & Distortion (Post-Warp) Warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion (Visualizing 1.4x render target scalar) 7
  • 8. Shaded Visible Pixels per Second ● 720p @ 30 Hz: 27 million pixels/sec ● 1080p @ 60 Hz: 124 million pixels/sec ● 30” Monitor 2560x1600 @ 60 Hz: 245 million pixels/sec ● 4k Monitor 4096x2160 @ 30 Hz: 265 million pixels/sec ● VR 1512x1680x2 @ 90 Hz: 457 million pixels/sec ● We can reduce this to 378 million pixels/sec (later in the talk) ● Equivalent to 30” Monitor @ 100 Hz for a non-VR renderer 8
  • 9. There Are No “Small” Effects ● Tracking allows users to get up close to anything in the tracked volume ● Can’t implement a super expensive effect and claim “it’s just this small little thing in the corner” ● Even your floors need to be higher fidelity than we have traditionally authored ● If it’s in your tracked volume, it must be high fidelity 9
  • 10. VR Rendering Goals 10 ● Lowest GPU min spec possible ● We want VR to succeed, but we need customers ● The lower the min spec, the more customers we have ● Aliasing should not be noticeable to customers ● Customers refer to aliasing as “sparkling” ● Algorithms should scale up to multi-GPU installations ● Ask yourself, “Will ‘X’ scale efficiently to a 4-GPU machine?”
  • 11. Outline ● VR at Valve ● Methods for Stereo Rendering ● Timing: Scheduling, Prediction, VSync, GPU Bubbles ● Specular Aliasing & Anisotropic Lighting ● Miscellaneous VR Rendering Topics 11
  • 12. Stereo Rendering (Single-GPU) ● Brute-force run your CPU code twice (BAD) ● Use geometry shader to amplify geometry (BAD) ● Resubmit command buffers (GOOD, our current solution) ● Use instancing to double geo (BETTER. Half the API calls, improved cache coherency for VB/IB/texture reads) ● “High Performance Stereo Rendering For VR”, Timothy Wilson, San Diego Virtual Reality Meetup 12
  • 13. Stereo Rendering (Multi-GPU) ● AMD and NVIDIA both provide DX11 extensions to accelerate stereo rendering across multiple GPUs ● We have already tested the AMD implementation and it nearly doubles our framerate – have yet to test the NVIDIA implementation but will soon ● Great for developers ● Everyone on your team can have a multi-GPU solution in their dev box ● This allows you to break framerate without uncomfortable low-framerate VR ● But lie to your team about framerate and report single-GPU fps :) 13
  • 14. Outline ● VR at Valve ● Methods for Stereo Rendering ● Timing: Scheduling, Prediction, VSync, GPU Bubbles ● Specular Aliasing & Anisotropic Lighting ● Miscellaneous VR Rendering Topics 14
  • 15. Prediction ● We aim to keep prediction times (render to photons) for the HMD and controller transforms as short as possible (accuracy is more important than total time) ● Low persistence global displays: panel is lit for only ~2 ms of the 11.11 ms frame NOTE: Image above is not optimal VR rendering, but helps describe prediction (See later slides) 15
  • 16. Pipelined Architectures ● Simulating next frame while rendering the current frame ● We re-predict transforms and update our global cbuffer right before submit ● VR practically requires this due to prediction constraints ● You must conservatively cull on the CPU by about 5 degrees 16
  • 17. Waiting for VSync ● Simplest VR implementation, predict right after VSync ● Pattern #1: Present(), clear back buffer, read a pixel ● Pattern #2: Present(), clear back buffer, spin on a query ● Great for initial implementation, but please DO NOT DO THIS. GPUs are not designed for this. ● See John McDonald’s talk: ● “Avoiding Catastrophic Performance Loss: Detecting CPU-GPU Sync Points”, John McDonald, NVIDIA, GDC 2014 17
  • 18. GPU Bubbles ● If you start submitting draw calls after VSync: ● Ideally, your capture should look like this: (Images are screen captures of NVIDIA Nsight) 18
  • 19. “Running Start” ● If you start to submit D3D calls after VSync: ● Instead, start submitting D3D calls 2 ms before VSync. (2 ms is a magic number based on the 1.5-2.0ms GPU bubbles we measured on current GPUs): ● But, you end up predicting another 2 ms (24.22 ms total) 19
  • 20. “Running Start” VSync 20 ● Question: How do you know how far you are from VSync? ● Answer: It’s tricky. Rendering APIs don’t directly provide this. ● The SteamVR/OpenVR API on Windows in a separate process spins on calls to IDXGIOutput::WaitForVBlank() and notes the time and increments a frame counter. The application can then call GetTimeSinceLastVSync() that also returns a frame ID. ● GPU vendors, HMD devices, and rendering APIs should provide this
  • 21. “Running Start” Details 21 ● To deal with a bad frame, you need to partially synchronize with the GPU ● We inject a query after clearing the back buffer, submit our entire frame, spin on that query, then call Present() ● This ensures we are on the correct side of VSync for the current frame, and we can now spin until our running start time
  • 22. Why the Query Is Critical 22 ● If a frame is late, the query will keep you on the right side of VSync for the following frame ensuring your prediction remains accurate
  • 23. Running Start Summary 23 ● This is a solid 1.5-2.0ms GPU perf gain! ● You want to see this in NVIDIA Nsight: ● You want to see this in Microsoft’s GPUView:
  • 24. Outline ● VR at Valve ● Methods for Stereo Rendering ● Timing: Scheduling, Prediction, VSync, GPU Bubbles ● Specular Aliasing & Anisotropic Lighting ● Miscellaneous VR Rendering Topics 24
  • 25. Aliasing Is Your Enemy ● The camera (your head) never stops moving. Aliasing is amplified because of this. ● While there are more pixels to render, each pixel fills a larger angle than anything we’ve done before. Here are some averages: ● 2560x1600 30” monitor: ~50 pixels/degree (50 degree H fov) ● 720p 30” monitor: ~25 pixels/degree (50 degree H fov) ● VR: ~15.3 pixels/degree (110 degree fov w/ 1.4x) ● We must increase the quality of our pixels 25
  • 26. 4xMSAA Minimum Quality ● Forward renderers win for antialiasing because MSAA just works ● We use 8xMSAA if perf allows ● Image-space antialiasing algorithms must be compared side-by-side with 4xMSAA and 8xMSAA to know how your renderer will compare to others in the industry ● Jittered SSAA is obviously the best using the HLSL ‘sample’ modifier, but only if you can spare the perf 26
  • 27. Normal Maps Are Not Dead ● Most normal maps work great in VR...mostly. ● What doesn’t work: ● Feature detail larger than a few cm inside tracked volume is bad ● Surface shape inside a tracked volume can’t be in a normal map ● What does work: ● Distant objects outside the tracked volume you can’t inspect up close ● Surface “texture” and fine details: 27
  • 28. Normal Map Mipping Error 28 Blinn-Phong Specular Zoomed out normal map box filtered mips Zoomed out super-sampled 36 samples Expected glossiness Incorrect glossiness
  • 29. Normal Map Mipping Problems ● Any mip filter that just generates an averaged normal loses important roughness information 29
  • 30. Normal Map Visualization 30 4096x4096 Normal Map Fire Alarm 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 31. Normal Map Visualization 31 4096x4096 Normal Map Fire Alarm 8x8 Mip Visualization16x16 Mip Visualization
  • 32. Normal Map Visualization 32 4096x4096 Normal Map Dota 2 Mirana Body 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 33. Normal Map Visualization 33 4096x4096 Normal Map Dota 2 Juggernaut Sword Handle 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 34. Normal Map Visualization 34 4096x4096 Normal Map Shoulder Armor 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 35. Normal Map Visualization 35 4096x4096 Normal Map Metal Siding 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 36. Roughness Encoded in Mips ● We can store a single isotropic value (visualized as the radius of a circle) that is the standard deviation of all 2D tangent normals from the highest mip that contributed to this texel ● We can also store a 2D anisotropic value (visualized as the dimensions of an ellipse) for the standard deviation in X and Y separately that can be used to compute tangent-space axis-aligned anisotropic lighting! 36
  • 38. Add Artist-Authored Roughness ● We author 2D gloss = 1.0 – roughness ● Mip with a simple box filter ● Add/sum it with the normal map roughness at each mip level ● Because we have anisotropic gloss maps anyway, storing the generated normal map roughness is FREE 38 Isotropic Gloss Anisotropic Gloss
  • 39. Tangent-Space Axis-Aligned Anisotropic Lighting ● Standard isotropic lighting is represented along the diagonal ● Anisotropy is aligned with either of the tangent-space axes ● Requires only 2 additional values paired with a 2D tangent normal = Fits into an RGBA texture (DXT5 >95% of the time) 39
  • 40. Roughness to Exponent Conversion 40 void RoughnessEllipseToScaleAndExp( float2 vRoughness, out float o_flDiffuseExponentOut, out float2 o_vSpecularExponentOut, out float2 o_vSpecularScaleOut ) { o_flDiffuseExponentOut = ( ( 1.0 - ( vRoughness.x + vRoughness.y ) * 0.5 ) * 0.8 ) + 0.6; // Outputs 0.6-1.4 o_vSpecularExponentOut.xy = exp2( pow( 1.0 - vRoughness.xy, 1.5 ) * 14.0 ); // Outputs 1-16384 o_vSpecularScaleOut.xy = 1.0 - saturate( vRoughness.xy * 0.5 ); // This is a pseudo energy conserving scalar for the roughness exponent } ● Diffuse lighting is Lambert raised to exponent (N.Lk) where k is in the range 0.6-1.4 ● Experimented with anisotropic diffuse lighting, but not worth the instructions ● Specular exponent range is 1-16,384 and is a modified Blinn-Phong with anisotropy (more on this later)
  • 41. How Anisotropy Is Computed 41 Tangent U Lighting * = Tangent V Lighting Final Lighting * =
  • 42. Shader Code 42 Anisotropic Specular Lighting: float3 vHalfAngleDirWs = normalize( vPositionToLightDirWs.xyz + vPositionToCameraDirWs.xyz ); float3 vSpecularNormalX = vHalfAngleDirWs.xyz - ( vTangentUWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentUWs.xyz ) ); float3 vSpecularNormalY = vHalfAngleDirWs.xyz - ( vTangentVWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentVWs.xyz ) ); float flNDotHX = max( 0.0, dot( vSpecularNormalX.xyz, vHalfAngleDirWs.xyz ) ); float flNDotHkX = pow( flNDotHX, vSpecularExponent.x * 0.5 ); flNDotHkX *= vSpecularScale.x; float flNDotHY = max( 0.0, dot( vSpecularNormalY.xyz, vHalfAngleDirWs.xyz ) ); float flNDotHkY = pow( flNDotHY, vSpecularExponent.y * 0.5 ); flNDotHkY *= vSpecularScale.y; float flSpecularTerm = flNDotHkX * flNDotHkY; Isotropic Diffuse Lighting: float flDiffuseTerm = pow( flNDotL, flDiffuseExponent ) * ( ( flDiffuseExponent + 1.0 ) * 0.5 ); Isotropic Specular Lighting: float flNDotH = saturate( dot( vNormalWs.xyz, vHalfAngleDirWs.xyz ) ); float flNDotHk = pow( flNDotH, dot( vSpecularExponent.xy, float2( 0.5, 0.5 ) ) ); flNDotHk *= dot( vSpecularScale.xy, float2( 0.33333, 0.33333 ) ); // 0.33333 is to match the spec intensity of the aniso algorithm above float flSpecularTerm = flNDotHk; void RoughnessEllipseToScaleAndExp( float2 vRoughness, out float o_flDiffuseExponentOut, out float2 o_vSpecularExponentOut, out float2 o_vSpecularScaleOut ) { o_flDiffuseExponentOut = ( ( 1.0 - ( vRoughness.x + vRoughness.y ) * 0.5 ) * 0.8 ) + 0.6; // Outputs 0.6-1.4 o_vSpecularExponentOut.xy = exp2( pow( 1.0 - vRoughness.xy, 1.5 ) * 14.0 ); // Outputs 1-16384 o_vSpecularScaleOut.xy = 1.0 - saturate( vRoughness.xy * 0.5 ); // This is a pseudo energy conserving scalar for the roughness exponent }
  • 43. Geometric Specular Aliasing ● Dense meshes without normal maps also alias, and roughness mips can’t help you! ● We use partial derivatives of interpolated vertex normals to generate a geometric roughness term that approximates curvature. Here is the hacky math: float3 vNormalWsDdx = ddx( vGeometricNormalWs.xyz ); float3 vNormalWsDdy = ddy( vGeometricNormalWs.xyz ); float flGeometricRoughnessFactor = pow( saturate( max( dot( vNormalWsDdx.xyz, vNormalWsDdx.xyz ), dot( vNormalWsDdy.xyz, vNormalWsDdy.xyz ) ) ), 0.333 ); vRoughness.xy = max( vRoughness.xy, flGeometricRoughnessFactor.xx ); // Ensure we don’t double-count roughness if normal map encodes geometric roughness 43 Visualization of flGeometricRoughnessFactor
  • 44. Geometric Specular Aliasing Part 2 ● MSAA center vs centroid interpolation: It’s not perfect ● Normal interpolation can cause specular sparkling at silhouettes due to over-interpolated vertex normals ● Here’s a trick we are using: ● Interpolate normal twice: once with centroid, once without float3 vNormalWs : TEXCOORD0; centroid float3 vCentroidNormalWs : TEXCOORD1; ● In the pixel shader, choose the centroid normal if normal length squared is greater than 1.01 if ( dot( i.vNormalWs.xyz, i.vNormalWs.xyz ) >= 1.01 ) { i.vNormalWs.xyz = i.vCentroidNormalWs.xyz; } 44
  • 45. Outline ● VR at Valve ● Methods for Stereo Rendering ● Timing: Scheduling, Prediction, VSync, GPU Bubbles ● Specular Aliasing & Anisotropic Lighting ● Miscellaneous VR Rendering Topics 45
  • 46. Normal Map Encoding ● Projecting tangent normals onto Z plane only uses 78.5% of the range of a 2D texel ● Hemi-octahedron encoding uses the full range of a 2D texel ● “A Survey of Efficient Representations for Independent Unit Vectors”, Cigolle et al., Journal of Computer Graphics Techniques Vol. 3, No. 2, 2014 (Image modified from above paper) 46
  • 47. Scale Render Target Resolution ● Turns out, 1.4x is just a recommendation for the HTC Vive (Each HMD design has a different recommended scalar based on optics and panels) ● On slower GPUs, scale the recommended render target scalar down ● On faster GPUs, scale the recommended render target scalar up ● If you’ve got GPU cycles to burn, BURN THEM 47
  • 48. Anisotropic Texture Filtering ● Increases the perceived resolution of the panels (don’t forget, we only have fewer pixels per degree) ● Force this on for color and normal maps ● We use 8x by default ● Disable for everything else. Trilinear only, but measure perf. Anisotropic filtering may be “free” if you are bottlenecked elsewhere. 48
  • 49. Noise Is Your Friend ● Gradients are horrible in VR. Banding is more obvious than LCD TVs. ● We add noise on the way into the framebuffer when we have floating-point precision in the pixel shader float3 ScreenSpaceDither( float2 vScreenPos ) { // Iestyn's RGB dither (7 asm instructions) from Portal 2 X360, slightly modified for VR float3 vDither = dot( float2( 171.0, 231.0 ), vScreenPos.xy + g_flTime ).xxx; vDither.rgb = frac( vDither.rgb / float3( 103.0, 71.0, 97.0 ) ) - float3( 0.5, 0.5, 0.5 ); return ( vDither.rgb / 255.0 ) * 0.375; } 49
  • 50. Environment Maps ● Standard implementation at infinity = only works for sky ● Need to use some type of distance remapping for environment maps ● Sphere is cheap ● Box is more expensive ● Both are useful in different situations ● Read this online article: ● “Image-based Lighting approaches and parallax-corrected cubemaps”, Sébastien Lagarde, 2012 50
  • 51. Stencil Mesh (Hidden Area Mesh) ● Stencil out the pixels you can’t actually see through the lenses. GPUs are fast at early stencil-rejection. ● Alternatively you can render to the depth buffer at near z so everything early z-rejects instead ● Lenses produce radially symmetric distortion which means you effectively see a circular area projected on the panels 51
  • 53. Stencil Mesh (Ideal Warped View) 53
  • 54. Stencil Mesh (Wasted Pixels) 54
  • 57. Stencil Mesh (Final Unwarped View) 57
  • 58. Stencil Mesh (Final Warped View) 58
  • 59. Stencil Mesh (Hidden Area Mesh) ● SteamVR/OpenVR API will provide this mesh to you ● Results in a 17% fill rate reduction! ● No stencil mesh: VR 1512x1680x2 @ 90Hz: 457 million pixels/sec ● 2,540,160 pixels per eye (5,080,320 pixels total) ● With stencil mesh: VR 1512x1680x2 @ 90Hz: 378 million pixels/sec ● About 2,100,000 pixels per eye (4,200,000 pixels total) 59
  • 60. Warp Mesh (Lens Distortion Mesh) 60
  • 62. Warp Mesh (Cull UV’s Outside 0-1) 62
  • 63. Warp Mesh (Cull Stencil Mesh) 63
  • 64. Warp Mesh (Shrink Wrap) 15% of pixels culled from the warp mesh 64
  • 65. Performance Queries Required! ● You are always VSync’d ● Disabling VSync to see framerate will make you dizzy ● Need to use performance queries to report GPU workload ● Simplest implementation is to measure first to last draw call ● Ideally measure these things: ● Idle time from Present() to first draw call ● First draw call to last draw call ● Idle time from last draw call to Present() 65
  • 66. Summary ● Stereo Rendering ● Prediction ● “Running Start” (Saves 1.5-2.0 ms/frame) ● Anisotropic Lighting & Mipping Normal Maps ● Geometric Specular Antialiasing ● Stencil Mesh (Saves 17% pixels rendered) ● Optimized Warp Mesh (Reduces cost by 15%) ● Etc. 66
  • 67. Thank You! Alex Vlachos, Valve Alex@ValveSoftware.com 67