Tessellation on Any Budget
John McDonald
Developer Technology
NVIDIA Corporation
Topics Covered
• Canonical Work Breakdown
• Techniques
• Debugging
• Optimizing
Brief Recap
New Direct3D 11 Stages: HS, TS, DS
IA → VS → HS → TS → DS → GS → RAS → PS → OM
Programmable (Shader): VS, HS, DS, GS, PS
Fixed Function: IA, TS, RAS, OM
Canonical Work Breakdown
• VS: conversion to camera space, control point
animation
• HS (CP): Compute Control Point locations,
compute per-control point culling info
• HS (PC): Use info from HS (CP) to compute
per-edge LOD; cull patches outside frustum
Canonical Work Breakdown cont’d
• DS: Merge Data from HS, TS. Transform from
eye to clip space.
Techniques
• All techniques (except vanilla Flat Dicing) improve silhouettes and lighting
• They do so without the increase in memory consumption that a correspondingly denser mesh would incur
• They also provide continuous LOD
Flat Dicing
• Simplest form of tessellation
• Merely adds triangles where fewer were
previously
• Does not improve silhouettes alone
– Usually paired with displacement mapping
– Can also be used to reduce PS complexity
Flat Dicing Code (HS-CP)
Flat Dicing Code (HS-PC)
Flat Dicing Code (DS)
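The code for these slides did not survive extraction. As a rough illustration only, the domain-shader step of Flat Dicing amounts to a barycentric blend of the three original corners. The sketch below is a CPU-side C++ reference, not the shader from the talk; `Vec3` and `FlatDiceDomainPoint` are hypothetical names.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Flat dicing: each tessellator-generated point is the barycentric blend
// of the three original triangle corners (bary.x + bary.y + bary.z == 1).
// Every new vertex therefore stays in the plane of the input triangle.
inline Vec3 FlatDiceDomainPoint(const std::array<Vec3, 3>& corners, const Vec3& bary)
{
    return {
        corners[0].x * bary.x + corners[1].x * bary.y + corners[2].x * bary.z,
        corners[0].y * bary.x + corners[1].y * bary.y + corners[2].y * bary.z,
        corners[0].z * bary.x + corners[1].z * bary.y + corners[2].z * bary.z,
    };
}
```

Because the result never leaves the triangle's plane, this is why Flat Dicing alone does not improve silhouettes.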
PN
• Originally proposed by Alex Vlachos et al. in
Curved PN Triangles.
• Treats primitives as descriptions of Bezier
surfaces, using each vertex location as a position and
each normal as a description of the surface tangent
at that position
PN Details
• Uses existing vertex and index buffer without
modification
PN Modifications
• PN calls for quadratic interpolation of normals
• This allows for inflection points while lighting
curved triangles
• The inflection points will show up
geometrically
• Skip the quadratic normal for lighting
Quadratic Normals
• Per-pixel lighting would require quadratic
tangents and binormals as well
– Lots of extra math
– Potential for gimbal lock in lighting!
• While correct according to the surface, this
does not match artist intent for lighting
PN Code (HS-CP)
New Code (as compared to Flat Dicing HS-CP)
PN Code (HS-PC)
New Code (as compared to Flat Dicing HS-PC)
PN Code (DS)
New Code (as compared to Flat Dicing DS)
PN – Pros
• Completely procedural
• No memory requirements
PN – Cons
• Meshes can become
“Stay Puft”, particularly
around the feet
• C1 discontinuities (same position, different
normal) in the input result in C0 discontinuities (cracks) in the output!
• Fixing discontinuities requires artist
involvement
PN-AEN
• Proposed by John McDonald and Mark Kilgard
in Crack-Free Point-Normal Triangles Using
Adjacent Edge Normals
• Uses PN with a twist: neighbor information,
determined during a preprocess step, is used to avoid
cracking
PN-AEN Details
• Uses existing VB without modification, but a
second IB must be generated
• Tool available from NVIDIA to
generate second IB
automatically (works for all
vendors)
PN-AEN Code (HS-CP)
New Code (as compared to PN HS-CP)
[Figure: per-patch index layout a b c d e f g h i, with the label AddtlData]
PN-AEN Code (HS-PC)
New Code (as compared to PN HS-PC)
PN-AEN Code (DS)
PN-AEN – Pros
• Completely Procedural
• Small memory overhead
[Figure: cube with normals, tessellated with PN and with PN-AEN]
PN-AEN – Cons
• More expensive (runtime cost) than PN
• Still can result in some ‘Stay Pufting’ of
meshes
• No artist involvement means less artistic
control
• Requires a second index buffer with 9 indices per patch
Displacement Mapping
• Used together with another
tessellation technique (often
Flat Dicing)
• Texture controls displacement
at each generated vertex
during Domain Shading
Wretch used courtesy Epic Games, Inc
Displacement Mapping Details
• Requires displacement map to be authored
– Although tools exist to generate from
normal maps
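As a small illustration of the displacement step, each generated vertex can be pushed along its interpolated normal by the sampled height. The sketch below is CPU-side C++ with the texture fetch replaced by a plain `height` argument; the `scale` and `bias` parameters and the function name are assumptions, not something the slides specify.

```cpp
struct Vec3 { float x, y, z; };

// Moves 'position' along 'unitNormal' by the sampled displacement.
// 'height' stands in for the value read from the displacement map;
// 'scale' and 'bias' are hypothetical artist-facing controls that remap
// the stored value to object-space units.
inline Vec3 DisplaceAlongNormal(Vec3 position, Vec3 unitNormal,
                                float height, float scale, float bias)
{
    float offset = height * scale + bias;
    return {
        position.x + unitNormal.x * offset,
        position.y + unitNormal.y * offset,
        position.z + unitNormal.z * offset,
    };
}
```

With a bias of minus half the scale, a mid-gray height of 0.5 leaves the surface where it was, which is one common way to let a map both raise and lower geometry.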
Displacement Mapping – Pros
• High impact silhouette and lighting
adjustments
• “Pay as you go”: Easy to add displacement to
“key” assets without adding to all assets
Displacement Mapping – Cons
• Care must be taken to avoid:
– Mesh swimming when LOD changes
– Cracks between patches
• Artist involvement means money being spent
Continuity
• Games have had continuity errors forever
• Normal/lighting discontinuities break C1.
• Tessellation, particularly displacement
mapping, makes breaking C0 easy.
• This is undesirable
Discontinuity
• What causes discontinuities?
– Vertices with same position pre-tessellation, but
different position post-tessellation
– Math Errors (Edge LOD calculation for triangles)
– Displacing along normals when normals are
disjoint
– Sampling Errors
Discontinuity
Wretch used Courtesy of Epic Games
Sampling Errors?!
• Impossible to ensure bit-accurate samples
across texture discontinuities
• With normal maps, this causes a lighting seam
• With tessellation, this causes a surface
discontinuity
Discontinuity Solution
• For each patch, store dominant
edge/dominant vertex information
• Detect that you’re at an edge or corner
• If so, sample UVs from dominant information
instead of self.
• Everyone agrees, cracks are gone!
Discontinuity Solution cont’d
• Orthogonal to choice of tessellation (works
everywhere!)
[Figure: per-patch index layout a b c d e f g h i j k l, with groups labeled Dominant Edges and Dominant Verts]
Discontinuity Code (HS-CP)
New Code (as compared to Flat Dicing HS-CP)
Flat + Displacement (DS)
New Code (as compared to Flat Dicing DS)
Discontinuity Code (DS)
New Code (as compared to Flat + Displacement DS)
Other Tessellation Techniques
• Phong Tessellation
– Works with existing assets
– No artist intervention required
– Suffers from the same issue as PN (C1 input discontinuities
result in C0 output discontinuities)
Other Tessellation Techniques
• Catmull Clark sub-d surfaces
– Great for the future
– Not great for fitting into existing engines
• NURBS
– Not a great fit for the GPU
– Definitely not a good fit for existing engines
Summary / Questions
Technique    | Production Cost                 | Runtime Cost                        | Value Add
Flat Dicing  | Free                            | ~Free                               | May improve perf, good basis for other techniques
PN           | May require art fixes           | Small runtime overhead              | Improved silhouettes/lighting
PN-AEN       | Free                            | Additional indices pulled versus PN | Crack free, better silhouettes/lighting, preserve hard edges
Displacement | Requires art, but pay as you go | 1 texture lookup                    | Works with other techniques, allows very fine detail
Debugging Techniques
• Verify your conventions
– Output Barycentric coordinates as
diffuse color
• Reduce shader to flat
tessellation, add pieces back
• Remove clipping, clever optimizations
[Figure: barycentric coordinates as colors]
Debugging Techniques cont’d
• Edge LOD specification for triangles is
surprising
• NVIDIA Parallel Nsight: version 1.51 is
available for free
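One likely reading of the edge LOD bullet above: in the Direct3D 11 triangle domain, edge tessellation factor i applies to the edge where barycentric coordinate i is zero, so it belongs to the edge opposite control point i rather than the edge starting at it. The sketch below encodes that mapping in C++, assuming positions are interpolated as u*cp0 + v*cp1 + w*cp2. The helper name is hypothetical, and it is worth confirming the convention in your own shaders with the barycentric color output described earlier.

```cpp
#include <array>

// For the tri domain, tessellation factor i applies to the edge where
// barycentric coordinate i is zero, i.e. the edge opposite control point i.
// Returns the indices of the two control points that bound that edge,
// assuming position = u*cp0 + v*cp1 + w*cp2.
inline std::array<int, 2> TriEdgeControlPoints(int tessFactorIndex)
{
    return { (tessFactorIndex + 1) % 3, (tessFactorIndex + 2) % 3 };
}
```

Getting this mapping wrong makes neighboring patches disagree about a shared edge's factor, which is one source of the math-error cracks listed under Discontinuity.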
Optimization Strategies
• Work at the lowest frequency appropriate
• Be aware that with poor LOD computation, the DS
could run more often than the PS.
• Shade control points in the Vertex Shader to
leverage the post-transform vertex cache (V$)
• Clipping saves significant workload
Optimization Strategies cont’d
• Code should be written to maximize SIMD
parallelism
• Prefer shorter Patch Constant shaders (only
one thread per patch)
Optimization Strategies cont’d
• Avoid tessellation factors <2 if possible
– Paying for the GPU to tessellate when expansion is
very low is all cost and no benefit
• Avoid tessellation factors that would result in
triangles smaller than 3 pixels per side (the PS will still
operate at quad granularity)
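One way to respect the triangle-size guidance above is to derive each edge factor from the edge's projected length and a target segment size in pixels. The C++ sketch below is illustrative only; the function name and the `targetPixelsPerSegment` parameter are assumptions, while the upper bound of 64 is the Direct3D 11 maximum tessellation factor.

```cpp
#include <algorithm>

// Chooses an edge tessellation factor so that each generated segment
// covers roughly 'targetPixelsPerSegment' pixels on screen.
// The result is clamped to [1, 64]; 64 is the Direct3D 11 maximum.
inline float EdgeTessFactor(float edgeLengthPixels, float targetPixelsPerSegment)
{
    const float kMaxTessFactor = 64.0f;
    float factor = edgeLengthPixels / targetPixelsPerSegment;
    return std::min(std::max(factor, 1.0f), kMaxTessFactor);
}
```

For example, `EdgeTessFactor(30.0f, 3.0f)` yields 10, while very short edges fall back to 1 and very long ones saturate at 64.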
Questions?
• jmcdonald at nvidia dot com
NVIDIA @ GDC 2011
CAN’T GET ENOUGH? MORE WAYS TO LEARN:
NVIDIA GAME TECHNOLOGY THEATER
Wed, March 2nd
and Fri, March 4th
@ NVIDIA Booth
Open to all attendees. Featuring talks and demos from leading developers at game studios and
more, covering a wide range of topics on the latest in GPU game technology.
NVIDIA DEVELOPER SESSIONS
All Day Thurs, March 3rd
@ Room 110, North Hall E
Open to all attendees. Full schedule on www.nvidia.com/gdc2011
MORE DEVELOPER TOOLS & RESOURCES
Available online 24/7 @ developer.nvidia.com
WE’RE HIRING
More info @ careers.nvidia.com
NVIDIA Booth
South Hall #1802
Details on schedule and to
download copies of
presentations visit
www.nvidia.com/gdc2011
Appendix
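// Projects an eye-space position to clip space using only the non-zero terms
// of a standard perspective matrix. Assumes a row-vector matrix layout whose
// w output equals eye-space z (that is, projMatrix[2][3] == 1).
float4 ApplyProjection(float4x4 projMatrix, float3 eyePosition)
{
    float4 clipPos;
    clipPos[0] = projMatrix[0][0] * eyePosition[0];
    clipPos[1] = projMatrix[1][1] * eyePosition[1];
    clipPos[2] = projMatrix[2][2] * eyePosition[2] + projMatrix[3][2];
    clipPos[3] = eyePosition[2];
    return clipPos;
}

// Projects an eye-space position to normalized device coordinates, then scales it.
// g_f4ViewportScale and g_f4TessFactors are shader constants that are not
// defined in these slides.
float2 ProjectAndScale(float4x4 projMatrix, float3 inPos)
{
    float4 posClip = ApplyProjection(projMatrix, inPos);
    float2 posNDC = posClip.xy / posClip.w;
    return posNDC * g_f4ViewportScale.xy / g_f4TessFactors.z;
}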
Appendix
// Returns the minimum of the three per-point IsClipped results.
// IsClipped is not defined in these slides.
float ComputeClipping(float4x4 projMatrix, float3 cpA, float3 cpB, float3 cpC)
{
    float4 projPosA = ApplyProjection(projMatrix, cpA),
           projPosB = ApplyProjection(projMatrix, cpB),
           projPosC = ApplyProjection(projMatrix, cpC);
    return min(min(IsClipped(projPosA), IsClipped(projPosB)), IsClipped(projPosC));
}
Note: This isn’t quite correct for clipping. It will clip primitives that are so close to the camera that
the control points are all out of bounds. Correct clipping would store in-bounds/out-of-bounds
for each plane, then determine whether all points failed any one plane.
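The per-plane test that the note describes can be sketched with clip-space outcodes. This is a CPU-side C++ illustration under stated assumptions, not code from the talk: `Vec4`, `OutCode` and `IsPatchCulled` are hypothetical names, and the plane tests assume the Direct3D clip volume (-w <= x <= w, -w <= y <= w, 0 <= z <= w).

```cpp
struct Vec4 { float x, y, z, w; };

// One bit per frustum plane; a bit is set when the point is outside it.
inline unsigned OutCode(const Vec4& p)
{
    unsigned code = 0u;
    if (p.x < -p.w) code |= 1u;    // left
    if (p.x >  p.w) code |= 2u;    // right
    if (p.y < -p.w) code |= 4u;    // bottom
    if (p.y >  p.w) code |= 8u;    // top
    if (p.z < 0.0f) code |= 16u;   // near (Direct3D convention)
    if (p.z >  p.w) code |= 32u;   // far
    return code;
}

// A patch can be culled only if all three points are outside the SAME
// plane, which is what a non-zero bitwise AND of the outcodes means.
inline bool IsPatchCulled(const Vec4& a, const Vec4& b, const Vec4& c)
{
    return (OutCode(a) & OutCode(b) & OutCode(c)) != 0u;
}
```

A large triangle whose corners are each off-screen on a different side is kept by this test, whereas a per-point "is clipped" minimum would discard it. Displacement or PN curvature can still move geometry back into view, so some conservative margin may be needed in practice.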
Appendix
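// Approximates the screen-space length of an edge from its four control points
// (cpA through cpD) by summing the lengths of the projected control polygon
// segments. The result is clamped to a minimum of 1.
float ComputeEdgeLOD(float4x4 projMatrix, float3 cpA, float3 cpB, float3 cpC, float3 cpD)
{
    float2 projCpA = ProjectAndScale(projMatrix, cpA).xy,
           projCpB = ProjectAndScale(projMatrix, cpB).xy,
           projCpC = ProjectAndScale(projMatrix, cpC).xy,
           projCpD = ProjectAndScale(projMatrix, cpD).xy;
    float edgeLOD = distance(projCpA, projCpB)
                  + distance(projCpB, projCpC)
                  + distance(projCpC, projCpD);
    return max(edgeLOD, 1);
}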
References
• Vlachos, Alex, Jörg Peters, Chas Boyd, and Jason L. Mitchell. "Curved PN Triangles."
Proceedings of the 2001 Symposium on Interactive 3D Graphics (2001): 159-66. Print.
• McDonald, John, and Mark Kilgard. "Crack-Free Point-Normal Triangles Using Adjacent Edge
Normals." developer.nvidia.com. NVIDIA Corporation, 21 Dec. 2010. Web. 25 Feb. 2011.

Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 

Tessellation on any_budget-gdc2011

  • 2. Tessellation on Any Budget John McDonald Developer Technology NVIDIA Corporation
  • 3. Topics Covered • Canonical Work Breakdown • Techniques • Debugging • Optimizing
  • 4. Brief Recap: New Direct3D 11 Stages. Pipeline: IA → VS → HS → TS → DS → GS → RAS → PS → OM. Programmable (Shader): VS, HS, DS, GS, PS. Fixed Function: IA, TS, RAS, OM.
  • 5. Canonical Work Breakdown • VS: conversion to camera space, control point animation • HS (CP): Compute Control Point locations, compute per-control point culling info • HS (PC): Use info from HS (CP) to compute per-edge LOD; cull patches outside frustum
  • 6. Canonical Work Breakdown cont’d • DS: Merge Data from HS, TS. Transform from eye to clip space.
  • 7. Techniques • All techniques (except vanilla Flat Dicing) will improve silhouettes and lighting • But don’t incur the corresponding increase in memory consumption • And continuous LOD!
  • 8. Flat Dicing • Simplest form of tessellation • Merely adds triangles where fewer were previously • Does not improve silhouettes alone – Usually paired with displacement mapping – Can also be used to reduce PS complexity
  • 10. Flat Dicing Code (HS-CP)
  • 11. Flat Dicing Code (HS-PC)
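The Flat Dicing code slides are images and are not reproduced in this transcript. As a rough CPU-side C++ sketch (not the talk's HLSL), the domain-shader step of Flat Dicing amounts to a barycentric blend of the three patch corners. The names `Vec3` and `flatDice` are hypothetical and chosen only for illustration.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Flat Dicing places every generated vertex on the original flat triangle:
// a weighted sum of the three corners, where bary = (u, v, w) and u + v + w = 1.
// (Hypothetical helper; the talk's domain shader is not shown in the transcript.)
Vec3 flatDice(const std::array<Vec3, 3>& cp, const Vec3& bary) {
    return {
        bary.x * cp[0].x + bary.y * cp[1].x + bary.z * cp[2].x,
        bary.x * cp[0].y + bary.y * cp[1].y + bary.z * cp[2].y,
        bary.x * cp[0].z + bary.y * cp[1].z + bary.z * cp[2].z,
    };
}
```

Because the result never leaves the plane of the input triangle, this matches the slide's point that Flat Dicing alone does not improve silhouettes.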
  • 13. PN • Originally proposed by Alex Vlachos, et al, in Curved PN Triangles. • Treats primitives as descriptions of Bezier Surfaces, using the location as a position and normal as a description of tangent of surface at that position
  • 14. PN Details • Uses existing vertex and index buffer without modification
  • 15. PN Modifications • PN calls for quadratic interpolation of normals • This allows for inflection points while lighting curved triangles • The inflection points will show up geometrically • Skip the quadratic normal for lighting
  • 16. Quadratic Normals • Per-pixel lighting would require quadratic tangents and binormals as well – Lots of extra math – Potential for gimbal lock in lighting! • While correct according to the surface, this does not match artist intent for lighting
  • 17. PN Code (HS-CP) New Code (as compared to Flat Dicing HS-CP)
  • 18. PN Code (HS-PC) New Code (as compared to Flat Dicing HS-PC)
  • 19. PN Code (DS) New Code (as compared to Flat Dicing DS)
  • 20. PN – Pros • Completely procedural • No memory requirements
  • 21. PN – Cons • Meshes can become “Stay Puft”, particularly around the feet • C1 discontinuities (same position, different normal) in input result in C0 discontinuity! • Fixing discontinuities requires artist involvement
  • 22. PN-AEN • Proposed by John McDonald and Mark Kilgard in Crack-Free Point-Normal Triangles Using Adjacent Edge Normals • Uses PN with a twist—neighbor information determined during a preprocess step to avoid cracking
  • 23. PN-AEN Details • Uses existing VB without modification, but a second IB must be generated • Tool available from NVIDIA to generate second IB automatically (works for all vendors)
  • 24. PN-AEN Code (HS-CP) New Code (as compared to PN HS-CP) [Figure labels: indices a b c d e f g h i; AddtlData]
  • 25. PN-AEN Code (HS-PC) New Code (as compared to PN HS-PC)
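A minimal CPU-side C++ sketch of the idea behind PN-AEN follows, assuming (as the cited paper describes, to the best of my reading) that each edge control point is the average of the point computed with the patch's own normal and the point computed with the normal of the adjacent patch on that edge. The names `edgeControlPoint` and `aenEdgeControlPoint` are hypothetical; in the real technique the neighbor's normal is fetched through the second index buffer.

```cpp
struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(float s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Plain PN edge control point: one third along pi -> pj, projected onto
// the tangent plane of the unit normal ni at pi.
Vec3 edgeControlPoint(Vec3 pi, Vec3 ni, Vec3 pj) {
    float w = dot(pj - pi, ni);
    return (1.0f / 3.0f) * (2.0f * pi + pj - w * ni);
}

// PN-AEN style edge control point (sketch): average the point built from
// this patch's normal with the point built from the neighboring patch's
// normal at the same position. Both patches then compute the same value.
Vec3 aenEdgeControlPoint(Vec3 pi, Vec3 pj, Vec3 ownNormal, Vec3 neighborNormal) {
    Vec3 own = edgeControlPoint(pi, ownNormal, pj);
    Vec3 other = edgeControlPoint(pi, neighborNormal, pj);
    return 0.5f * (own + other);
}
```

With plain PN, two patches that share a position but disagree on the normal produce different edge control points, which is the crack. Averaging is symmetric in the two normals, so both sides of the edge agree.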
  • 27. PN-AEN – Pros • Completely Procedural • Small memory overhead [Figure panels: Cube with normals; PN; PN-AEN]
  • 28. PN-AEN – Cons • More expensive (runtime cost) than PN • Still can result in some ‘Stay Pufting’ of meshes • No artist involvement means less artistic control • Requires second index buffer, 9 indices per primitive
  • 29. Displacement Mapping • Used together with another tessellation technique (often Flat Dicing) • Texture controls displacement at each generated vertex during Domain Shading [Image credit: Wretch used courtesy Epic Games, Inc.]
  • 30. Displacement Mapping Details • Requires displacement map to be authored – Although tools exist to generate from normal maps
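As an illustration of the displacement step, here is a small CPU-side C++ sketch: a height is looked up at the generated vertex's UV and the vertex is moved along its normal. The `HeightMap` type, its nearest-texel lookup, and the `scale` parameter are assumptions made for this example; a real domain shader would sample a texture instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Tiny stand-in for a displacement texture (hypothetical, nearest-texel only).
struct HeightMap {
    int width, height;
    std::vector<float> texels;  // row-major, width * height values

    float sample(float u, float v) const {
        int x = std::min(std::max(static_cast<int>(u * width), 0), width - 1);
        int y = std::min(std::max(static_cast<int>(v * height), 0), height - 1);
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// Move a generated vertex along its unit normal by the sampled height.
Vec3 displace(Vec3 p, Vec3 unitNormal, const HeightMap& map,
              float u, float v, float scale) {
    float h = map.sample(u, v) * scale;
    return {p.x + h * unitNormal.x, p.y + h * unitNormal.y, p.z + h * unitNormal.z};
}
```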
  • 31. Displacement Mapping – Pros • High impact silhouette and lighting adjustments • “Pay as you go”: Easy to add displacement to “key” assets without adding to all assets
  • 32. Displacement Mapping – Cons • Care must be taken to avoid: – Mesh swimming when LOD changes – Cracks between patches • Artist involvement means money being spent
  • 33. Continuity • Games have had continuity errors forever • Normal/lighting discontinuities break C1. • Tessellation, particularly displacement mapping, makes breaking C0 easy. • This is undesirable
  • 34. Discontinuity • What causes discontinuities? – Vertices with same position pre-tessellation, but different position post-tessellation – Math Errors (Edge LOD calculation for triangles) – Displacing along normals when normals are disjoint – Sampling Errors
  • 36. Sampling Errors?! • Impossible to ensure bit-accurate samples across texture discontinuities • With normal maps, this causes a lighting seam • With tessellation, this causes a surface discontinuity
  • 37. Discontinuity Solution • For each patch, store dominant edge/dominant vertex information • Detect that you’re at an edge or corner • If so, sample UVs from dominant information instead of self. • Everyone agrees, cracks are gone!
  • 38. Discontinuity Solution cont’d • Orthogonal to choice of tessellation (works everywhere!) [Figure labels: a b c d e f g h i j k l; Dominant Edges; Dominant Verts]
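The dominant edge / dominant vertex selection can be sketched on the CPU as follows. This is an interpretation of the slides, not the talk's HLSL: the `PatchUVs` layout, the name `displacementUV`, and the assumption that domain locations are exactly 0 or 1 on patch edges and corners are all hypothetical.

```cpp
#include <array>

struct Vec2 { float u, v; };

Vec2 operator+(Vec2 a, Vec2 b) { return {a.u + b.u, a.v + b.v}; }
Vec2 operator*(float s, Vec2 a) { return {s * a.u, s * a.v}; }

// Per-patch UV data (assumed layout): own corner UVs, plus the UVs that the
// agreed "dominant" patch uses for each edge and each corner.
struct PatchUVs {
    std::array<Vec2, 3> own;
    // dominantEdge[i] holds UVs for the edge opposite corner i,
    // ordered as (corner (i+1)%3, corner (i+2)%3).
    std::array<std::array<Vec2, 2>, 3> dominantEdge;
    std::array<Vec2, 3> dominantVert;
};

// Choose the UV used for the displacement lookup at barycentric location bary.
Vec2 displacementUV(const PatchUVs& p, const std::array<float, 3>& bary) {
    // Corner: every patch touching this vertex samples the dominant vertex UV.
    for (int i = 0; i < 3; ++i)
        if (bary[i] == 1.0f) return p.dominantVert[i];
    // Edge: both patches sharing the edge interpolate the dominant edge UVs.
    for (int i = 0; i < 3; ++i) {
        if (bary[i] == 0.0f) {
            int j = (i + 1) % 3, k = (i + 2) % 3;
            return bary[j] * p.dominantEdge[i][0] + bary[k] * p.dominantEdge[i][1];
        }
    }
    // Interior: the patch's own UVs are safe to use.
    return bary[0] * p.own[0] + bary[1] * p.own[1] + bary[2] * p.own[2];
}
```

Since neighbors sample the same texel for shared edges and corners, they displace shared vertices by the same amount, which is what closes the cracks.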
  • 39. Discontinuity Code (HS-CP) New Code (as compared to Flat Dicing HS-CP)
  • 40. Flat + Displacement (DS) New Code (as compared to Flat Dicing DS)
  • 41. Discontinuity Code (DS) New Code (as compared to Flat + Displacement DS)
  • 42. Other Tessellation Techniques • Phong Tessellation – Works with existing assets – No artist intervention required – Suffers same issue as PN (C1 input discontinuity results in C0 output discontinuity)
  • 43. Other Tessellation Techniques • Catmull Clark sub-d surfaces – Great for the future – Not great for fitting into existing engines • NURBS – Not a great fit for the GPU – Definitely not a good fit for existing engines
  • 44. Summary / Questions
        Technique | Production Cost | Runtime Cost | Value Add
        Flat Dicing | Free | ~Free | May improve perf, good basis for other techniques
        PN | May require art fixes | Small runtime overhead | Improved silhouettes/lighting
        PN-AEN | Free | Additional indices pulled versus PN | Crack free, better silhouettes/lighting, preserve hard edges
        Displacement | Requires art, but pay as you go | 1 texture lookup | Works with other techniques, allows very fine detail
  • 45. Debugging Techniques • Verify your conventions – Output Barycentric coordinates as diffuse color • Reduce shader to flat tessellation, add pieces back • Remove clipping, clever optimizations [Figure: Barycentric Coordinates as colors]
  • 46. Debugging Techniques cont’d • Edge LOD specification for triangles is surprising • Parallel Nsight – Version 1.51 is available for free
  • 47. Optimization Strategies • Work at the lowest frequency appropriate • Be aware that with poor LOD computation, DS could run more than the PS. • Shade Control Points in Vertex Shader to leverage V$ • Clipping saves significant workload
  • 48. Optimization Strategies cont’d • Code should be written to maximize SIMD parallelism • Prefer shorter Patch Constant shaders (only one thread per patch)
  • 49. Optimization Strategies cont’d • Avoid tessellation factors <2 if possible – Paying for GPU to tessellate when expansion is very low is just cost—no benefit • Avoid tessellation factors that would result in triangles smaller than 3 pix/side (PS will still operate at quad-granularity)
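The two tessellation-factor guidelines can be expressed as a small clamp. The C++ sketch below is one possible reading, not code from the talk: `edgeTessFactor` and `worthTessellating` are hypothetical names, the edge length is assumed to be already measured in pixels, and 64 is the Direct3D 11 maximum tessellation factor.

```cpp
#include <algorithm>

constexpr float kMinPixelsPerSegment = 3.0f;  // slide: avoid triangles under 3 px/side
constexpr float kMaxTessFactor = 64.0f;       // Direct3D 11 upper limit

// Edge factor such that each generated segment covers at least
// max(pixelsPerSegment, 3) pixels, clamped to the valid range [1, 64].
float edgeTessFactor(float edgeLengthPixels, float pixelsPerSegment) {
    float segment = std::max(pixelsPerSegment, kMinPixelsPerSegment);
    float factor = edgeLengthPixels / segment;
    return std::min(std::max(factor, 1.0f), kMaxTessFactor);
}

// Assumption: patches whose factor stays below 2 gain little from the
// tessellator and could be routed to an untessellated path instead.
bool worthTessellating(float factor) { return factor >= 2.0f; }
```

For example, a 30-pixel edge with a 3-pixel target yields a factor of 10, while a 2-pixel edge clamps to 1 and would not be worth tessellating under this reading.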
  • 50. Questions? • jmcdonald at nvidia dot com
  • 51. NVIDIA Confidential NVIDIA @ GDC 2011 CAN’T GET ENOUGH? MORE WAYS TO LEARN: NVIDIA GAME TECHNOLOGY THEATER Wed, March 2nd and Fri, March 4th @ NVIDIA Booth Open to all attendees. Featuring talks and demos from leading developers at game studios and more, covering a wide range of topics on the latest in GPU game technology. NVIDIA DEVELOPER SESSIONS All Day Thurs, March 3rd @ Room 110, North Hall E Open to all attendees. Full schedule on www.nvidia.com/gdc2011 MORE DEVELOPER TOOLS & RESOURCES Available online 24/7 @ developer.nvidia.com WE’RE HIRING More info @ careers.nvidia.com NVIDIA Booth South Hall #1802 Details on schedule and to download copies of presentations visit www.nvidia.com/gdc2011
  • 52. Appendix
        float4 ApplyProjection(float4x4 projMatrix, float3 eyePosition)
        {
            float4 clipPos;
            clipPos[0] = projMatrix[0][0] * eyePosition[0];
            clipPos[1] = projMatrix[1][1] * eyePosition[1];
            clipPos[2] = projMatrix[2][2] * eyePosition[2] + projMatrix[3][2];
            clipPos[3] = eyePosition[2];
            return clipPos;
        }

        float2 ProjectAndScale(float4x4 projMatrix, float3 inPos)
        {
            float4 posClip = ApplyProjection(projMatrix, inPos);
            float2 posNDC = posClip.xy / posClip.w;
            return posNDC * g_f4ViewportScale.xy / g_f4TessFactors.z;
        }
  • 53. Appendix
        float ComputeClipping(float4x4 projMatrix, float3 cpA, float3 cpB, float3 cpC)
        {
            float4 projPosA = ApplyProjection(projMatrix, cpA),
                   projPosB = ApplyProjection(projMatrix, cpB),
                   projPosC = ApplyProjection(projMatrix, cpC);
            return min(min(IsClipped(projPosA), IsClipped(projPosB)), IsClipped(projPosC));
        }
        Note: This isn’t quite correct for clipping. It will clip primitives that are so close to the camera that the control points are all out of bounds. The correct clipping would store in-bounds/out-of-bounds for each plane, then determine if all points failed any one plane.
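The per-plane test described in the note can be sketched on the CPU like this. It is a hedged illustration rather than the talk's shader: `outsideMask` and `cullPatch` are hypothetical names, and the plane tests assume Direct3D clip-space conventions (x and y within [-w, w], z within [0, w]).

```cpp
#include <cstdint>

struct Vec4 { float x, y, z, w; };

// One bit per frustum plane that the clip-space point lies outside of.
std::uint8_t outsideMask(const Vec4& c) {
    std::uint8_t m = 0;
    if (c.x < -c.w) m |= 1u << 0;  // left
    if (c.x >  c.w) m |= 1u << 1;  // right
    if (c.y < -c.w) m |= 1u << 2;  // bottom
    if (c.y >  c.w) m |= 1u << 3;  // top
    if (c.z <  0.f) m |= 1u << 4;  // near
    if (c.z >  c.w) m |= 1u << 5;  // far
    return m;
}

// Cull only when all three control points fail the same plane.
bool cullPatch(const Vec4& a, const Vec4& b, const Vec4& c) {
    return (outsideMask(a) & outsideMask(b) & outsideMask(c)) != 0;
}
```

A large triangle near the camera whose corners fall outside different planes keeps a zero combined mask and therefore survives, which is the case the simpler all-points-out-of-bounds test gets wrong.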
  • 54. Appendix
        float ComputeEdgeLOD(float4x4 projMatrix, float3 cpA, float3 cpB, float3 cpC, float3 cpD)
        {
            float2 projCpA = ProjectAndScale(projMatrix, cpA).xy,
                   projCpB = ProjectAndScale(projMatrix, cpB).xy,
                   projCpC = ProjectAndScale(projMatrix, cpC).xy,
                   projCpD = ProjectAndScale(projMatrix, cpD).xy;
            float edgeLOD = distance(projCpA, projCpB)
                          + distance(projCpB, projCpC)
                          + distance(projCpC, projCpD);
            return max(edgeLOD, 1);
        }
  • 55. References • Vlachos, Alex, Jorg Peters, Chas Boyd, and Jason L. Mitchell. "Curved PN Triangles." Proceedings of the 2001 Symposium on Interactive 3D Graphics (2001): 159-66. Print. • McDonald, John, and Mark Kilgard. "Crack-Free Point-Normal Triangles Using Adjacent Edge Normals." developer.nvidia.com. NVIDIA Corporation, 21 Dec. 2010. Web. 25 Feb. 2011.

Editor's Notes

  1. The index buffer is showing the indices laid out for a single primitive.
  2. C1: Normal discontinuity; C0: Surface discontinuity