Solving Some Common Problems in a Modern Deferred Rendering Engine
Jose Luis Sanchez Bonet
Tomasz Stachowiak
/* @h3r2tic */
Deferred rendering – pros and cons
• Pros ( some )
– Very scalable
– No shader permutation explosion
– G-Buffer useful in other techniques
• SSAO, SRAA, decals, …
• Cons ( some )
– Difficult to use multiple shading models
– Does not handle translucent geometry
• Some variants do, but may be impractical
Reflectance models
• The BRDF defines the look of a surface
– Bidirectional Reflectance Distribution Function
$$L_o = L_e + \int_{\Omega} L_i \cdot f_r \cdot \cos\theta \, d\omega$$
• Typically games use just one ( Blinn-Phong )
– Simple, but inaccurate
• Very important in physically based rendering
– Want more: Oren-Nayar, Kajiya-Kay, Penner, Cook-Torrance, …
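As a rough illustration of the reflectance integral, the sketch below numerically integrates a constant BRDF against constant incoming radiance over the hemisphere. The Lambertian BRDF (albedo / π), the function names and the step count are assumptions chosen for illustration; they are not taken from the talk.

```python
import math

def lambert_brdf(albedo):
    """Lambertian BRDF: a constant albedo / pi (illustrative choice of BRDF)."""
    return albedo / math.pi

def reflected_radiance(brdf_value, incoming_radiance, emitted=0.0, steps=256):
    """Midpoint-rule estimate of L_o = L_e + integral(L_i * f_r * cos(theta) d_omega)
    over the hemisphere, for a constant BRDF and constant incoming radiance."""
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        # Solid angle of the ring of directions at this polar angle.
        d_omega = 2.0 * math.pi * math.sin(theta) * d_theta
        total += incoming_radiance * brdf_value * math.cos(theta) * d_omega
    return emitted + total

l_o = reflected_radiance(lambert_brdf(0.5), incoming_radiance=1.0)
```

With these inputs the cosine-weighted hemisphere integrates to π, so the estimate should land close to the albedo.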
BRDFs vs. rendering
• Forward rendering
– Material shader directly evaluates the BRDF
• Trivial
• Deferred rendering
– Light shaders decoupled from materials
– No obvious solution
[Diagram: Material → G-Buffer → Light; where does the BRDF go?]
BRDFs vs. deferred – branching?
• Read shading model ID in the lighting shader, branch
• Might be the way to go on next-gen
• Expensive on current consoles
– Tax for branches never taken
• Don’t want to pay it for every light
Three different BRDFs, only one used ( branch always yields the first one )
Platform   1 BRDF    2 BRDFs                 3 BRDFs
360        1.85 ms   2.1 ms  ( + 0.25 ms )   2.35 ms ( + 0.5 ms )
PS3        1.9 ms    2.48 ms ( + 0.58 ms )   2.8 ms  ( + 0.9 ms )
BRDFs vs. deferred – LUTs?
• Pre-calculate BRDF look-up tables
• Might be shippable enough
– See: S.T.A.L.K.E.R.
• Limited control over parameters
– Roughness
– Anisotropy, etc.
• BRDFs highly dimensional
– Isotropic with roughness control → 3D LUT
BRDFs vs. deferred – our approach
• One default BRDF
– Others a relatively rare case
• Shading model ID in stencil
• Multi-pass light rendering
• Mask out parts of the scene in each pass
Multi-pass – tax avoidance
• For each light
– Find all affected BRDFs
– Render the light volume once for each model
• Analogous to multi-pass forward rendering!
• Store bounding volumes of objects with non-standard BRDFs
– Intersect with light volumes
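The per-light search for affected BRDFs could be sketched as follows. Bounding spheres, the `Sphere` type, the `DEFAULT_BRDF` constant and the assumption that the default model is always rendered are all hypothetical choices for illustration; the slides do not specify the volume representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sphere:
    center: tuple  # (x, y, z)
    radius: float

DEFAULT_BRDF = 0  # hypothetical ID for the default shading model

def spheres_intersect(a, b):
    """True when two bounding spheres overlap or touch."""
    dx, dy, dz = (a.center[i] - b.center[i] for i in range(3))
    reach = a.radius + b.radius
    return dx * dx + dy * dy + dz * dz <= reach * reach

def brdfs_affected_by(light_volume, special_objects):
    """special_objects: list of (brdf_id, bounding Sphere) for non-standard BRDFs."""
    affected = {DEFAULT_BRDF}  # assumption: the default model is always lit
    for brdf_id, bounds in special_objects:
        if spheres_intersect(light_volume, bounds):
            affected.add(brdf_id)
    return sorted(affected)

def light_passes(lights, special_objects):
    """One (light_index, brdf_id) light-volume pass per affected shading model."""
    return [(i, brdf)
            for i, light in enumerate(lights)
            for brdf in brdfs_affected_by(light, special_objects)]

specials = [(1, Sphere((0, 0, 0), 1.0)), (2, Sphere((10, 0, 0), 1.0))]
lights = [Sphere((1, 0, 0), 1.0), Sphere((20, 0, 0), 2.0)]
passes = light_passes(lights, specials)
```

A light that touches no non-standard object costs a single pass, which is the case the approach is tuned for.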
Making it practical
• Needs to work with depth culling of lights
• Hierarchical stencil on 360 and PS3
Depth culling of lights
• Assuming viewer is outside the light volume
• Start with stencil = 0
• Render back faces of light volume
– Increment stencil; no color output
• Render front faces
– Only where stencil = 0; write color
• Render back faces
– Clear stencil; no color output
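The three passes can be simulated per pixel on the CPU to see which pixels end up lit. This is only a sketch: the slides do not say which depth test each pass uses, so a less-than test is assumed throughout, and the function and argument names are made up for illustration.

```python
def lit_pixels(scene_depth, front_depth, back_depth):
    """Simulate the light-volume passes over a row of pixels.

    scene_depth: depth already in the depth buffer, per pixel.
    front_depth / back_depth: depth of the light volume's front and back
    faces per pixel, or None where the volume does not cover the pixel.
    Assumes every pass uses a less-than depth test (not stated in the slides).
    """
    count = len(scene_depth)
    stencil = [0] * count
    lit = [False] * count

    # Pass 1: back faces, increment stencil, no color output.
    for i in range(count):
        if back_depth[i] is not None and back_depth[i] < scene_depth[i]:
            stencil[i] += 1

    # Pass 2: front faces, only where stencil == 0, write color.
    for i in range(count):
        if (front_depth[i] is not None
                and front_depth[i] < scene_depth[i]
                and stencil[i] == 0):
            lit[i] = True

    # Pass 3: back faces again, clear stencil, no color output.
    for i in range(count):
        if back_depth[i] is not None and back_depth[i] < scene_depth[i]:
            stencil[i] = 0

    return lit, stencil

lit, stencil = lit_pixels(
    scene_depth=[5, 5, 5, 5],
    front_depth=[None, 2, 2, 6],
    back_depth=[None, 4, 8, 9],
)
```

Under that assumption, only the pixel whose scene depth lies between the front and back faces is shaded, and the stencil is back to zero for the next light.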
Culling with BRDFs
• Pack the culling bit and BRDF together
• Use masks to read/affect required parts
• Assuming 8 supported BRDFs:
Bits 7-4: Unused | Bits 3-1: BRDF ID | Bit 0: Culling bit
culling_mask = 0x01
brdf_mask = 0x0E
brdf_shift = 1
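A minimal sketch of packing and unpacking with these masks. The constants come from the slide; the helper function names are hypothetical.

```python
CULLING_MASK = 0x01
BRDF_MASK = 0x0E
BRDF_SHIFT = 1

def pack_stencil(brdf_id, culling_bit=0):
    """Combine a BRDF ID (0..7) and the culling bit into one stencil value."""
    if not 0 <= brdf_id < 8:
        raise ValueError("only 8 BRDF IDs fit in a 3-bit field")
    return ((brdf_id << BRDF_SHIFT) & BRDF_MASK) | (culling_bit & CULLING_MASK)

def brdf_id_of(stencil):
    """Extract the BRDF ID, ignoring the culling bit and the upper bits."""
    return (stencil & BRDF_MASK) >> BRDF_SHIFT

def culling_bit_of(stencil):
    """Extract the culling bit."""
    return stencil & CULLING_MASK

packed = pack_stencil(5, culling_bit=1)
```

Because both readers apply a mask first, whatever sits in bits 7-4 does not affect the decoded values.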
Light volume rendering passes
Handling miscellaneous data in stencil
• Stencil value may contain extra data
– Used in earlier / later rendering passes
– Need to ignore it somehow
– Stencil read mask?
• Doesn’t work with the 360’s hi-stencil
Bits 7-4: Garbage | Bits 3-1: BRDF ID | Bit 0: Culling bit
Stencil operation
Read → Read mask → Comparison → Operator → Write ( masked ) → Result
Comparison: <, <=, >, ==, …
Operator: ++, --, = 0, …
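The stencil pipeline above can be modeled in a few lines. This is a simplified sketch for an 8-bit stencil: it follows the common convention of comparing the masked reference against the masked stored value, only models the operation applied on a pass, and uses made-up names for the function and the operation strings.

```python
import operator

COMPARE = {
    "<": operator.lt, "<=": operator.le, ">": operator.gt,
    ">=": operator.ge, "==": operator.eq, "!=": operator.ne,
}

def stencil_test_and_update(stored, ref, read_mask, compare, op, write_mask):
    """Return (passed, new_stored) for one fragment against an 8-bit stencil."""
    # Read -> read mask -> comparison.
    passed = COMPARE[compare](ref & read_mask, stored & read_mask)
    if not passed:
        return False, stored  # simplification: a failed test keeps the value

    # Operator applied to the full stored value.
    if op == "keep":
        updated = stored
    elif op == "incr":
        updated = min(stored + 1, 0xFF)
    elif op == "decr":
        updated = max(stored - 1, 0)
    elif op == "zero":
        updated = 0
    else:
        raise ValueError("unknown stencil op: " + op)

    # Masked write: only bits set in write_mask may change.
    return True, (stored & ~write_mask & 0xFF) | (updated & write_mask)

# Upper nibble holds unrelated data, BRDF ID 3 in bits 3-1, culling bit clear.
stored = 0xA0 | (3 << 1)
hit = stencil_test_and_update(stored, 3 << 1, 0x0E, "==", "incr", 0x01)
miss = stencil_test_and_update(stored, 2 << 1, 0x0E, "==", "incr", 0x01)
```

The read mask is what lets the comparison ignore the upper nibble here, which is exactly the step that is not available everywhere in the hierarchical case.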
Hierarchical stencil operation
Read → Read mask → Comparison → Operator → Write ( masked ) → Result
Comparison: <, <=, >, ==, …
Operator: ++, --, = 0, …
Hi-stencil comparison:
– PS3: <, <=, >, ==, …
– 360: ==, !=
Spanner in the works
Breaks if stencil contains garbage we can’t mask out
Handling stencil garbage
• Can’t do it in a non-destructive manner
– Take off and nuke the entire site from orbit
– It’s the only way to be sure
• Extra cleaning pass?
– Don’t want to pay for it!
• Do it as we go!
• Save your stencil if you need it
– Sorry for calling it garbage :`(
– We were already restoring it later on the 360
– Don’t need to destroy it on the PS3, use a read mask!
Performance
Platform            1 BRDF    2 BRDFs                 3 BRDFs
360 ( branching )   1.85 ms   2.1 ms  ( + 0.25 ms )   2.35 ms ( + 0.5 ms )
360 ( stencil )     1.85 ms   1.99 ms ( + 0.14 ms )   2.13 ms ( + 0.28 ms )
PS3 ( branching )   1.9 ms    2.48 ms ( + 0.58 ms )   2.8 ms  ( + 0.9 ms )
PS3 ( stencil )     1.9 ms    2.13 ms ( + 0.23 ms )   2.31 ms ( + 0.41 ms )

For each BRDF
Platform   Initial setup   Mask     Render        Cleanup
360        0.03 ms         0.1 ms   >= 0.036 ms   0.022 ms
PS3        0.03 ms         0.1 ms   >= 0.06 ms    0.14 ms
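To make the comparison concrete, the difference between the branching and stencil rows can be worked out directly from the table. The dictionary layout and the `saving_ms` helper are just one way to organize the numbers; the timings themselves are the ones reported above.

```python
# Timings in milliseconds for 1, 2 and 3 BRDFs, copied from the table.
TIMINGS_MS = {
    ("360", "branching"): (1.85, 2.10, 2.35),
    ("360", "stencil"): (1.85, 1.99, 2.13),
    ("PS3", "branching"): (1.90, 2.48, 2.80),
    ("PS3", "stencil"): (1.90, 2.13, 2.31),
}

def saving_ms(platform, brdf_count):
    """Time saved by the stencil approach over branching for a BRDF count."""
    index = brdf_count - 1
    branching = TIMINGS_MS[(platform, "branching")][index]
    stencil = TIMINGS_MS[(platform, "stencil")][index]
    return branching - stencil

savings = {
    (platform, count): round(saving_ms(platform, count), 2)
    for platform in ("360", "PS3")
    for count in (2, 3)
}
```

With three BRDFs this comes to 0.22 ms on the 360 and 0.49 ms on the PS3, and the single-BRDF case is unchanged on both.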
Multi-pass light rendering – final notes
• No change in single-BRDF rendering
– Use your madly optimized routines
• No need for a ‘default’ shading model
– It’s just our use case
– As long as you efficiently find influenced BRDFs
• Flush your hi-stencil
• Tiny lights? Try branching instead.
– Performance figures only from huge lights!
– With tiny lights, hi-stencil juggling becomes inefficient
Lighting alpha objects in deferred rendering engines
• Classic solutions:
– Forward rendering.
– CPU based, one light probe per each object.
• Our solution:
– GPU based.
– More than one light probe.
– Calculate a lightmap for each object each frame.
– Used for objects and particle systems.
– Fits perfectly into a deferred rendering pipeline.
Our solution for alpha objects
• Object space map:
– Every pixel stores the local space position on the object’s surface
Image attribution: Zephyris at en.wikipedia.
Our solution for alpha objects
• For each object:
– Use baked positions as light probes
• Transform object space map into world space
– Render lights, reusing deferred shading code
– Accumulate into lightmap
– Render object in alpha pass using lightmap
Image attribution: Zephyris at en.wikipedia.
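The per-object steps can be sketched on the CPU as follows. This is an illustrative stand-in, not the engine's code: the 3x4 row-major matrix layout, the scalar lightmap, the `(position, intensity, radius)` light tuples and the linear falloff are all assumptions made to keep the example small.

```python
import math

def transform_point(matrix, point):
    """Apply a 3x4 row-major affine matrix to an (x, y, z) point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)

def point_light(probe, light_pos, intensity, radius):
    """Illustrative falloff: linear to zero at the light radius."""
    distance = math.dist(probe, light_pos)
    return intensity * max(0.0, 1.0 - distance / radius)

def build_lightmap(object_space_map, object_to_world, lights):
    """One lightmap texel per baked position: transform, then sum all lights."""
    lightmap = []
    for local_position in object_space_map:
        world_position = transform_point(object_to_world, local_position)
        lightmap.append(sum(point_light(world_position, *light) for light in lights))
    return lightmap

object_space_map = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
object_to_world = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]]  # translate +2 in x
lights = [((2.0, 0.0, 0.0), 1.0, 4.0)]  # (position, intensity, radius)
lightmap = build_lightmap(object_space_map, object_to_world, lights)
```

The alpha pass would then sample this lightmap with the same coordinates used to bake the object space map.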
Our solution for particle systems
• Camera oriented quad fitted around and centered in the particle system.
Our solution for particle systems
• For each particle system:
– Allocate a texture quad and fill it with interpolated positions as light probes
– Render lights and accumulate into lightmap
– Render particles in alpha pass, converting from clip space to lightmap coordinates.
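One plausible form of the clip space to lightmap conversion is sketched below. It assumes the quad's extent is known as a rectangle in normalized device coordinates; the function and parameter names are hypothetical, and the slides do not give the exact mapping.

```python
def clip_to_lightmap_uv(clip_position, quad_ndc_min, quad_ndc_max):
    """Map a particle's clip-space position to [0, 1] lightmap coordinates.

    clip_position: (x, y, z, w) after projection.
    quad_ndc_min / quad_ndc_max: (x, y) corners of the camera-oriented quad
    in normalized device coordinates (assumed to be available per system).
    """
    x, y, _, w = clip_position
    ndc_x, ndc_y = x / w, y / w  # perspective divide
    u = (ndc_x - quad_ndc_min[0]) / (quad_ndc_max[0] - quad_ndc_min[0])
    # Whether v needs flipping depends on the graphics API's texture origin.
    v = (ndc_y - quad_ndc_min[1]) / (quad_ndc_max[1] - quad_ndc_min[1])
    return u, v

uv = clip_to_lightmap_uv((1.0, 0.0, 0.5, 2.0), (0.0, -0.5), (1.0, 0.5))
```

When the lightmap lives in a region of a larger texture, the resulting coordinates would additionally be scaled and offset into that region.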
Implementation details
• For performance reasons we pack all position maps into a single texture.
• Every entity that needs alpha lighting will allocate and use a region inside the texture.
[Figure: packed textures, labeled "World space position" and "Light maps"]
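Region allocation inside the shared texture could look like the simple shelf allocator below. The talk does not describe its allocator, so the strategy, the `AtlasAllocator` class and its method names are assumptions for illustration only.

```python
class AtlasAllocator:
    """Shelf allocator: fills rows left to right, then starts a new row."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.cursor_x = 0      # next free x on the current shelf
        self.shelf_y = 0       # top of the current shelf
        self.shelf_height = 0  # tallest region placed on the current shelf

    def allocate(self, region_width, region_height):
        """Return (x, y, width, height) for a new region, or None if it does not fit."""
        if region_width > self.width or region_height > self.height:
            return None
        if self.cursor_x + region_width > self.width:
            # Current shelf is full: open a new one below it.
            self.shelf_y += self.shelf_height
            self.cursor_x = 0
            self.shelf_height = 0
        if self.shelf_y + region_height > self.height:
            return None
        region = (self.cursor_x, self.shelf_y, region_width, region_height)
        self.cursor_x += region_width
        self.shelf_height = max(self.shelf_height, region_height)
        return region

atlas = AtlasAllocator(64, 64)
regions = [atlas.allocate(32, 16), atlas.allocate(32, 32),
           atlas.allocate(16, 16), atlas.allocate(64, 64)]
```

Since the lightmaps are recalculated each frame, an allocator like this can simply be reset at the start of every frame instead of tracking frees.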
Integration with deferred rendering
• Deferred rendering:
– Fill G-Buffer ( solid pass )
– Render lights
– Render alpha
• Our solution:
– Fill G-Buffer ( solid pass )
– Fill world space light probes position map
– Render lights
– Render lights using world space light probes map as input and calculate alpha lightmap
– Render alpha using alpha lightmap
Improvements
• Calculate a second texture with light direction information.
• Other parameterizations for particle systems:
– Dust (one pixel per mote).
– Ribbons (a line of pixels).
• 3D volume slices for particle systems.
– Allocate a region for every slice
– Adds depth to the lighting solution.
3D volume slices
[Figure: stack of maps from "Slice 0 map" to "Slice n map"]
For each slice we allocate one region
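One way the slices might be used when shading a particle is to pick the two nearest slices by depth and blend between them. The blending itself is an assumption (the slides only say a region is allocated per slice), as are the function name and the evenly spaced slices.

```python
def slice_blend(depth, near, far, slice_count):
    """Return (lower_slice, upper_slice, weight) for a particle depth.

    Slices are assumed to be evenly spaced between near and far; weight is
    the blend factor toward upper_slice.
    """
    if slice_count < 2:
        return 0, 0, 0.0
    t = (depth - near) / (far - near)
    t = min(max(t, 0.0), 1.0)  # clamp particles outside the volume
    position = t * (slice_count - 1)
    lower = min(int(position), slice_count - 2)
    return lower, lower + 1, position - lower

middle = slice_blend(2.5, near=0.0, far=10.0, slice_count=3)
```

Each returned index would then select that slice's region in the packed lightmap texture.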
Demo
http://www.creative-assembly.com/jobs/
WE ARE HIRING!
Questions?
Jose Luis Sanchez Bonet
jose.sanchez@creative-assembly.com
Tomasz Stachowiak
tomasz.stachowiak@creative-assembly.com
twitter: h3r2tic

Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Develop2012 deferred sanchez_stachowiak

  • 2. Solving Some Common Problems in a Modern Deferred Rendering Engine Jose Luis Sanchez Bonet Tomasz Stachowiak /* @h3r2tic */
  • 3. Deferred rendering – pros and cons • Pros ( some ) – Very scalable – No shader permutation explosion – G-Buffer useful in other techniques • SSAO, SRAA, decals, … • Cons ( some ) – Difficult to use multiple shading models – Does not handle translucent geometry • Some variants do, but may be impractical
  • 4. Reflectance models • The BRDF defines the look of a surface – Bidirectional Reflectance Distribution Function: $L_o = L_e + \int_{\Omega} L_i \cdot f_r \cdot \cos\theta \, d\omega$ • Typically games use just one ( Blinn-Phong ) – Simple, but inaccurate • Very important in physically based rendering – Want more: Oren-Nayar, Kajiya-Kay, Penner, Cook-Torrance, …
  • 5. BRDFs vs. rendering • Forward rendering – Material shader directly evaluates the BRDF • Trivial • Deferred rendering – Light shaders decoupled from materials – No obvious solution Material G-Buffer Light BRDF ???
  • 6. BRDFs vs. deferred – branching? • Read shading model ID in the lighting shader, branch • Might be the way to go on next-gen • Expensive on current consoles – Tax for branches never taken • Don’t want to pay it for every light
    Three different BRDFs, only one used ( branch always yields the first one ):
    Platform | 1 BRDF  | 2 BRDFs            | 3 BRDFs
    360      | 1.85 ms | 2.1 ms (+0.25 ms)  | 2.35 ms (+0.5 ms)
    PS3      | 1.9 ms  | 2.48 ms (+0.58 ms) | 2.8 ms (+0.9 ms)
  • 7. BRDFs vs. deferred – LUTs? • Pre-calculate BRDF look-up tables • Might be shippable enough – See: S.T.A.L.K.E.R. • Limited control over parameters – Roughness – Anisotropy, etc. • BRDFs highly dimensional – Isotropic with roughness control → 3D LUT
  • 8. BRDFs vs. deferred – our approach • One default BRDF – Others a relatively rare case • Shading model ID in stencil • Multi-pass light rendering • Mask out parts of the scene in each pass
  • 9. Multi-pass – tax avoidance • For each light – Find all affected BRDFs – Render the light volume once for each model • Analogous to multi-pass forward rendering! • Store bounding volumes of objects with non-standard BRDFs – Intersect with light volumes
  • 10. Making it practical • Needs to work with depth culling of lights • Hierarchical stencil on 360 and PS3
  • 11. Depth culling of lights • Assuming viewer is outside the light volume • Render back faces of light volume – Increment stencil; no color output • Render front faces – Only where stencil = 0; write color • Render back faces – Clear stencil; no color output
  • 12. Depth culling of lights • Assuming viewer is outside the light volume • Start with stencil = 0 • Render front faces – Only where stencil = 0; write color • Render back faces – Clear stencil; no color output
  • 13. Depth culling of lights • Assuming viewer is outside the light volume • Start with stencil = 0 • Render back faces of light volume – Increment stencil; no color output • Render back faces – Clear stencil; no color output
  • 14. Culling with BRDFs • Pack the culling bit and BRDF together • Use masks to read/affect required parts • Assuming 8 supported BRDFs:
    Bits 7-4: unused | Bits 3-1: BRDF ID | Bit 0: culling bit
    culling_mask = 0x01, brdf_mask = 0x0E, brdf_shift = 1
  • 16. Handling miscellaneous data in stencil • Stencil value may contain extra data – Used in earlier / later rendering passes – Need to ignore it somehow – Stencil read mask? • Doesn’t work with the 360’s hi-stencil
    Bits 7-4: garbage | Bits 3-1: BRDF ID | Bit 0: culling bit
  • 17. Stencil operation Read Read mask Comparison Operator Write (masked ) Result <, <=, >, ==, … ++, --, = 0, …
  • 18. Hierarchical stencil operation Read Read mask Comparison Operator Write (masked ) Result Hi-stencil comparison Hi-stencil Hi-stencil comparison Hi-stencil <, <=, >, ==, … ==, != PS3 360 <, <=, >, ==, … ++, --, = 0, …
  • 19. Spanner in the works Breaks if stencil contains garbage we can’t mask out
  • 20. Handling stencil garbage • Can’t do it in a non-destructive manner – Take off and nuke the entire site from orbit – It’s the only way to be sure • Extra cleaning pass? – Don’t want to pay for it! • Do it as we go! • Save your stencil if you need it – Sorry for calling it garbage :`( – We were already restoring it later on the 360 – Don’t need to destroy it on the PS3, use a read mask!
  • 21. Performance
    Platform        | 1 BRDF  | 2 BRDFs            | 3 BRDFs
    360 (branching) | 1.85 ms | 2.1 ms (+0.25 ms)  | 2.35 ms (+0.5 ms)
    360 (stencil)   | 1.85 ms | 1.99 ms (+0.14 ms) | 2.13 ms (+0.28 ms)
    PS3 (branching) | 1.9 ms  | 2.48 ms (+0.58 ms) | 2.8 ms (+0.9 ms)
    PS3 (stencil)   | 1.9 ms  | 2.13 ms (+0.23 ms) | 2.31 ms (+0.41 ms)

    Platform | Initial setup | Mask (for each BRDF) | Render (for each BRDF) | Cleanup
    360      | 0.03 ms       | 0.1 ms               | >= 0.036 ms            | 0.022 ms
    PS3      | 0.03 ms       | 0.1 ms               | >= 0.06 ms             | 0.14 ms
  • 22. Multi-pass light rendering – final notes • No change in single-BRDF rendering – Use your madly optimized routines • No need for a ‘default’ shading model – It’s just our use case – As long as you efficiently find influenced BRDFs • Flush your hi-stencil • Tiny lights? Try branching instead. – Performance figures only from huge lights! – With tiny lights, hi-stencil juggling becomes inefficient
  • 23. Lighting alpha objects in deferred rendering engines • Classic solutions: – Forward rendering. – CPU based, one light probe per each object. • Our solution: – GPU based. – More than one light probe. – Calculate a lightmap for each object each frame. – Used for objects and particle systems. – Fits perfectly into a deferred rendering pipeline.
  • 24. • Object space map: Our solution for alpha objects Every pixel stores the local space position on the object’s surface Image attribution: Zephyris at en.wikipedia.
  • 25. • For each object: – Use baked positions as light probes • Transform object space map into world space – Render lights, reusing deferred shading code – Accumulate into lightmap – Render object in alpha pass using lightmap Our solution for alpha objects Image attribution: Zephyris at en.wikipedia.
  • 26. • Camera oriented quad fitted around and centered in the particle system. Our solution for particle systems
  • 27. • For each particle system: – Allocate a texture quad and fill it with interpolated positions as light probes – Render lights and accumulate into lightmap – Render particles in alpha pass, converting from clip space to lightmap coordinates. Our solution for particle systems
  • 28. Implementation details • For performance reasons we pack all position maps to a single texture. • Every entity that needs alpha lighting will allocate and use a region inside the texture. World space position Light maps
  • 29. Integration with deferred rendering Fill G-Buffer (Solid pass) Render Lights Render Alpha Deferred rendering
  • 30. Our solution Fill G-Buffer (Solid pass) Fill world space light probes position map Render lights Render lights using world space light probes map as input and calculate alpha lightmap Render alpha using alpha lightmap
  • 31. Improvements • Calculate a second texture with light direction information. • Other parameterizations for particle systems: – Dust (one pixel per mote). – Ribbons (a line of pixels). • 3D volume slices for particle systems. – Allocate a region for every slice – Adds depth to the lighting solution.
  • 32. 3D volume slices Slice n map Slice 0 map . . . For each slice we allocate one region
  • 33. Demo
  • 34. Demo
  • 36. Questions? Jose Luis Sanchez Bonet jose.sanchez@creative-assembly.com Tomasz Stachowiak tomasz.stachowiak@creative-assembly.com twitter: h3r2tic

Editor's Notes

  1. Good news everyone! I'm Tom, this is Jose, and we're going to talk about deferred rendering. The focus is on current generation consoles, but the presented techniques can be used on just about any platform, so we hope anyone can benefit from them.
  2. Deferred rendering has been very popular lately due to its scalability, and because it plays nicely with other techniques, which can reuse the G-Buffer. At the same time, it doesn’t come without downsides. We are going to cover two of them in this presentation, and propose the custom solutions we've developed for our upcoming console title. The two problems are: handling many shading models, and rendering translucent geometry. I'm going to cover the former in the first half of the presentation, and then Jose will talk about translucency.
  3. In graphics rendering, we use simple mathematical formulas to approximate the look of some classes of surfaces. The most commonly used model, or Bidirectional Reflectance Distribution Function, is Blinn-Phong, which works reasonably well as an approximation of some dielectrics. It is used due to its simplicity, but for the same reason, it cannot reproduce the look of many surfaces accurately. You might want to render your plastics with Blinn, skin with Eric Penner's pre-integrated model, hair with Kajiya-Kay or Marschner, brushed metal with anisotropic Ward, and so on. The visual properties of these surfaces are vastly different, and cannot be covered with just a single, simple mathematical model.
  4. So how do we render with multiple shading models? If you use forward rendering, this is trivial. Because the BRDF is combined with the material in the same shader, it just works. However, in deferred rendering, we need to evaluate the reflectance model in the light shader, and these don't bear any connection to material shaders that the BRDFs are associated with.
  5. One approach would be to branch in the light shader. That is, the solid pass emits an identifier of the BRDF into the G-Buffer. The light shader reads it and branches upon its value. This solution might be viable on next-gen hardware, but it doesn't fare well on current consoles. In a small test case we did with a single full-screen light, branching brought the rendering cost from 1.85 to 2.1 milliseconds for just a single extra shading model. This is the tax you pay for not even taking the branch. That is, our test case is synthetic, and only the first BRDF is ever used. And it gets much worse on the PS3, which doesn't even have control flow instructions.
  6. One could also tabulate the BRDF data, and sample it using a combination of an ID and some geometric parameters, such as N dot L and N dot H. One such approach has been used successfully in the game S.T.A.L.K.E.R., so it might be enough for your title as well. The trouble is, BRDFs are highly dimensional functions, so tabulation might be difficult; for example, the data for an isotropic BRDF parameterized by surface roughness is already at least a 3-dimensional function. /* See Michael Ashikhmin’s “Distribution-Based BRDFs”. */
  7. We decided to use a single reflectance model for most of our scene geometry, and then special-case rendering in rare instances, such as skin and hair. The core of the idea is pretty simple: when rendering the solid pass, we store the ID of the shading model in the stencil buffer. Then in the lighting pass, we draw light geometry once for each BRDF, using the ID as a mask.
  8. Implemented like this, the idea would be inefficient. We would be multiplying the number of draw calls and shader switches by the number of supported BRDFs. However, when rendering a light, we can detect which BRDFs it can potentially use, and skip any extra processing. If you think about it, this is a very similar idea to multi-pass forward rendering. Here's a scene with two objects, which use different shading models. We have two lights influencing them. The light on the left is interesting, in that it will affect just one object, hence only one BRDF. Therefore it doesn't need to run the multi-BRDF code path at all. To accomplish this optimization, we store the bounding boxes of all objects which use non-standard shading models. During light rendering, we intersect light volumes with these bounds, and conservatively find a list of all BRDFs which a light may potentially touch. /* We could in theory detect which BRDFs a light may affect and only use dynamic branching there, but then we either always pay a high cost, or we would need to create lots of shader permutations, for example “shading model A and B, A with B and C, A with C, B with C, et cetera.” For this reason we are just going to use multi-pass rendering. */
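
The conservative classification described in this note could be sketched as follows. This is a minimal illustration, not the engine's code: it assumes spherical light volumes and axis-aligned bounding boxes, and the names `AABB`, `SpecialObject`, `DEFAULT_BRDF` and `affected_brdfs` are hypothetical. It also assumes the default shading model is always treated as potentially touched.

```python
from dataclasses import dataclass

DEFAULT_BRDF = 0  # hypothetical ID of the default shading model


@dataclass
class AABB:
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner


@dataclass
class SpecialObject:
    bounds: AABB
    brdf_id: int  # non-default shading model used by this object


def sphere_intersects_aabb(center, radius, box):
    # Squared distance from the sphere centre to the closest point of the box.
    d2 = 0.0
    for c, lo, hi in zip(center, box.lo, box.hi):
        nearest = min(max(c, lo), hi)
        d2 += (c - nearest) ** 2
    return d2 <= radius * radius


def affected_brdfs(light_center, light_radius, special_objects):
    # Assumption: the default model is always considered affected.
    ids = {DEFAULT_BRDF}
    for obj in special_objects:
        if sphere_intersects_aabb(light_center, light_radius, obj.bounds):
            ids.add(obj.brdf_id)
    return sorted(ids)
```

A light whose result contains a single ID can take the unchanged single-BRDF path; only lights with several IDs need the multi-pass path.
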
  9. Now, there are two more bits to the algorithm, needed to make it practical. Firstly, it needs to work with the commonly used stencil and depth-based light culling trick. Secondly, it must play well with the hierarchical stencil buffer. Let's start with a quick reminder of depth culling for lights. Consider a surface rendered into the G-buffer, and three lights. The left one is completely in front of the surface, so cannot influence it. The right one is behind the surface, so cannot influence it either. Only the middle one contributes to lighting, because its volume intersects the surface in the G-buffer.
  10. So how do we accomplish that using stencil testing? Let's consider the case when the viewer is outside of the light's volume. The stencil is initially clear.
  11. We start by writing the value of one into the stencil by using the back faces of the light volume. This will result in the stencil being set where the light is completely in front of the surface. Therefore we only want to render where the stencil is zero …
  12. … and we do so using the front faces with stencil testing enabled. Note that this is a vanilla version of the algorithm, and you may be using an optimized one.
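
As a per-pixel illustration of the vanilla culling steps above, the following sketch simulates the stencil logic in software. It assumes a "less than" depth test where smaller depth means closer to the viewer, and a viewer outside the light volume; the function name `light_covers_pixel` is hypothetical.

```python
def light_covers_pixel(surface_depth, front_depth, back_depth):
    """Simulate the three passes for one pixel of one light volume.

    surface_depth: depth already stored in the G-buffer
    front_depth / back_depth: depths of the light volume's front and back faces
    """
    stencil = 0  # the stencil starts out clear

    # Pass 1: back faces, no colour output. Where the back face passes the
    # depth test, the whole volume is in front of the surface: mark it.
    if back_depth < surface_depth:
        stencil += 1

    # Pass 2: front faces, colour output only where stencil == 0 and the
    # front face passes the depth test (volume not entirely behind the surface).
    lit = stencil == 0 and front_depth < surface_depth

    # Pass 3: back faces again, clearing the stencil for the next light.
    stencil = 0
    return lit
```

Only the case where the surface lies between the front and back faces of the volume ends up shaded.
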
  13. Extending this idea to selectively rendering multiple shading models, we need to pack both the culling bit and the shading model identifier in the stencil buffer. Because stencil testing supports read and write masks, we can act upon and affect portions of the stencil value. Here’s a sample layout assuming a maximum of eight supported BRDFs. Note that the BRDF bits can be placed at any offset in the byte.
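
The sample layout can be expressed with a few bit operations. The mask and shift values below come from the slide (`culling_mask = 0x01`, `brdf_mask = 0x0E`, `brdf_shift = 1`); the helper names are hypothetical and the code is only a sketch of the packing, not of any GPU API.

```python
CULLING_MASK = 0x01  # bit 0: culling bit
BRDF_MASK = 0x0E     # bits 3-1: BRDF ID (eight shading models)
BRDF_SHIFT = 1


def pack_stencil(brdf_id, culled=False):
    # Combine a shading model ID and the culling bit into one stencil byte.
    if not 0 <= brdf_id < 8:
        raise ValueError("this layout supports BRDF IDs 0..7 only")
    value = (brdf_id << BRDF_SHIFT) & BRDF_MASK
    if culled:
        value |= CULLING_MASK
    return value


def brdf_id_of(stencil):
    # Read back the shading model ID, ignoring all other bits.
    return (stencil & BRDF_MASK) >> BRDF_SHIFT


def is_culled(stencil):
    return bool(stencil & CULLING_MASK)
```
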
  14. OK, let's get down to the actual rendering passes. First of all, we will be using the hierarchical stencil buffer, so that the GPU may reject entire rasterization tiles. This is where the bulk of our time savings actually comes from, as the regular stencil test happens after you’ve already paid the pixel shading cost. We start the same as with just depth-based culling. We draw back faces of the light volume with the stencil set to Increment. Once again, this marks areas we don’t want to render to. At this point, we have determined the list of BRDFs the light can potentially influence. For each of them, we create a hi-stencil mask first, then we render the volume again with the actual shader. Creating the mask is fairly cheap, so even though we render twice, we typically save time by hi-stencil culling the expensive shader. Finally, the last step restores the affected stencil area, so that the next light can render.
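
The pass sequence for one light can be summarized in a short sketch. It only lists the passes in the order this note describes; the function name `light_passes` and the wording of each entry are illustrative, and no actual render-state setup is shown.

```python
def light_passes(brdf_ids):
    """Return the ordered list of passes for a light touching brdf_ids."""
    passes = ["back faces: increment stencil, no colour (marks culled areas)"]
    for brdf in brdf_ids:
        # A cheap pass builds the hi-stencil mask for this shading model...
        passes.append(f"BRDF {brdf}: create hi-stencil mask")
        # ...so that the expensive light shader is rejected per tile.
        passes.append(f"BRDF {brdf}: render light volume with its shader")
    passes.append("restore affected stencil area for the next light")
    return passes
```

A light touching `n` shading models therefore issues `2 * n + 2` passes under this scheme, which is why detecting the affected BRDFs up front matters.
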
  15. We have been assuming that the stencil values are clear of any unrelated data. Yet in practice, they will carry multiple meanings, and rendering engines will have their own 'magic' stencil encodings. /* One example would be using a single bit of stencil to mask out dynamic objects from being affected by deferred decals. */ Unfortunately, such extra bits turn out to be garbage from the point of view of the proposed algorithm, and we cannot simply ignore them with read masks, at least not on the Xbox 360.
  16. Let's take a look at the stencil operation to figure out why. The GPU first reads the original value and applies a user-specified mask to it. This value is then compared with a reference constant using one of several predicates, such as Greater, Less, Equal, et cetera. Depending on the result of this comparison as well as the depth test, an operator may be applied to the stencil value, such as incrementing or zeroing it. Finally, the resulting value is written back into the stencil buffer.
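
The regular stencil operation described here can be modelled in a few lines. This is a simplified software model under stated assumptions: a single operator applied only when both the comparison and the depth test pass, an 8-bit stencil, and the masked stored value on the left-hand side of the comparison. Real APIs expose separate operators for the fail cases and may order the comparison operands differently; `stencil_op` and the table names are hypothetical.

```python
import operator

COMPARE = {
    "less": operator.lt,
    "lequal": operator.le,
    "greater": operator.gt,
    "equal": operator.eq,
    "notequal": operator.ne,
    "always": lambda a, b: True,
}

OPS = {
    "keep": lambda v, ref: v,
    "zero": lambda v, ref: 0,
    "replace": lambda v, ref: ref,
    "incr": lambda v, ref: min(v + 1, 255),
    "decr": lambda v, ref: max(v - 1, 0),
}


def stencil_op(stored, ref, read_mask, func, op, write_mask, depth_pass=True):
    # 1. Read the stored value and apply the read mask.
    masked = stored & read_mask
    # 2. Compare against the reference constant.
    passed = COMPARE[func](masked, ref) and depth_pass
    if not passed:
        return passed, stored
    # 3. Apply the operator, then 4. write back only the bits in write_mask.
    new = OPS[op](stored, ref) & 0xFF
    result = (stored & ~write_mask & 0xFF) | (new & write_mask)
    return passed, result
```

With a read mask of `0x0E`, a stored value of `0xA5` compares equal to a reference of `0x04`; without the mask (`0xFF`) the same comparison fails because of the upper bits.
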
  17. How does the hi-stencil integrate with this pipeline? On the PS3, we get to specify a mask and a comparison function for the hi-stencil test, very much like in the regular one. This means that we can ignore any bits we don’t like. The 360 however, takes its hi-stencil value from the completely opposite end of the pipe, from the final value written back to the stencil buffer. Furthermore, we may only specify a trivial equality or inequality predicate against a reference value.
  18. Unfortunately, this throws a spanner in our hi-stencil mask creation. Since the 360 can only create its mask from the full value, any garbage bits will cause the corresponding tiles to be culled.
  19. Well, if we can’t ignore the extra bits, I say we nuke them from orbit. The easiest way would be to have a separate pass which cleans the stencil buffer, removing any garbage bits. On the other hand, we don't want to add any more fixed cost steps into our rendering, especially at the end of the current hardware generation, when everyone is battling for the last microseconds. Fortunately, we can clear out the garbage bits as we go. When creating the hi-stencil mask, we will set the regular stencil operator to do so, while skipping over the ID of the shading model. Now, I've been calling these "garbage bits", but you may have good reasons for extra information in your stencil buffer. Chances are that on the 360 you restore them at a later point anyway, due to limited EDRAM resources. On the PS3 we don't need to clobber the bits at all, due to its more flexible hi-stencil buffer creation process.
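
Why the garbage bits matter on the 360, and why zeroing them while building the mask helps, can be illustrated with a toy model. It assumes, as a simplification of this note, that the 360's hi-stencil test is a plain equality check against the full value written back, and that the mask-creation pass zeroes every bit outside the BRDF ID. The function names are hypothetical and the real hardware behaviour is more involved.

```python
BRDF_MASK = 0x0E
BRDF_SHIFT = 1


def hi_stencil_360_matches(written_value, ref):
    # Assumed model: equality against the full written value, no read mask.
    return written_value == ref


def mask_pass_write(stored, keep_mask=BRDF_MASK):
    # Assumed model of the mask-creation pass: a zeroing operator with a
    # write mask that skips the BRDF ID bits, so only the ID survives.
    return stored & keep_mask


brdf_ref = 3 << BRDF_SHIFT      # reference value selecting BRDF 3
dirty = 0xA0 | brdf_ref         # same ID, plus unrelated upper bits

matches_dirty = hi_stencil_360_matches(dirty, brdf_ref)
matches_clean = hi_stencil_360_matches(mask_pass_write(dirty), brdf_ref)
```

In this model `matches_dirty` is false even though the pixel uses BRDF 3, while `matches_clean` is true once the unrelated bits have been cleared as part of the mask pass.
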
  20. How’s performance then? Let’s recall the figures from one of the first slides. With the dynamic branching approach, we had to pay a pretty hefty tax, especially on the PS3. How does the proposed algorithm stack up against that? We still pay a slight tax, but only for the lights which render with multiple shading models, and only for the models we actually use. This is especially important if we support many shading models, but each light affects very few on average. Then we end up paying a considerably smaller cost for the extra shading models.
  21. That's pretty much the whole algorithm. I'd just like to emphasize a few extra points. First of all, nothing is changed for single-BRDF rendering! If you conservatively figure out that a light only influences geometry with a single reflectance model, you can reuse your old light rendering code! Secondly, you don't really need to have a 'default' shading model for the whole level. As long as you can quickly classify which BRDFs a light can potentially influence, then you're golden. Next, remember to flush your hi-stencil when changing the reference value or the comparison function, otherwise you’ll get false culling. Finally, we’ve only given performance figures for lights taking up a significant portion of the screen. When a light is small and rendered with multiple BRDFs, the cost will be dominated by hi-stencil juggling. It might be worthwhile to use dynamic branching in the light shader below a certain size threshold. Okay, that’s all for me, now Jose is going to tell you about lighting translucent geometry!
  22. Classic solutions. (1) Forward rendering: the best-quality option, since it calculates lighting for every pixel. Problems: it is too expensive, especially if a lot of alpha layers are used; shader permutations explode if you want to support a lot of light types and combinations; and it is completely different from deferred rendering, so we would need to support two pipelines. We could use Forward+, but we are targeting the Xbox 360 and PS3. (2) Lighting calculated on the CPU, with one light probe (intensity, SH, etc.) for each object. Problems: a single probe means the same light configuration for the whole object, which causes a lot of issues with big objects, and it is not easy to support shadow-map-casting lights. Our solution is GPU based and uses more than one light probe per object, with quality between the two classic solutions. It is just a lightmap for every object, updated every frame, with lighting calculated in object space. It can be used for objects and particle systems, and it fits perfectly into a deferred engine pipeline.
  23. For each alpha object we create a distribution of light probes on the surface. Artists define a UV channel with an unwrapped version of the object (as for lightmaps). During export we create a texture, which we call the object space map; its size depends on the surface area of the object. Every pixel in the object space map represents a local space position on the surface of the object.
  24. We convert every probe from the object space map to world space using the world matrix of the object. Render lights: we render a pass with a shader very similar to the one used in deferred rendering. The input is a texture with world space light probe positions (calculated from the object space map) and the output is a lightmap with the light that the light probes receive. It can reuse a lot of functions from the deferred rendering code, like shadow mapping. Render object in alpha pass using lightmap: we use the UV channel for the object space map to access the lightmap.
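
A CPU-side sketch of these two steps may help make the data flow concrete. It assumes a row-major 4x4 world matrix applied to column vectors, a point light with a hypothetical linear falloff, and scalar light intensity; the real implementation runs on the GPU and reuses the deferred light shaders, so every name here is illustrative.

```python
def transform_point(world, p):
    # world: 4x4 row-major matrix as nested lists; p: (x, y, z) in object space.
    x, y, z = p
    return tuple(
        world[r][0] * x + world[r][1] * y + world[r][2] * z + world[r][3]
        for r in range(3)
    )


def accumulate_point_light(world_probes, light_pos, light_radius, intensity, lightmap):
    # Add this light's contribution to each probe's lightmap texel.
    for i, probe in enumerate(world_probes):
        dist = sum((a - b) ** 2 for a, b in zip(probe, light_pos)) ** 0.5
        falloff = max(0.0, 1.0 - dist / light_radius)  # assumed linear falloff
        lightmap[i] += intensity * falloff
    return lightmap


# Usage: two object-space probes, an object translated by (10, 0, 0).
world = [[1, 0, 0, 10], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
probes = [transform_point(world, p) for p in [(1.0, 2.0, 3.0), (3.0, 2.0, 3.0)]]
lightmap = accumulate_point_light(probes, (11.0, 2.0, 3.0), 4.0, 2.0, [0.0, 0.0])
```

Calling `accumulate_point_light` once per light accumulates into the same list, mirroring how each light pass adds into the lightmap.
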
  25. For each particle system we need a set of light probes distributed around it. As the particles are camera oriented, we are going to use a camera oriented quad fitted around and centered in the particle system. It is not a perfect representation, but it is really fast and it is simple, and it works in practice. If the particle system intersects the camera frustum we can just fit our quad, so we can improve the quality when the particle system fills the screen.
  26. To recover the lighting information we just use a 2D matrix that converts from clip space coordinates (our quad is oriented in screen space) to lightmap texture space.
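
One way such a matrix could be built is sketched below. It assumes the quad is an axis-aligned rectangle in clip space and that the particle system owns a rectangular region of the lightmap given in UV coordinates; any vertical flip required by a particular graphics API is ignored. The function names are hypothetical.

```python
def clip_to_lightmap_matrix(quad_min, quad_max, region_min, region_max):
    # Build a 2x3 affine matrix M so that (u, v) = M * (x, y, 1).
    sx = (region_max[0] - region_min[0]) / (quad_max[0] - quad_min[0])
    sy = (region_max[1] - region_min[1]) / (quad_max[1] - quad_min[1])
    return [
        [sx, 0.0, region_min[0] - sx * quad_min[0]],
        [0.0, sy, region_min[1] - sy * quad_min[1]],
    ]


def clip_to_uv(m, xy):
    x, y = xy
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


# Usage: a quad covering the whole clip-space square, mapped to one atlas region.
m = clip_to_lightmap_matrix((-1.0, -1.0), (1.0, 1.0), (0.25, 0.5), (0.5, 0.75))
```

Each particle vertex would then transform its clip-space position with this matrix to obtain the coordinates used to sample the lightmap.
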
  27. The two solutions have a lot in common. For performance reasons we pack all the world space position maps into a single texture, so we can calculate the lighting of all the objects at the same time. There are two GPU textures. Input: the world space position texture, similar to the G-buffer in deferred rendering. Output: the accumulated light. Every object that needs to calculate lighting allocates a region inside the textures and fills it with the positions of its light probes. The size of the region can depend on the screen space size of the object, to improve performance and scalability. To improve performance further, we check every light against every object on the CPU, so we only apply the light shader to the regions that are inside the light.
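
The notes do not say how regions are allocated inside the shared texture. As one possible approach, the sketch below uses a simple shelf allocator, which is an assumption of this example rather than the presenters' method; `ShelfAllocator` and its interface are hypothetical.

```python
class ShelfAllocator:
    """Pack rectangles left to right in rows ("shelves") of a texture atlas."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.x = 0        # next free column on the current shelf
        self.y = 0        # top row of the current shelf
        self.shelf_h = 0  # height of the tallest region on the current shelf

    def allocate(self, w, h):
        # Returns (x, y, w, h) in texels, or None if the region does not fit.
        x, y, shelf_h = self.x, self.y, self.shelf_h
        if x + w > self.width:
            # The current shelf is full: start a new one below it.
            x, y, shelf_h = 0, y + shelf_h, 0
        if x + w > self.width or y + h > self.height:
            return None
        self.x, self.y, self.shelf_h = x + w, y, max(shelf_h, h)
        return (x, y, w, h)
```

Resetting the allocator every frame and requesting smaller regions for objects that are small on screen would fit the per-frame update described above.
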
  28. Deferred rendering engine: fill G-buffer, render lights, render alpha.
  29. Added two extra steps in our deferred engine.
  30. Having light direction information will allow bump mapping, occlusion and scattering effects.
  31. For performance reasons, we can disable 3D volume slices when the particle system is far from the camera.
  32. Thanks to Howard Rayner, our technical artist and vfx magician for preparing these demos!