SlideShare a Scribd company logo
A 2.5D CULLING FOR FORWARD+
AMD
Takahiro Harada
2 | A 2.5D culling for Forward+ | Takahiro Harada
AGENDA
 Forward+
–  Forward, Deferred, Forward+
–  Problem description
 2.5D culling
 Results
3 | A 2.5D culling for Forward+ | Takahiro Harada
FORWARD+
4 | A 2.5D culling for Forward+ | Takahiro Harada
REAL-TIME SOLUTION COMPARISON
 Rendering equation
 Forward
 Deferred
 Forward+
5 | A 2.5D culling for Forward+ | Takahiro Harada
FORWARD RENDERING PIPELINE
 Depth prepass
–  Fills z buffer
 Prevent overdraw for shading
 Shading
–  Geometry is rendered
–  Pixel shader
 Iterate through light list set for each object
 Evaluates materials for the lights
6 | A 2.5D culling for Forward+ | Takahiro Harada
FORWARD+ RENDERING PIPELINE
 Depth prepass
–  Fills z buffer
 Prevent overdraw for shading
 Used for pixel position reconstruction for light culling
 Light culling
–  Culls light per tile basis
–  Input: z buffer, light buffer
–  Output: light list per tile
 Shading
–  Geometry is rendered
–  Pixel shader
 Iterate through light list calculated in light culling
 Evaluates materials for the lights
1
2
3
[1,2,3][1] [2,3]
7 | A 2.5D culling for Forward+ | Takahiro Harada
CREATING A FRUSTUM FOR A TILE
 An edge @SS == A plane @VS
 A tile (4 edges) @SS == 4 planes @VS
–  Open frustum (no bound in Z direction)
 Max and min Z is used to cap
8 | A 2.5D culling for Forward+ | Takahiro Harada
LONG FRUSTUM
 Screen space culling is not always sufficient
–  Create a frustum from max and min depth values
–  Edge of objects
–  Captures a lot of unnecessary lights
9 | A 2.5D culling for Forward+ | Takahiro Harada
LONG FRUSTUM
 Screen space culling is not always sufficient
–  Create a frustum from max and min depth values
–  Edge of objects
–  Captures a lot of unnecessary lights
0 lights
25 lights
50 lights
10 | A 2.5D culling for Forward+ | Takahiro Harada
GET WORSE IN A COMPLEX SCENE
0 lights
100 lights
200 lights
11 | A 2.5D culling for Forward+ | Takahiro Harada
QUESTION
 Want to reduce false positives
 Can we improve the culling without adding much overhead?
–  Computation time, memory
–  Culling itself is an optimization
–  Spending a lot of resources for it does not make sense
 Using a 3D grid is a natural extension
–  Uses too much memory
12 | A 2.5D culling for Forward+ | Takahiro Harada
2.5D CULLING
13 | A 2.5D culling for Forward+ | Takahiro Harada
2.5D CULLING
 Additional memory usage
–  0B global memory
–  4B local memory per WG (can compress more if you want)
 Additional computation complexity
–  A few bit and arithmetic instructions
–  A few lines of codes for light culling
–  No changes for other stages
 Additional runtime overhead
–  < 10% compared to the original light culling
14 | A 2.5D culling for Forward+ | Takahiro Harada
IDEA
 Split frustum in z direction
–  Uniform split for a frustum
–  Varying split among frustums
(a) (b)
15 | A 2.5D culling for Forward+ | Takahiro Harada
FRUSTUM CONSTRUCTION
 Calculate depth bound
–  max and min values of depth
 Split depth direction into 32 cells
–  Min value and cell size
 Flag occupied cell
 A 32bit depth mask per work group
A tile
16 | A 2.5D culling for Forward+ | Takahiro Harada
FRUSTUM CONSTRUCTION
 Calculate depth bound
–  max and min values of depth
 Split depth direction into 32 cells
–  Min value and cell size
 Flag occupied cell
 A 32bit depth mask per work group
7 7 7 7
7 7 7 2
7 7 2 1
7 2 1 0
Depth mask = 11100001
A tile
0 1 2 3 4 5 6 7
17 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
 If a light overlaps to the frustum
–  Calculate depth mask for the light
–  Check overlap using the depth mask of the frustum
 Depth mask & Depth mask
–  11100001 & 00011000 = 00000000
Depth mask = 11100001
Depth mask = 00011000
18 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
 If a light overlaps to the frustum
–  Calculate depth mask for the light
–  Check overlap using the depth mask of the frustum
 Depth mask & Depth mask
–  11100001 & 00110000 = 00100000
Depth mask = 11100001
Depth mask = 00110000
19 | A 2.5D culling for Forward+ | Takahiro Harada
CODE
Original With 2.5D culling
20 | A 2.5D culling for Forward+ | Takahiro Harada
RESULTS
21 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
22 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING + 2.5D CULLING
23 | A 2.5D culling for Forward+ | Takahiro Harada
COMPARISON
1"
10"
100"
1000"
10000"
1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17" 18" 19" 20" 21" 22" 23"
Number'of'*les'
Number'of'lights'(x10)'
With"2.5D"culling"
Without"2.5D"culling"
220 lights/frustum -> 120 lights/frustum
24 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
25 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING + 2.5D CULLING
26 | A 2.5D culling for Forward+ | Takahiro Harada
COMPARISON
1"
10"
100"
1000"
10000"
1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17"
Number'of'*les'
Number'of'lights'(x10)'
With"2.5D"culling"
Without"2.5D"culling"
27 | A 2.5D culling for Forward+ | Takahiro Harada
PERFORMANCE
0"
1"
2"
3"
4"
5"
6"
1024" 2048" 3072" 4096"
!me$(ms)$
Number$of$lights$
Forward+"w."frustum"culling"
Forward+"w."2.5D"
Deferred"
28 | A 2.5D culling for Forward+ | Takahiro Harada
CONCLUSION
 Proposed 2.5D culling which
–  Additional memory usage
  0B global memory
  4B local memory per WG (can compress more if you want)
–  Additional compute complexity
  3 lines of pseudo codes for light culling
  No changes for other stages
–  Additional runtime overhead
  < 10% compared to the original light culling
 Showed that 2.5D culling reduces false positives

More Related Content

What's hot

Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)
Tiago Sousa
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
Electronic Arts / DICE
 
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
Electronic Arts / DICE
 
Forward+ (EUROGRAPHICS 2012)
Forward+ (EUROGRAPHICS 2012)Forward+ (EUROGRAPHICS 2012)
Forward+ (EUROGRAPHICS 2012)Takahiro Harada
 
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time RaytracingCEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
Electronic Arts / DICE
 
Physically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in FrostbitePhysically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in Frostbite
Electronic Arts / DICE
 
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunFive Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Electronic Arts / DICE
 
Lighting the City of Glass
Lighting the City of GlassLighting the City of Glass
Lighting the City of Glass
Electronic Arts / DICE
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Tiago Sousa
 
Lighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow FallLighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow Fall
Guerrilla
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility buffer
Wolfgang Engel
 
Advancements in-tiled-rendering
Advancements in-tiled-renderingAdvancements in-tiled-rendering
Advancements in-tiled-rendering
mistercteam
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1
Ki Hyunwoo
 
Advanced Scenegraph Rendering Pipeline
Advanced Scenegraph Rendering PipelineAdvanced Scenegraph Rendering Pipeline
Advanced Scenegraph Rendering Pipeline
Narann29
 
Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3
Electronic Arts / DICE
 
Z Buffer Optimizations
Z Buffer OptimizationsZ Buffer Optimizations
Z Buffer Optimizationspjcozzi
 
A Real-time Radiosity Architecture
A Real-time Radiosity ArchitectureA Real-time Radiosity Architecture
A Real-time Radiosity Architecture
Electronic Arts / DICE
 
Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1
Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1
Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1
Ki Hyunwoo
 
Rendering Tech of Space Marine
Rendering Tech of Space MarineRendering Tech of Space Marine
Rendering Tech of Space Marine
Pope Kim
 

What's hot (20)

Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
 
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
 
Forward+ (EUROGRAPHICS 2012)
Forward+ (EUROGRAPHICS 2012)Forward+ (EUROGRAPHICS 2012)
Forward+ (EUROGRAPHICS 2012)
 
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time RaytracingCEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
 
Light prepass
Light prepassLight prepass
Light prepass
 
Physically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in FrostbitePhysically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in Frostbite
 
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunFive Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
 
Lighting the City of Glass
Lighting the City of GlassLighting the City of Glass
Lighting the City of Glass
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)
 
Lighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow FallLighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow Fall
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility buffer
 
Advancements in-tiled-rendering
Advancements in-tiled-renderingAdvancements in-tiled-rendering
Advancements in-tiled-rendering
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1
 
Advanced Scenegraph Rendering Pipeline
Advanced Scenegraph Rendering PipelineAdvanced Scenegraph Rendering Pipeline
Advanced Scenegraph Rendering Pipeline
 
Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3
 
Z Buffer Optimizations
Z Buffer OptimizationsZ Buffer Optimizations
Z Buffer Optimizations
 
A Real-time Radiosity Architecture
A Real-time Radiosity ArchitectureA Real-time Radiosity Architecture
A Real-time Radiosity Architecture
 
Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1
Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1
Unreal Summit 2016 Seoul Lighting the Planetary World of Project A1
 
Rendering Tech of Space Marine
Rendering Tech of Space MarineRendering Tech of Space Marine
Rendering Tech of Space Marine
 

Similar to A 2.5D Culling for Forward+ (SIGGRAPH ASIA 2012)

new_age_graphics_android_x86
new_age_graphics_android_x86new_age_graphics_android_x86
new_age_graphics_android_x86Droidcon Berlin
 
SIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA TuringSIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA Turing
Electronic Arts / DICE
 
Linear Programming
Linear ProgrammingLinear Programming
Linear Programming
syed_shahzad786
 
Linear Programming
Linear ProgrammingLinear Programming
Linear Programming
syed_shahzad786
 
Linear Programming
Linear ProgrammingLinear Programming
Linear Programming
syed_shahzad786
 
Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Matthias Trapp
 

Similar to A 2.5D Culling for Forward+ (SIGGRAPH ASIA 2012) (6)

new_age_graphics_android_x86
new_age_graphics_android_x86new_age_graphics_android_x86
new_age_graphics_android_x86
 
SIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA TuringSIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA Turing
 
Linear Programming
Linear ProgrammingLinear Programming
Linear Programming
 
Linear Programming
Linear ProgrammingLinear Programming
Linear Programming
 
Linear Programming
Linear ProgrammingLinear Programming
Linear Programming
 
Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)
 

More from Takahiro Harada

201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019
Takahiro Harada
 
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
Takahiro Harada
 
Introduction to OpenCL (Japanese, OpenCLの基礎)
Introduction to OpenCL (Japanese, OpenCLの基礎)Introduction to OpenCL (Japanese, OpenCLの基礎)
Introduction to OpenCL (Japanese, OpenCLの基礎)
Takahiro Harada
 
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
Takahiro Harada
 
確率的ライトカリング 理論と実装 (CEDEC2016)
確率的ライトカリング 理論と実装 (CEDEC2016)確率的ライトカリング 理論と実装 (CEDEC2016)
確率的ライトカリング 理論と実装 (CEDEC2016)
Takahiro Harada
 
Introducing Firerender for 3DS Max
Introducing Firerender for 3DS MaxIntroducing Firerender for 3DS Max
Introducing Firerender for 3DS Max
Takahiro Harada
 
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
Takahiro Harada
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Takahiro Harada
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Takahiro Harada
 
Foveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUsFoveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUs
Takahiro Harada
 
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Takahiro Harada
 
Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)Takahiro Harada
 
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...Takahiro Harada
 
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)Takahiro Harada
 
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)Takahiro Harada
 
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)Takahiro Harada
 

More from Takahiro Harada (16)

201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019
 
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
 
Introduction to OpenCL (Japanese, OpenCLの基礎)
Introduction to OpenCL (Japanese, OpenCLの基礎)Introduction to OpenCL (Japanese, OpenCLの基礎)
Introduction to OpenCL (Japanese, OpenCLの基礎)
 
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
 
確率的ライトカリング 理論と実装 (CEDEC2016)
確率的ライトカリング 理論と実装 (CEDEC2016)確率的ライトカリング 理論と実装 (CEDEC2016)
確率的ライトカリング 理論と実装 (CEDEC2016)
 
Introducing Firerender for 3DS Max
Introducing Firerender for 3DS MaxIntroducing Firerender for 3DS Max
Introducing Firerender for 3DS Max
 
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
 
Foveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUsFoveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUs
 
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
 
Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)
 
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
 
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
 
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
 
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
 

Recently uploaded

Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 

Recently uploaded (20)

Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 

A 2.5D Culling for Forward+ (SIGGRAPH ASIA 2012)

  • 1. A 2.5D CULLING FOR FORWARD+ AMD Takahiro Harada
  • 2. 2 | A 2.5D culling for Forward+ | Takahiro Harada AGENDA  Forward+ –  Forward, Deferred, Forward+ –  Problem description  2.5D culling  Results
  • 3. 3 | A 2.5D culling for Forward+ | Takahiro Harada FORWARD+
  • 4. 4 | A 2.5D culling for Forward+ | Takahiro Harada REAL-TIME SOLUTION COMPARISON  Rendering equation  Forward  Deferred  Forward+
  • 5. 5 | A 2.5D culling for Forward+ | Takahiro Harada FORWARD RENDERING PIPELINE  Depth prepass –  Fills z buffer  Prevent overdraw for shading  Shading –  Geometry is rendered –  Pixel shader  Iterate through light list set for each object  Evaluates materials for the lights
  • 6. 6 | A 2.5D culling for Forward+ | Takahiro Harada FORWARD+ RENDERING PIPELINE  Depth prepass –  Fills z buffer  Prevent overdraw for shading  Used for pixel position reconstruction for light culling  Light culling –  Culls light per tile basis –  Input: z buffer, light buffer –  Output: light list per tile  Shading –  Geometry is rendered –  Pixel shader  Iterate through light list calculated in light culling  Evaluates materials for the lights 1 2 3 [1,2,3][1] [2,3]
  • 7. 7 | A 2.5D culling for Forward+ | Takahiro Harada CREATING A FRUSTUM FOR A TILE  An edge @SS == A plane @VS  A tile (4 edges) @SS == 4 planes @VS –  Open frustum (no bound in Z direction)  Max and min Z is used to cap
  • 8. 8 | A 2.5D culling for Forward+ | Takahiro Harada LONG FRUSTUM  Screen space culling is not always sufficient –  Create a frustum from max and min depth values –  Edge of objects –  Captures a lot of unnecessary lights
  • 9. 9 | A 2.5D culling for Forward+ | Takahiro Harada LONG FRUSTUM  Screen space culling is not always sufficient –  Create a frustum from max and min depth values –  Edge of objects –  Captures a lot of unnecessary lights 0 lights 25 lights 50 lights
  • 10. 10 | A 2.5D culling for Forward+ | Takahiro Harada GET WORSE IN A COMPLEX SCENE 0 lights 100 lights 200 lights
  • 11. 11 | A 2.5D culling for Forward+ | Takahiro Harada QUESTION  Want to reduce false positives  Can we improve the culling without adding much overhead? –  Computation time, memory –  Culling itself is an optimization –  Spending a lot of resources for it does not make sense  Using a 3D grid is a natural extension –  Uses too much memory
  • 12. 12 | A 2.5D culling for Forward+ | Takahiro Harada 2.5D CULLING
  • 13. 13 | A 2.5D culling for Forward+ | Takahiro Harada 2.5D CULLING  Additional memory usage –  0B global memory –  4B local memory per WG (can compress more if you want)  Additional computation complexity –  A few bit and arithmetic instructions –  A few lines of codes for light culling –  No changes for other stages  Additional runtime overhead –  < 10% compared to the original light culling
  • 14. 14 | A 2.5D culling for Forward+ | Takahiro Harada IDEA  Split frustum in z direction –  Uniform split for a frustum –  Varying split among frustums (a) (b)
  • 15. 15 | A 2.5D culling for Forward+ | Takahiro Harada FRUSTUM CONSTRUCTION  Calculate depth bound –  max and min values of depth  Split depth direction into 32 cells –  Min value and cell size  Flag occupied cell  A 32bit depth mask per work group A tile
  • 16. 16 | A 2.5D culling for Forward+ | Takahiro Harada FRUSTUM CONSTRUCTION  Calculate depth bound –  max and min values of depth  Split depth direction into 32 cells –  Min value and cell size  Flag occupied cell  A 32bit depth mask per work group 7 7 7 7 7 7 7 2 7 7 2 1 7 2 1 0 Depth mask = 11100001 A tile 0 1 2 3 4 5 6 7
  • 17. 17 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING  If a light overlaps to the frustum –  Calculate depth mask for the light –  Check overlap using the depth mask of the frustum  Depth mask & Depth mask –  11100001 & 00011000 = 00000000 Depth mask = 11100001 Depth mask = 00011000
  • 18. 18 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING  If a light overlaps to the frustum –  Calculate depth mask for the light –  Check overlap using the depth mask of the frustum  Depth mask & Depth mask –  11100001 & 00110000 = 00100000 Depth mask = 11100001 Depth mask = 00110000
  • 19. 19 | A 2.5D culling for Forward+ | Takahiro Harada CODE Original With 2.5D culling
  • 20. 20 | A 2.5D culling for Forward+ | Takahiro Harada RESULTS
  • 21. 21 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING
  • 22. 22 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING + 2.5D CULLING
  • 23. 23 | A 2.5D culling for Forward+ | Takahiro Harada COMPARISON 1" 10" 100" 1000" 10000" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17" 18" 19" 20" 21" 22" 23" Number'of'*les' Number'of'lights'(x10)' With"2.5D"culling" Without"2.5D"culling" 220 lights/frustum -> 120 lights/frustum
  • 24. 24 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING
  • 25. 25 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING + 2.5D CULLING
  • 26. 26 | A 2.5D culling for Forward+ | Takahiro Harada COMPARISON 1" 10" 100" 1000" 10000" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17" Number'of'*les' Number'of'lights'(x10)' With"2.5D"culling" Without"2.5D"culling"
  • 27. 27 | A 2.5D culling for Forward+ | Takahiro Harada PERFORMANCE 0" 1" 2" 3" 4" 5" 6" 1024" 2048" 3072" 4096" !me$(ms)$ Number$of$lights$ Forward+"w."frustum"culling" Forward+"w."2.5D" Deferred"
  • 28. 28 | A 2.5D culling for Forward+ | Takahiro Harada CONCLUSION  Proposed 2.5D culling which –  Additional memory usage   0B global memory   4B local memory per WG (can compress more if you want) –  Additional compute complexity   3 lines of pseudo codes for light culling   No changes for other stages –  Additional runtime overhead   < 10% compared to the original light culling  Showed that 2.5D culling reduces false positives