/* * SPU Assisted Rendering. */ Steven Tovey & Stephen McAuley Graphics Programmers, Bizarre Creations Ltd. steven.tovey@bizarrecreations.com stephen.mcauley@bizarrecreations.com http://www.bizarrecreations.com
/* Welcome! */
/* Agenda */
Car Lighting
Part II (w/ Stephen McAuley): Fragment Shading Parallelisation Case Study Pre-pass Lighting on SPUs
/* * Part I w/ Steven Tovey  */ SPU Acceleration of Car Rendering in Blur
/* What is SPU AR? I */
Why do this?
Free up RSX™ to do other things.
Enable otherwise unfeasible techniques.
Optimise rendering.
/* What is SPU AR? II */
Synchronisation.
Optimising SPU modules.
Memory considerations:
Local store
Resource allocation
Etc.
/* Case Study: Cars I */
Totally GPU-based.
2xVTF (volume & 2D) for damage.
Large amount of work in vertex shader, making cars in Blur heavily vertex-bound.
All lighting in pixel shader.
/* Case Study: Cars II */
/* Case Study: Cars III */
/* Case Study: Cars IV */
/* Case Study: Cars IV */
/* Case Study: Cars VI */
Increase rendering speed of cars.
Maintain same quality.
/* Damage: Solution */
Large parts are SPU based.
On demand.
Sync-free.
Deferred.
Work split between GPU/SPU.
/* Damage: Data I */
Read-only car vertex data.
Shared between similar cars.
SPU-modified damage vertex data.
Per instance.
One-to-one mapping of vertices.
Control points:
Crude approximation of volume preservation.
Dent/scratch blend levels.
/* Damage: Data II */ Stream0 Stream1 Position SPU_Position Normal UV0 SPU_Normal UV1 PosOffset NormalOffset ControlPoints AO
/* Damage: Data III */
If the vertex format is exactly 16 bytes, a vertex can be changed atomically from the SPU.
If you can live with the odd vertex being wrong for a frame, this could be a huge win!
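To make the 16-byte constraint concrete, here is a minimal sketch in C. The `DamageVertex` layout (four 16-bit position components plus four 16-bit normal components) and the helper names are hypothetical, chosen only for illustration; they are not the format used in Blur. The sketch shows the two properties the slide relies on: the vertex is exactly one quadword, and it sits on a 16-byte boundary. `memcpy` stands in for what would be a single quadword store or DMA on real hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte damage vertex (assumed layout, not from the talk):
 * four 16-bit position components followed by four 16-bit normal components,
 * all stored as raw half-float bits. */
typedef struct {
    uint16_t position[4]; /* x, y, z, padding */
    uint16_t normal[4];   /* x, y, z, padding */
} DamageVertex;

/* The trick only applies when one vertex is exactly one quadword. */
_Static_assert(sizeof(DamageVertex) == 16, "vertex must be exactly 16 bytes");

/* Returns non-zero if ptr sits on a 16-byte boundary. */
static int is_quadword_aligned(const void *ptr)
{
    return ((uintptr_t)ptr & 15u) == 0;
}

/* Overwrite one vertex with a single 16-byte copy.
 * memcpy is a stand-in for a quadword store or DMA on real hardware. */
static void replace_vertex(DamageVertex *dst, const DamageVertex *src)
{
    assert(is_quadword_aligned(dst));
    memcpy(dst, src, sizeof *dst);
}
```

Because the whole vertex is replaced in one step, a reader of the buffer sees either the old vertex or the new one, never a mixture of the two.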
/* Damage: Data IV */
[Diagram labels: SPU, RSX, Local, Main, Write-only Vertices, Read-only Vertices]
/* Damage: Events */
Note: There is no link to player health; damage is purely superficial.
[Diagram labels: Game Code, Impact (x6)]
/* Damage: Data V */
[Diagram labels: Impact (x6), Constants, SPU, GPU, Write-only Vertices*, Read-only Vertices*. * w.r.t. SPU]
/* Damage: Data VI */
[Diagram labels: SPU, GPU, Write-only Vertices*, Read-only Vertices*. * w.r.t. SPU]
/* Damage: Control */
Kick off SPU tasks
[Timeline diagram labels: PPU Damage (x2), Other Work(1) (x2), Vertex Work (x4), Flag, Other Work(2)]
/* Damage: SPU I */
We favour si style for simplicity and ease.
/* de-code into IEEE754-ish 32bit float (meh): */
qword sign_bit      = si_and(result, sign_bit_mask);
sign_bit            = si_shli(sign_bit, 0x10);          /* move 16 bits into correct place. */
qword significand   = si_and(result, mant_bit_mask);
significand         = si_shli(significand, 0xd);
qword is_zero_mask  = si_cgti(significand, 0x0);        /* all bits set if non-zero. */
expo_bias           = si_and(is_zero_mask, expo_bias);
qword exponent_bias = si_a(significand, expo_bias);     /* move expo up range, 0x07800000=>0x3f800000. */
exponent_bias       = si_or(exponent_bias, sign_bit);
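For readers without an SPU toolchain, the same decode can be sketched in scalar C. This is an illustrative reference, not the authors' code: the function name is hypothetical, and the constants `0x8000`, `0x7fff` and `0x38000000` are assumed values for `sign_bit_mask`, `mant_bit_mask` and `expo_bias`, chosen so that they reproduce the `0x07800000=>0x3f800000` comment in the snippet above. Like the original, it is only "IEEE754-ish": denormals, infinities and NaNs get no special handling.

```c
#include <stdint.h>

/* Widen a 16-bit half-float bit pattern to a 32-bit float bit pattern.
 * Mask and bias constants are assumptions that match the slide's
 * 0x07800000 => 0x3f800000 comment. Denormals, infinities and NaNs are
 * not handled specially. */
static uint32_t half_to_float_bits(uint16_t half)
{
    /* Move the sign from bit 15 to bit 31. */
    uint32_t sign = ((uint32_t)half & 0x8000u) << 16;

    /* Move exponent and mantissa into the float's field positions. */
    uint32_t magnitude = ((uint32_t)half & 0x7fffu) << 13;

    /* Re-bias the exponent by (127 - 15) << 23, except for zero,
     * which must stay zero. */
    uint32_t bias = (magnitude != 0u) ? 0x38000000u : 0u;

    return (magnitude + bias) | sign;
}
```

As a check, the half value `0x3c00` (1.0) becomes `0x07800000` after the shift and `0x3f800000` after the bias is added, which is 1.0 as a 32-bit float.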
/* Damage: SPU II */
GPU version relied on bilinear filtering of volume texture to smooth damage.
Filtering on SPU is a bit of a pain.
Working out which events affect which vertices?
/* Damage: SPU III */
Two-stage x-form:
1. Get data in volume texture-ish format.
2. Apply x-form to all vertices.
/* Damage: SPU IV */
Software bilinear filtering.
Some interesting instructions in ISA will help here.
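What software bilinear filtering involves can be sketched in plain scalar C. This is a minimal illustration under stated assumptions, not the SPU implementation from the talk: it filters a 2D grid of floats rather than a volume, coordinates are given in texel units, edges are clamped, and all names are hypothetical. Extending it to a volume would add one more interpolation along the third axis.

```c
#include <assert.h>
#include <stddef.h>

/* Linear interpolation between a and b by t in [0, 1]. */
static float lerp_f32(float a, float b, float t)
{
    return a + (b - a) * t;
}

/* Clamp value to the range [lo, hi]. */
static float clamp_f32(float value, float lo, float hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* Sample a row-major width x height grid at (u, v), in texel units.
 * Coordinates outside the grid are clamped to the nearest edge. */
static float sample_bilinear(const float *texels, size_t width, size_t height,
                             float u, float v)
{
    assert(texels != NULL && width > 0 && height > 0);

    u = clamp_f32(u, 0.0f, (float)(width - 1));
    v = clamp_f32(v, 0.0f, (float)(height - 1));

    /* Integer corners of the cell containing (u, v). */
    size_t x0 = (size_t)u;
    size_t y0 = (size_t)v;
    size_t x1 = (x0 + 1 < width) ? x0 + 1 : x0;
    size_t y1 = (y0 + 1 < height) ? y0 + 1 : y0;

    /* Fractional position inside the cell. */
    float fx = u - (float)x0;
    float fy = v - (float)y0;

    /* Two horizontal interpolations, then one vertical. */
    float top = lerp_f32(texels[y0 * width + x0], texels[y0 * width + x1], fx);
    float bottom = lerp_f32(texels[y1 * width + x0], texels[y1 * width + x1], fx);
    return lerp_f32(top, bottom, fy);
}
```

Sampling the centre of a 2x2 grid holding 0, 1, 2, 3 gives the average of the four texels, 1.5.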
/* Damage: Lessons I */
Process in 16KB chunks.
Multi-buffer input and output.
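The chunking and multi-buffering pattern can be sketched as follows. This is a sequential stand-in written under assumptions, not real SPU code: `memcpy` takes the place of asynchronous DMA gets and puts, `process_chunk` is a hypothetical placeholder for the vertex work, and every name is illustrative. The 16 KB chunk size matches the largest single MFC DMA transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_BYTES (16u * 1024u) /* largest single MFC DMA transfer */

/* Number of chunks needed to cover total_bytes. */
static size_t chunk_count(size_t total_bytes)
{
    return (total_bytes + CHUNK_BYTES - 1) / CHUNK_BYTES;
}

/* Size of chunk number index; only the last chunk can be short. */
static size_t chunk_bytes_at(size_t total_bytes, size_t index)
{
    size_t remaining = total_bytes - index * CHUNK_BYTES;
    return remaining < CHUNK_BYTES ? remaining : CHUNK_BYTES;
}

/* Hypothetical placeholder for the real per-chunk work: adds 1 to each byte. */
static void process_chunk(uint8_t *data, size_t bytes)
{
    for (size_t i = 0; i < bytes; ++i)
        data[i] = (uint8_t)(data[i] + 1u);
}

/* Stream src to dst through two local buffers. memcpy stands in for DMA;
 * on an SPU the fetch of chunk i + 1 would be in flight while chunk i is
 * processed, and the earlier put from a buffer must finish before that
 * buffer is refilled. */
static void stream_chunks(uint8_t *dst, const uint8_t *src, size_t total_bytes)
{
    static uint8_t local[2][CHUNK_BYTES];
    size_t count = chunk_count(total_bytes);

    assert(total_bytes == 0 || (dst != NULL && src != NULL));
    if (count == 0)
        return;

    /* Prime the pipeline with the first chunk. */
    memcpy(local[0], src, chunk_bytes_at(total_bytes, 0));

    for (size_t i = 0; i < count; ++i) {
        size_t current = i & 1u;
        size_t bytes = chunk_bytes_at(total_bytes, i);

        /* Start "fetching" the next chunk into the other buffer. */
        if (i + 1 < count)
            memcpy(local[current ^ 1u], src + (i + 1) * CHUNK_BYTES,
                   chunk_bytes_at(total_bytes, i + 1));

        process_chunk(local[current], bytes);

        /* "Put" the finished chunk back to main memory. */
        memcpy(dst + i * CHUNK_BYTES, local[current], bytes);
    }
}
```

The two buffers cost 32 KB of local store in this sketch; adding separate output buffers or a third buffer trades more local store for fewer stalls.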
/* Damage: Lessons II */
x y z w      x x x x
x y z w      y y y y
x y z w      z z z z
x y z w      w w w w
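The diagram shows four `x y z w` vectors being rearranged so that each register holds four values of one component. A minimal scalar sketch of that array-of-structures to structure-of-arrays transpose follows; the type and function names are hypothetical, and real SPU code would do the shuffle with vector instructions rather than a loop.

```c
#include <stddef.h>

/* One attribute in array-of-structures (AoS) form. */
typedef struct {
    float x, y, z, w;
} Vec4;

/* Four attributes in structure-of-arrays (SoA) form:
 * each array maps onto one 4-wide SIMD register. */
typedef struct {
    float x[4], y[4], z[4], w[4];
} Vec4x4;

/* Transpose four AoS vectors into one SoA block. */
static Vec4x4 aos_to_soa(const Vec4 in[4])
{
    Vec4x4 out;
    for (size_t i = 0; i < 4; ++i) {
        out.x[i] = in[i].x;
        out.y[i] = in[i].y;
        out.z[i] = in[i].z;
        out.w[i] = in[i].w;
    }
    return out;
}

/* Four dot products at once. Every operation works lane by lane,
 * so no horizontal adds across a register are needed. */
static void dot4(const Vec4x4 *a, const Vec4x4 *b, float out[4])
{
    for (size_t i = 0; i < 4; ++i)
        out[i] = a->x[i] * b->x[i] + a->y[i] * b->y[i]
               + a->z[i] * b->z[i] + a->w[i] * b->w[i];
}
```

The payoff of the SoA layout is visible in `dot4`: each multiply and add touches the same lane of every input, which is the shape SIMD hardware handles best.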
/* Damage: Lessons III */
We added some of the per-vertex lighting calculations for brake lights, for example.
/* Damage: Results */
SPU-generated cube maps.
40 in total (accounting for double buffer).
8x8 per face.
Deferred.
Work split between GPU/SPU.
Cars are lit with a mixture of things:
SH (world + dynamic)
Cube map lighting

Editor's Notes

  1. Local space position of vertex
  2. Normal
  3. Couple of sets of UVs.
  4. Morph-targets.
  5. Control point index into an array of curves.
  6. Spherical Harmonic
  7. Damage data... Position offset, Normal offset, scratch and dent levels.
  8. Explain why 16KB chunks, MFC max per transfer.
  9. Don’t want lumpiness if parallel read/write.
  10. Don’t want lumpiness if parallel read/write.
  11. Rim lighting here.
  12. Tyres use low-power specular.
  13. Brake lights.
  14. Used on alloys for low-power specular.
  15. Used for the scratch lighting, again for low-power specular.
  16. We need to look at the pipeline of the graphics card to work out how we can move more of our GPU work onto the SPUs. Two main areas we can insert data – either through vertices at the top, or textures at the fragment stage. Sadly, we can’t hook into the rasteriser, which would be ace.
  17. Of course, these look-up textures end up being screen-space look-up textures, which means some sort of deferred rendering…
  18. I have a problem with forward rendering. I think most people traditionally design their engine this way, especially on 360 and PC. But all the work is done in the fragment shader, so when you port to the PS3 with a slower fragment shader unit, your whole game runs slower. Although you can use EDGE to speed up your vertex processing and your post processing, they both only step around the core of the issue that you’re fragment shader bound and there’s no easy way of solving it.
  19. We found a light pre-pass renderer suited our goals pretty well. It’s a halfway house between traditional and deferred rendering.
  20. We render a rear-view mirror, cube map reflections for the cars and planar reflections for the road and water in addition to the pre-pass and main views. Multi-threaded rendering helps a lot!
  21. Deferring by a frame isn’t ideal. Either you just use the previous frame’s lighting buffer for the next frame, with obvious artefacts (especially if you’re doing a racing game like us), or you have to add a frame of latency. I don’t think adding frames of latency is ideal, especially for cross-platform games. If you add a frame of latency on the PS3, are you going to do the same on the 360? If you’re not, then game play could be different between both platforms. I’m not saying this is something I’d never do, I think in lots of circumstances you’ll have to. But avoid it where you can, and this is one instance.
  22. If we wanted to take this further for future projects, we could add shadow maps in at the start of our pipeline, then do an exponential blur on the SPUs whilst we’re rendering the pre-pass geometry…
  23. This is real multi-threaded graphics processing, with multiple processors doing different jobs at the same time. Therefore, architect your engine accordingly! Having small graphics jobs allows you to spread the workload. Obviously, not everything can be done like this. Some things will most likely have to be deferred a frame, adding a frame of latency, such as post-processing or MLAA. But there are lots of smaller tasks that don’t have to be, from SSAO to blurring exponential shadow maps. You have to find things to parallelise with! Think about the data again! Rendering has lots of stages, each with its own inputs and outputs. What could sync with what?
  24. We combine the normals and depth into one 32-bit buffer. This is an optimisation as it halves the inputs into the SPU program, but also allows us to keep the depth buffer in local memory which is good for performance.
  25. The first step, but the biggest stumbling block!
  26. No blocking! Our jobs are optionally dependent on a label.
  27. To be accurate, we have a jump-to-self per SPU.
  28. When we load in a tile, we quickly iterate over every pixel and calculate minimum and maximum depth. No need to use a stencil buffer to cull out the sky as depth min and max will do it for us. (Remember, we don’t have the stencil buffer as we’re not using the depth buffer!) This technique is really useful for a variety of things, including depth of field (check out Matt Swoboda’s optimisation in PhyreEngine).
  29. This is actually the easiest bit. Just write the lighting equations in intrinsics! However, they really have to be fast otherwise performance just won’t be good enough. Next are some helpful tips for optimisation.
  30. So we triple buffer. It ends up that we have plenty of local store left as it’s a simple job and our job size was relatively small. Another reason to write in si intrinsics though, as it keeps the code size down!
  31. Just like Ste said earlier, this is a big win. Probably a good rule of thumb for most SPU jobs!
  32. Just like Ste said earlier, this is a big win. Probably a good rule of thumb for most SPU jobs!
  33. When kicking SPU jobs off on the RSX, you have to be careful as you can interfere with jobs the PPU is running. This is where sync-free systems are a win! We’re lucky as we just avoided the physics, but also, running only on 3 SPUs was a good idea so we had 3 free for other tasks. See how quick the rendering is even though we’re rendering so many views!
  34. Apologies for the shameless self-promotion!
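Note 28 describes scanning each tile for its minimum and maximum depth and using the result to skip sky tiles. A minimal scalar sketch of that idea follows. It rests on assumptions not stated in the notes: depth is already unpacked to floats, larger values are further away, and the sky sits at a known far depth. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Nearest and furthest depth found in one tile. */
typedef struct {
    float min_depth;
    float max_depth;
} DepthBounds;

/* Scan every pixel of a tile once and record its depth range. */
static DepthBounds tile_depth_bounds(const float *depth, size_t pixel_count)
{
    assert(depth != NULL && pixel_count > 0);

    DepthBounds bounds = { depth[0], depth[0] };
    for (size_t i = 1; i < pixel_count; ++i) {
        if (depth[i] < bounds.min_depth) bounds.min_depth = depth[i];
        if (depth[i] > bounds.max_depth) bounds.max_depth = depth[i];
    }
    return bounds;
}

/* A tile is entirely sky when even its nearest pixel is at the far depth.
 * Assumes sky pixels are written at far_depth (an assumption of this sketch). */
static int tile_is_sky(DepthBounds bounds, float far_depth)
{
    return bounds.min_depth >= far_depth;
}
```

The same bounds can also be used to reject lights whose depth extent does not overlap the tile, which is one reason the scan is worth its cost.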