Deferred rendering decouples lighting from scene complexity by calculating lighting in a post-processing step. The Leadwerks Engine previously used forward rendering but had issues with shader variations and light limits. Version 2.1 introduced deferred rendering to address these problems. Deferred rendering provided easier control of per-light settings, smaller shaders, no light limits, and a 50% performance improvement per light. While alpha blending and anti-aliasing were difficult, the benefits of deferred rendering outweighed the few disadvantages.
1.1 Introduction
Deferred lighting is a relatively new rendering concept that performs lighting as a kind of
post-processing step. The immediate advantage of this is that lighting is decoupled from
scene complexity. Lighting is processed at the same speed whether one or 1,000,000
polygons are onscreen.
Whereas a forward renderer will usually draw to a color and depth buffer, deferred shading
uses an additional screen space normal buffer. The combination of normal and depth data
is sufficient to calculate lighting in a post-processing step, which is then combined with the
color buffer.
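The idea above can be sketched in miniature. The following is a hypothetical illustration in plain Python (the engine itself does this in GLSL): a point light is applied to a tiny 2x2 G-buffer of reconstructed positions and normals, and the cost depends only on the pixel count, never on how many polygons produced the buffers. All values are made up for the example.

```python
import math

# Hypothetical 2x2 G-buffer: per-pixel reconstructed position and surface
# normal. In the engine these come from the depth and normal render targets.
positions = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 1.0, 5.0), (1.0, 1.0, 5.0)]
normals   = [(0.0, 0.0, -1.0)] * 4           # all surfaces facing the camera
albedo    = [(0.8, 0.8, 0.8)] * 4            # color buffer contents

def shade_point_light(light_pos, light_color, light_range):
    """Accumulate one light's diffuse contribution for every pixel.

    The loop runs once per shaded pixel, regardless of scene
    complexity -- the core property of deferred lighting."""
    out = []
    for p, n, a in zip(positions, normals, albedo):
        lx, ly, lz = (light_pos[i] - p[i] for i in range(3))
        dist = math.sqrt(lx * lx + ly * ly + lz * lz)
        l = (lx / dist, ly / dist, lz / dist)
        ndotl = max(0.0, sum(n[i] * l[i] for i in range(3)))
        atten = max(0.0, 1.0 - dist / light_range)   # simple linear falloff
        out.append(tuple(a[i] * light_color[i] * ndotl * atten for i in range(3)))
    return out

lit = shade_point_light((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 10.0)
```

Adding a second light is just another accumulation pass over the same buffers, which is why per-light cost stays flat as scene complexity grows.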
Figure 1.0 Deferred lighting performance is independent of scene complexity.
1.2 Forward Rendering in Leadwerks Engine 2.0
Leadwerks Engine 2.0 utilized a forward renderer. In order to optimize the engine to render
only the visible lights onscreen, we implemented a concept we call “shader groups”.
Instead of storing a single shader per material, the engine stores an array of versions of
that shader, each auto-generated using GLSL defines to control the number of lights and
other settings. This approach suffered from several problems that became apparent once
we started using the engine in a production environment:
- Each shader group had over 100 potential versions. The constant shader compiling at runtime when entering a new area caused unacceptable pauses. Although only a few variations ever got compiled and used, it was impossible to guess which ones would be needed, and because there were so many potential variations, it was impossible to pre-compile all of them.
- The shaders became quite long, causing slower compile times and risking exceeding the instruction limit on older hardware.
- Because materials used shader groups and not single shaders, it was difficult or impossible for our end users to alter a shader uniform for material shaders.
- It was difficult to control per-light settings. A single global binary setting would double the number of potential shaders in a group; a per-light binary setting would multiply the shader count by 16 (two settings times a maximum of eight lights).
- The maximum number of lights visible at any given time was capped at eight due to hardware limitations and practicality.
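The combinatorial growth described above can be made concrete with the text's own numbers (the base count of "over 100" is taken at face value here):

```python
# Back-of-envelope for the shader-group explosion, using the
# accounting given in the text.
base_variants = 100                      # "over 100 potential versions" per group

# A single global on/off setting doubles every group's variant count.
with_global_toggle = base_variants * 2

# A per-light binary setting multiplies the count by 16
# (two settings times a maximum of eight lights, as the text counts it).
with_per_light_toggle = base_variants * 2 * 8

print(with_global_toggle, with_per_light_toggle)
```

Every additional toggle compounds multiplicatively, which is why pre-compiling the full set was never practical.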
These issues motivated us to research a deferred approach for version 2.1 of our engine.
1.3 Deferred Rendering in Leadwerks Engine 2.1
We set out to implement a deferred renderer with a few conditions in mind. We wanted to
create a minimal frame buffer format to minimize bandwidth costs, while supporting all the
lighting features of our forward renderer. Additionally, we chose to perform lighting
calculations in screen space rather than the world space methods used by the forward
renderer. Although we did not know how performance would compare, it was certain from
the beginning that deferred lighting would offer a simpler renderer that was easier to
control. Additionally, the following advantages applied:
- Easy control of per-light settings. Does this light cast shadows? Does it create a specular reflection?
- Smaller shader sources and faster compiling. There are 32 potential light shaders that still get compiled on the fly, but the pauses this causes are much less frequent and barely perceptible.
- No limit to the number of lights rendered.
- Easy addition of new light types.
Because deferred lighting only performs lighting on the final screen pixels, the same
characteristic that makes deferred lighting so desirable creates a few new problems:
- Alpha blending becomes difficult.
- No MSAA hardware anti-aliasing.
We found that a modulate (darken) blend can be used for most transparencies without disrupting the lighting environment. Alpha-blended materials can be drawn in a second pass, as we already do for refractive surfaces. A form of pseudo anti-aliasing can be performed with a blur post-filter weighted by an edge detection filter.
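The edge-weighted blur idea can be illustrated on a single scanline. This is a hypothetical 1-D sketch, not the engine's actual filter: a cheap gradient acts as the edge detector, and each pixel is blended toward a box blur in proportion to that edge weight, so flat regions stay sharp.

```python
# A hard edge on a grayscale scanline.
scanline = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

def pseudo_aa(pixels):
    """Blend each pixel toward a 3-tap box blur, weighted by a
    central-difference edge detector. Flat regions pass through
    untouched; only high-contrast edges get smoothed."""
    out = list(pixels)
    for i in range(1, len(pixels) - 1):
        blurred = (pixels[i - 1] + pixels[i] + pixels[i + 1]) / 3.0
        edge = min(1.0, abs(pixels[i + 1] - pixels[i - 1]))  # edge weight in [0,1]
        out[i] = pixels[i] * (1.0 - edge) + blurred * edge
    return out

smoothed = pseudo_aa(scanline)
```

The hard 0-to-1 step becomes a ramp while the flat runs on either side are untouched, which is the visual effect the post-filter is after.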
1.31 The Frame Buffer Format
It is often suggested that a deferred renderer should use a 128-bit GL_RGBA32F float buffer
for recording fragment positions. We saw no need for transferring this extra data. Instead,
the light shaders reconstruct the fragment screen space position from the screen coordinate
and depth value. We found that a 24-bit depth buffer provided good precision for use with
shadow map lookups.
The fragment coordinate was used to calculate the screen space position. Here, the
buffersize uniform is used to pass the screen width and height to the shader:
vec3 screencoord;
screencoord = vec3(
    ((gl_FragCoord.x / buffersize.x) - 0.5) * 2.0,
    ((-gl_FragCoord.y / buffersize.y) + 0.5) * 2.0 / (buffersize.x / buffersize.y),
    DepthToZPosition(depth));
screencoord.x *= screencoord.z;
screencoord.y *= -screencoord.z;
The depth was converted to a screen space z position with the following function. The
camerarange uniform stores the camera near and far clipping distances:
float DepthToZPosition(in float depth) {
    return camerarange.x / (camerarange.y - depth * (camerarange.y - camerarange.x)) * camerarange.y;
}
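As a sanity check on the conversion above, here is a direct Python port of the same formula. The endpoints behave as expected: a stored depth of 0 maps to the near plane and a depth of 1 to the far plane (the near/far values below are arbitrary examples).

```python
def depth_to_z_position(depth, near, far):
    """Python port of the GLSL DepthToZPosition above, with
    camerarange.x = near and camerarange.y = far."""
    return near / (far - depth * (far - near)) * far

near, far = 0.5, 1000.0

# Endpoint sanity checks: depth 0 -> near plane, depth 1 -> far plane.
print(depth_to_z_position(0.0, near, far))   # near
print(depth_to_z_position(1.0, near, far))   # far
```

This confirms the function linearizes the nonlinear depth-buffer value back into a camera-space z distance, which is all the light shaders need to rebuild the fragment position.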
This approach allowed us to save about half the bandwidth that a frame buffer format with a 128-bit position buffer would have required.
We found that for diffuse lighting, a 32-bit GL_RGBA8 normal buffer yielded good results.
However, surfaces with a high specular factor exhibited banding caused by the limited
resolution. We tried the GL_RGB10_A2 format, which was precise enough for the normal,
but lacked sufficient precision to store the specular value in the alpha channel. We settled
on using a 64-bit GL_RGBA16F buffer on supported hardware, with a fallback for a 32-bit
GL_RGBA8 buffer.
Specular intensity was packed into the unused alpha channel of the normal buffer. Since
graphics hardware prefers data in 32-bit packets, it is likely that the RGBA format is the
same internally as the RGB format on most hardware, so this extra data transfer came at no
cost. The size of the frame buffer format displayed here is either 120 or 88 bits per pixel,
depending on the normal buffer format.
Figure 1.1 Frame buffer format:

Buffer   Format                  Bits      Values
color    GL_RGBA8                32        red, green, blue, alpha
depth    GL_DEPTH_COMPONENT24    24        depth
normal   GL_RGBA16F or GL_RGBA8  64 or 32  x, y, z, specular factor
Since we were already using multiple render targets, we decided to add support for
attaching additional color buffers. This allowed the end user to create an additional buffer
for rendering selective bloom or other effects, and allows for extended rendering features
without altering the engine source code. The light source for the orange point light pictured
here is a good candidate for selective bloom, as our full-bright and lens flare effects did not
translate well to the deferred renderer.
Figure 1.2 Depth, normal, and specular factor are used to calculate diffuse lighting and specular reflection,
which is then combined with the albedo color for the final image.
Figure 1.9 Final image
1.3.2 Optimization
Our engine performs hierarchical culling based on the scene graph, followed by per-object
frustum culling. Hardware occlusion queries are performed on the remaining objects. This
ensures that only visible lights are updated and drawn. For the remaining visible lights, it is
important to process only the fragments that are actually affected by the light source.
The most efficient way to eliminate fragments is to discard them before they are run
through the fragment program. We copied the depth value from the depth buffer to the
next render target during the first full-screen pass, which is either the ambient pass or the
combined ambient and first directional light pass. Point and spot light volumes were then
rendered with depth testing enabled. An intersection test detected whether the camera was
inside or outside the light volume, and the polygon order and depth function were switched
accordingly. Although writing to the depth buffer may disable hierarchical depth culling, an
overall performance improvement was observed with depth writing and testing enabled.
Further research using the stencil buffer to discard fragments may prove beneficial.
1.4 Results
Our results disprove some common assumptions about deferred rendering. It is often
suggested that deferred rendering is only advantageous in scenes with a high degree of
overdraw. In every test run we found our deferred renderer to be at least 20% faster than
forward lighting. Due to our compact frame buffer format, the increased bandwidth cost
was only about four frames per second lost on ATI Radeon HD 3870 and NVIDIA GeForce
8800 graphics cards. On average we saw a performance improvement of about 50% per
light onscreen. A scene with eight visible lights ran four times faster in our deferred
renderer.
The speed improvements of our deferred renderer were not limited to high-end hardware.
We saw similar performance gains on ATI Radeon X1550 and NVIDIA GeForce 7200 graphics
cards.
Except for the aforementioned issues with MSAA and alpha blending, the appearance of our
deferred renderer was identical to that of our forward renderer. Our lens flare effect was
abandoned, since it interfered with lighting and could be replaced with more advanced
depth-based lighting effects.
1.5 Conclusion
It is inevitable that a fundamental shift in rendering paradigms will cause some
incompatibility with previously utilized techniques. However, the simplicity and speed of
deferred rendering far outweigh the few disadvantages. Not only did deferred lighting
simplify our renderer and eliminate the problems our forward renderer exhibited, but it also
yielded a significant performance improvement, even in situations where it is usually
suggested that a forward renderer would be faster.
1.6 References
Foley, James D., Andries van Dam, Steven K. Feiner, and John F. Hughes. 1990. Computer
Graphics: Principles and Practice. Addison-Wesley.
Shishkovtsov, Oles. 2005. "Deferred Shading in S.T.A.L.K.E.R." In GPU Gems 2, Chapter 9.
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter09.html
Valient, Michal. 2007. "Deferred Rendering in Killzone 2." Presentation at Develop
Conference in Brighton, July 2007.
http://www.guerrilla-games.com/publications/dr_kz2_rsx_dev07.pdf
1.7 Acknowledgements
Thanks to Arne Reiners, Rich DiGiovanni, and Emil Persson.