We present the technology and ideas behind the unique lighting in Mirror's Edge from DICE, covering how DICE adopted global illumination into their lighting process and Illuminate Labs' current toolbox of state-of-the-art lighting technology.
For this year's keynote at High Performance Graphics 2018, Colin Barré-Brisebois from SEED discussed the state of the art in real-time game ray tracing. He explored some of the connections between offline and real-time game ray tracing and presented some of the open problems. Colin outlined a few potential solutions to those problems and issued a call to arms on topics where the ray tracing research community and the games industry should unite to solve them.
Talk from SIGGRAPH 2010 and the Beyond Programmable Shading course.
Also see publications.dice.se for more material and other DICE talks.
Checkerboard Rendering in Dark Souls: Remastered by QLOC - QLOC
This is a talk on checkerboard rendering Markus & Andreas held at Digital Dragons 2019.
In it they quickly go through the history of Checkerboard Rendering before taking a deep dive into how it works and how it is implemented in Dark Souls: Remastered. Lastly, they present the quality and performance improvements they got from using it and their conclusion.
PS: The PDF file includes useful in-depth notes from both authors.
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing - Electronic Arts / DICE
Modern Graphics Abstractions & Real-Time Ray Tracing discusses Halcyon, a graphics rendering system built from scratch using modern graphics APIs. Halcyon uses render handles, commands, backends, devices, and graphs to provide an efficient and flexible rendering system that works across APIs. It also details virtual multi-GPU capabilities that allow developers to test multi-GPU code even on single-GPU machines.
This talk is about our experiences gained during making of the Killzone Shadow Fall announcement demo.
We’ve gathered all the hard data about our assets, memory, CPU and GPU usage and a whole bunch of tricks.
The goal of this talk is to help you form a clear picture of what’s already possible to achieve on PS4.
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu... - Electronic Arts / DICE
Global illumination (GI) has been an ongoing quest in games. The perpetual tug-of-war between visual quality and performance often forces developers to take the latest and greatest from academia and tailor it to push the boundaries of what has been realized in a game product. Many elements need to align for success, including image quality, performance, scalability, interactivity, ease of use, as well as game-specific and production challenges.
First we will paint a picture of the current state of global illumination in games, addressing how the state of the union compares to the latest and greatest research. We will then explore various GI challenges that game teams face from the art, engineering, pipelines and production perspective. The games industry lacks an ideal solution, so the goal here is to raise awareness by being transparent about the real problems in the field. Finally, we will talk about the future. This will be a call to arms, with the objective of uniting game developers and researchers on the same quest to evolve global illumination in games from being mostly static, or sometimes perceptually real-time, to fully real-time.
This presentation was given at SIGGRAPH 2017 by Colin Barré-Brisebois (EA SEED) as part of the Open Problems in Real-Time Rendering course.
Talk by Graham Wihlidal (Frostbite Labs) at GDC 2017.
Checkerboard rendering is a relatively new technique, popularized recently by the introduction of the PlayStation 4 Pro. Many modern game engines are adding support for it right now, and in this talk, Graham will present an in-depth look at the new implementation in Frostbite, which is used in shipping titles like 'Battlefield 1' and 'Mass Effect Andromeda'. Despite being conceptually simple, checkerboard rendering requires deep integration into the post-processing chain, in particular temporal anti-aliasing and dynamic resolution scaling, and poses various challenges to existing effects. This presentation will cover the basics of checkerboard rendering, explain the impact on a game engine that powers a wide range of titles, and provide a detailed look at how the current implementation in Frostbite works, including topics like object id, alpha unrolling, gradient adjust, and a highly efficient depth resolve.
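As a rough illustration of the basic idea only, the sketch below marks which pixels would be shaded in a given frame under a plain alternating checkerboard. The function names and the 1x1 pattern with a per-frame phase flip are assumptions made for this example; the Frostbite implementation described in the talk involves much more (object id, gradient adjust, depth resolve) that is not modelled here.

```python
def shaded_this_frame(x, y, frame):
    """Return True if pixel (x, y) is shaded in this frame.

    Assumption: a 1x1 checkerboard whose phase flips every frame,
    so each pixel is freshly shaded every other frame.
    """
    return (x + y + frame) % 2 == 0


def shaded_pixels(width, height, frame):
    """Collect the coordinates shaded in the given frame."""
    return {
        (x, y)
        for y in range(height)
        for x in range(width)
        if shaded_this_frame(x, y, frame)
    }


even = shaded_pixels(4, 4, frame=0)
odd = shaded_pixels(4, 4, frame=1)
# Each frame shades half of the pixels; two consecutive frames
# together cover the full image without overlap.
```

The missing half of each frame has to be reconstructed from the previous frame, which is why the technique ties in so closely with temporal anti-aliasing.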
The Rendering Technology of 'Lords of the Fallen' (Philip Hammer) - Mary Chan
This session is about some important aspects of the rendering pipeline of the upcoming Action-RPG "Lords of the Fallen", developed by Deck13 Interactive and CI Games for PS4, Xbox One, and PC. The topic covers several closely related areas like the deferred rendering system, image-based lighting using deferred cubemaps, deferred decals, and an approach for transparent object lighting and shadowing. More specifically, the lecture will cover several strategies to keep the G-Buffer as small and efficient as possible. This includes the description of a G-Buffer attribute-packing scheme and how per-material attributes can be exposed using special parameter lookup tables. Furthermore, a traditional problem of most deferred rendering systems is the seamless integration of transparent objects into the lighting. The lecture will present several ways to approach this problem, for example multi-pass deferred rendering, coloured transparent shadows, and a novel method for deferred particle lighting.
The PlayStation®3’s SPUs in the Real World: A KILLZONE 2 Case Study - Guerrilla
This session describes many of the SPU techniques in the engine used to develop KILLZONE 2 for the PlayStation 3. It first focuses on individual SPU techniques and then covers how these techniques work together in the game engine each frame.
Philip Hammer of DECK13 Interactive GmbH presented techniques used in rendering The Surge. Key points included: using physically based rendering with GGX BRDF; clustered deferred rendering with lighting computed on GPU; deferred decals for details; and optimizing shaders for AMD GCN occupancy. Future work focuses on new deferred approaches like bindless decals, improved materials, and migrating to Vulkan and DX12.
Audio for Multiplayer & Beyond - Mixing Case Studies From Battlefield: Bad Co... - Electronic Arts / DICE
Learnings from creating soundscapes for online multiplayer games, drawing on experiences from the Battlefield series with an emphasis on Battlefield: Bad Company.
The document discusses screen space reflections implemented in the game The Surge. It describes using screen space ray marching against the depth buffer to find reflection points, convolving the scene to accumulate multiple bounces, and using asynchronous compute to overlap rendering passes and improve performance. Key techniques included interleaved rendering, temporal reprojection, and using local data storage. Performance gains were achieved through optimizations like lower resolution rendering and computing mip chains in-place.
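The ray-marching step mentioned above can be sketched on the CPU as follows. This is a minimal illustration under stated assumptions, not the method used in The Surge: the function name `march_ray`, the fixed step size, the "larger depth is farther" convention, and the absence of any thickness test or hit refinement are all choices made for this example.

```python
import math


def march_ray(depth, origin, step, max_steps=64):
    """March a ray through a 2D depth buffer and return the first hit pixel.

    depth:  list of rows; depth[y][x] is the scene depth at that pixel
            (assumption: larger values are farther from the camera).
    origin: (x, y, z) start position in screen space, z in depth units.
    step:   (dx, dy, dz) increment applied on every iteration.
    Returns the (x, y) pixel where the ray first passes behind the
    depth buffer, or None if it leaves the screen or runs out of steps.
    """
    x, y, z = origin
    dx, dy, dz = step
    height, width = len(depth), len(depth[0])
    for _ in range(max_steps):
        x, y, z = x + dx, y + dy, z + dz
        px, py = math.floor(x), math.floor(y)
        if not (0 <= px < width and 0 <= py < height):
            return None  # ray left the screen: no reflection data available
        if z >= depth[py][px]:
            return (px, py)  # ray is at or behind the stored surface
    return None


# One-row depth buffer with a nearer surface at x = 3.
depth_buffer = [[10.0, 10.0, 10.0, 2.0, 10.0]]
hit = march_ray(depth_buffer, origin=(0.5, 0.5, 1.0), step=(1.0, 0.0, 0.5))
miss = march_ray(depth_buffer, origin=(0.5, 0.5, 1.0), step=(1.0, 0.0, 0.0))
```

A production version would run per pixel on the GPU and would typically add a thickness test and a refinement pass around the hit point.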
Rendering Technologies from Crysis 3 (GDC 2013) - Tiago Sousa
This talk covers changes in CryENGINE 3 technology during 2012, with DX11 related topics such as moving to deferred rendering while maintaining backward compatibility on a multiplatform engine, massive vegetation rendering, MSAA support and how to deal with its common visual artifacts, among other topics.
This presentation gives an overview of the rendering techniques used in KILLZONE 2. We put the main focus on the lighting and shadowing techniques of our deferred shading engine and how we made them play nicely with anti-aliasing.
Bindless Deferred Decals in The Surge 2 - Philip Hammer
These are the slides for my talk at Digital Dragons 2019 in Krakow.
Update: The recordings are now online on YouTube:
https://www.youtube.com/watch?v=e2wPMqWETj8
Optimizing the graphics pipeline with compute - WuBinbo
The document discusses optimizing graphics rendering by using compute shaders to cull triangles on the GPU before rendering. It begins by providing acronyms for AMD GPU concepts. It then describes how the author experimented with offloading hull shader work and triangle culling to compute shaders, which showed performance improvements. The document outlines opportunities for using compute shaders to preprocess geometry more efficiently than the traditional graphics pipeline approach.
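To make the idea of culling triangles ahead of the draw concrete, here is a small CPU-side sketch. The two tests used (back-facing and zero-area, both via the signed area of the projected triangle) and the counter-clockwise front-face convention are assumptions for illustration; the summary above does not say which tests the talk uses, and a real version would run in a compute shader rather than in Python.

```python
def signed_area2(a, b, c):
    """Twice the signed area of the 2D triangle (a, b, c).

    Positive for counter-clockwise winding in a y-up coordinate system.
    """
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def cull_triangles(vertices, indices):
    """Return a compacted index list with culled triangles removed.

    A triangle is dropped when it is back-facing (negative area) or
    degenerate (zero area). Assumption: counter-clockwise is front-facing.
    """
    kept = []
    for i in range(0, len(indices), 3):
        i0, i1, i2 = indices[i:i + 3]
        if signed_area2(vertices[i0], vertices[i1], vertices[i2]) > 0:
            kept.extend((i0, i1, i2))
    return kept


verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
tris = [0, 1, 2,   # counter-clockwise: kept
        0, 2, 1,   # clockwise: back-facing, culled
        1, 1, 3]   # repeated vertex: zero area, culled
visible = cull_triangles(verts, tris)
```

The surviving indices form a shorter index buffer, so later pipeline stages never see the rejected triangles.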
Technical talk from the AMD GPU14 Tech Day by Johan Andersson of the Frostbite team at DICE/EA about Battlefield 4 on PC, the first title that will use 'Mantle', a very high-performance, low-level graphics API being developed in close collaboration by AMD and DICE/EA to get the absolute best performance and experience in Frostbite games on PC.
Rendering AAA-Quality Characters of Project A1 - Ki Hyunwoo
The document discusses rendering techniques for high quality characters in an unannounced game project called A1. It covers skin rendering using subsurface scattering with multiple scattering approximations. It also covers hair rendering using order-independent transparency with a linked list approach integrated into UE4, as well as a physically based shading model for hair. Future work discussed includes improvements to subsurface scattering, lighting, and shadowing for transparent and translucent materials.
Graham Wihlidal from SEED attended the Munich Khronos Meetup and presented some aspects of Halcyon's rendering architecture, as well as details of the Vulkan implementation. Graham presented components like high-level render command translation, render graph, and shader compilation.
The document provides an overview of OpenGL and computer graphics concepts. It discusses the basics of computer graphics including applications, the graphics pipeline, primitives like vertices and polygons, attributes like color, and an example of drawing a shaded triangle. The graphics pipeline involves steps like vertex operations, primitive assembly, rasterization, and fragment operations. Primitives are specified using vertices and attributes remain in effect until changed. The OpenGL API is used to program 3D graphics and interfaces with the graphics driver.
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio - Owen Wu
The document discusses Mali GPU architecture and Arm Mobile Studio. It provides details on Mali GPU components like Bifrost shader cores and tile-based rendering. It also describes features such as index-driven vertex shading, forward pixel kill, and efficient render passes. The document concludes with an overview of the Arm Mobile Studio tools for profiling GPU and CPU performance on mobile devices.
This talk presents the approach Frostbite took to add support for HDR displays. It will summarize Frostbite's previous post processing pipeline and what the issues were. Attendees will learn the decisions made to fix these issues, improve the color grading workflow and support high quality HDR and SDR output. This session will detail the display mapping used to implement the "grade once, output many" approach to targeting any display, and why an ad-hoc approach, as opposed to filmic tone mapping, was chosen. Frostbite retained 3D LUT-based grading flexibility, and the accuracy differences of computing these in decorrelated color spaces will be shown. This session will also include the main issues found on early adopter games, differences between HDR standards, optimizations to achieve performance parity with the legacy path and why supporting HDR can also improve the SDR version.
Takeaway
Attendees will learn how and why Frostbite chose to support High Dynamic Range [HDR] displays. They will understand the issues faced and how these were resolved. This talk will be useful for those still to support HDR and provide discussion points for those who already do.
Intended Audience
The intended audience is primarily rendering engineers, technical artists and artists; specifically those who focus on grading and lighting and those interested in HDR displays. Ideally attendees will be familiar with color grading and tonemapping.
[GDC 2012] Enhancing Graphics in Unreal Engine 3 Titles Using AMD Code Submis... - Owen Wu
This document discusses code submissions to Unreal Engine 3 to enhance graphics capabilities. It covers additions of phong tessellation and optimizations for tessellation. It also discusses support for multi-monitor configurations through Eyefinity and improvements to bokeh depth of field and post-process anti-aliasing techniques. The presentation provides information on implementation details and performance comparisons for these techniques.
Bring the Future of Entertainment to Your Living Room: MPEG-I Immersive Video... - Intel® Software
Explore the proposed Metadata for Immersive Video (MIV) standard specification. MIV enables real-world content captured by cameras to be viewed by users with Six Degrees of Freedom (6DoF) movement, similar to a VR experience with synthetic content.
Tessellation is the covering of a flat surface without gaps using repeating shapes. The document discusses how to create shapes that tessellate by starting with a random shape and modifying it to fill spaces. Four example shapes are given that tessellate on their own and with each other. The rest of the document shows different tessellation patterns the author designed using software to tile surfaces or act as pavers.
This document discusses several teaching strategies for math: Lecture-Discussion Method, Cooperative and Collaborative Learning, Jigsaw Method, and Think-Pair-Share. It provides details on how each strategy works, including applying the Lecture-Discussion Method with its nine events of instruction, the emphasis of cooperative/collaborative learning, and examples of applying the Jigsaw Method and Think-Pair-Share in a classroom.
M.C. Escher was a graphic artist known for his tessellating artworks inspired by geometric patterns. He was fascinated by tile patterns he saw in the Alhambra palace in Spain, which used repeating Islamic geometric designs. Escher's artworks featured shapes that tessellate by covering a plane without gaps or overlaps, often distorting basic shapes like squares and hexagons. Though self-taught in mathematics, Escher's works were admired by mathematicians for their demonstration of mathematical concepts through tessellations and impossible constructions.
The document discusses tessellations, which are shapes that can cover a surface without gaps by repeating copies of the shape. Common shapes like squares and triangles can tessellate, but unusual shapes or combinations of different shapes may also tessellate. Specific tessellation shapes are discussed, including pentominoes (shapes made of 5 connected squares) and 7-pin polygons. Instructions are provided for creating original tessellations by modifying basic shapes that tessellate.
Maurits Cornelis Escher was a Dutch graphic artist born in 1898 in the Netherlands. He attended art school where he learned printmaking techniques and developed his style of optical illusions. After finishing school, Escher traveled extensively in Italy, which inspired his artwork. In the late 1950s and 1960s, Escher gained recognition for prints like Ascending and Descending and Knots. He spent his later years in an artist retirement home in the Netherlands until his death in 1972 at age 73.
This document provides biographical information about the artist M.C. Escher and discusses his artworks and style. It notes that Escher was born in 1898 in the Netherlands and died in 1972. Although he disliked formal schooling, he enjoyed geometry and art. The document then discusses Escher's tessellations and how he was a master of geometric designs that play with dimensions and perspective in thought-provoking ways. It provides some basics on tessellation techniques using translation, rotation, and reflection of basic shapes.
1) OpenGL provides low-level control over graphics rendering which allows for customization and optimization compared to higher-level APIs.
4. New Direct3D 11 Stages
Brief Recap
IA → VS → HS → TS → DS → GS → RAS → PS → OM
Programmable (Shader): VS, HS, DS, GS, PS
Fixed Function: IA, TS, RAS, OM
5. Canonical Work Breakdown
• VS: conversion to camera space, control point
animation
• HS (CP): Compute Control Point locations,
compute per-control point culling info
• HS (PC): Use info from HS (CP) to compute
per-edge LOD; cull patches outside frustum
7. Techniques
• All techniques (except vanilla Flat Dicing) will
improve silhouettes and lighting
• But don’t incur the corresponding increase in
memory consumption
• And continuous LOD!
8. Flat Dicing
• Simplest form of tessellation
• Merely adds triangles where fewer were
previously
• Does not improve silhouettes alone
– Usually paired with displacement mapping
– Can also be used to reduce PS complexity
13. PN
• Originally proposed by Alex Vlachos, et al, in
Curved PN Triangles.
• Treats primitives as descriptions of Bezier
Surfaces, using the location as a position and
normal as a description of tangent of surface
at that position
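As a rough CPU-side illustration of the idea above, the sketch below builds the control points of a cubic Bezier triangle from three positions and normals, following the construction published in Curved PN Triangles. The `Vec3` and `PnPatch` types and the function names are hypothetical, and the ordering of the edge control points is an arbitrary choice made here, not something the slides specify.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Edge control point nearest to corner (p0, n0) on the edge p0 -> p1:
// take the point one third along the edge and project it onto the
// tangent plane defined by p0 and its (unit length) normal n0.
inline Vec3 pnEdgeControlPoint(Vec3 p0, Vec3 n0, Vec3 p1) {
    float w = dot(p1 - p0, n0);
    return (p0 * 2.0f + p1 - n0 * w) * (1.0f / 3.0f);
}

// Hypothetical container for the ten control points of the patch.
struct PnPatch {
    std::array<Vec3, 3> corners;  // the original vertices
    std::array<Vec3, 6> edges;    // two control points per edge
    Vec3 center;                  // interior control point
};

inline PnPatch buildPnPatch(const Vec3 p[3], const Vec3 n[3]) {
    PnPatch patch{};
    Vec3 edgeSum{0.0f, 0.0f, 0.0f};
    Vec3 cornerSum{0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 3; ++i) {
        int j = (i + 1) % 3;
        patch.corners[i] = p[i];
        patch.edges[2 * i]     = pnEdgeControlPoint(p[i], n[i], p[j]);  // near corner i
        patch.edges[2 * i + 1] = pnEdgeControlPoint(p[j], n[j], p[i]);  // near corner j
        edgeSum = edgeSum + patch.edges[2 * i] + patch.edges[2 * i + 1];
        cornerSum = cornerSum + p[i];
    }
    // Center: push the average of the edge points away from the average corner.
    Vec3 e = edgeSum * (1.0f / 6.0f);
    Vec3 v = cornerSum * (1.0f / 3.0f);
    patch.center = e + (e - v) * 0.5f;
    return patch;
}
```

For a flat triangle whose normals all match the face normal, every control point stays in the triangle's plane, so the patch reproduces the original triangle.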
14. PN Details
• Uses existing vertex and index buffer without
modification
15. PN Modifications
• PN calls for quadratic interpolation of normals
• This allows for inflection points while lighting
curved triangles
• The inflection points will show up
geometrically
• Skip the quadratic normal for lighting
16. Quadratic Normals
• Per-pixel lighting would require quadratic
tangents and binormals as well
– Lots of extra math
– Potential for gimbal lock in lighting!
• While correct according to the surface, this
does not match artist intent for lighting
21. PN – Cons
• Meshes can become
“Stay Puft”, particularly
around the feet
• C1 discontinuities (same position, different
normal) in input result in C0 discontinuity!
• Fixing discontinuities requires artist
involvement
22. PN-AEN
• Proposed by John McDonald and Mark Kilgard
in Crack-Free Point-Normal Triangles Using
Adjacent Edge Normals
• Uses PN with a twist—neighbor information
determined during a preprocess step to avoid
cracking
23. PN-AEN Details
• Uses existing VB without modification, but a
second IB must be generated
• Tool available from NVIDIA to
generate second IB
automatically (works for all
vendors)
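The preprocess step can be sketched roughly as follows. This is not the NVIDIA tool, only a minimal illustration of the idea: for each triangle edge, find the neighboring triangle that shares that edge by position and record the neighbor's vertex indices for it. The `positionId` input (one id per vertex, equal for vertices that share a position) and the output layout (three own indices followed by two neighbor indices per edge) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Builds a 9-indices-per-triangle buffer: [i0, i1, i2, then for each edge
// (i0->i1, i1->i2, i2->i0) the neighbor's two indices at the same positions].
// When an edge has no neighbor, the triangle's own indices are repeated.
std::vector<uint32_t> buildAenIndexBuffer(const std::vector<uint32_t>& indices,
                                          const std::vector<uint32_t>& positionId) {
    assert(indices.size() % 3 == 0);
    using Edge = std::pair<uint32_t, uint32_t>;

    // Directed edge (by position id) -> the vertex indices that form it.
    std::map<Edge, Edge> edgeToIndices;
    for (std::size_t t = 0; t < indices.size(); t += 3) {
        for (int e = 0; e < 3; ++e) {
            uint32_t a = indices[t + e];
            uint32_t b = indices[t + (e + 1) % 3];
            edgeToIndices[{positionId[a], positionId[b]}] = {a, b};
        }
    }

    std::vector<uint32_t> out;
    out.reserve(indices.size() * 3);
    for (std::size_t t = 0; t < indices.size(); t += 3) {
        out.push_back(indices[t]);
        out.push_back(indices[t + 1]);
        out.push_back(indices[t + 2]);
        for (int e = 0; e < 3; ++e) {
            uint32_t a = indices[t + e];
            uint32_t b = indices[t + (e + 1) % 3];
            // With consistent winding, the neighbor walks the shared edge in
            // the opposite direction, so look up the reversed edge.
            auto it = edgeToIndices.find({positionId[b], positionId[a]});
            if (it != edgeToIndices.end()) {
                out.push_back(it->second.second);  // neighbor's vertex at a's position
                out.push_back(it->second.first);   // neighbor's vertex at b's position
            } else {
                out.push_back(a);
                out.push_back(b);
            }
        }
    }
    return out;
}
```

The hull shader can then average the normals of the triangle and its neighbor along each shared edge, so both sides compute the same edge control points.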
27. PN-AEN – Pros
• Completely Procedural
• Small memory overhead
Figure: Cube with normals, PN, PN-AEN
28. PN-AEN – Cons
• More expensive (runtime cost) than PN
• Still can result in some ‘Stay Pufting’ of
meshes
• No artist involvement means less artistic
control
• Requires second index buffer, 9 indices per primitive
29. Displacement Mapping
• Used together with another
tessellation technique (often
Flat Dicing)
• Texture controls displacement
at each generated vertex
during Domain Shading
Wretch used courtesy Epic Games, Inc
30. Displacement Mapping Details
• Requires displacement map to be authored
– Although tools exist to generate from
normal maps
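A minimal CPU-side sketch of what the domain shading step does with the displacement texture: interpolate position and normal at the generated vertex, then move the vertex along the normal by the sampled height. The `scale` and `bias` parameters and the function names are assumptions for illustration, and the texture fetch is replaced by a plain `height` argument.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }

// Barycentric interpolation of a per-corner attribute at a generated vertex.
inline Vec3 barycentric(const Vec3 corner[3], const float bary[3]) {
    return corner[0] * bary[0] + corner[1] * bary[1] + corner[2] * bary[2];
}

// Moves a vertex along its unit normal. 'height' stands in for the value
// sampled from the displacement map (assumed to be in [0, 1]); 'scale' and
// 'bias' are hypothetical per-material controls.
inline Vec3 displace(Vec3 position, Vec3 unitNormal, float height,
                     float scale, float bias) {
    return position + unitNormal * (height * scale + bias);
}
```

With `scale = 2` and `bias = -1`, a height of 0.5 leaves the surface where it is, while 0 and 1 push it one unit inward and outward.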
31. Displacement Mapping – Pros
• High impact silhouette and lighting
adjustments
• “Pay as you go”: Easy to add displacement to
“key” assets without adding to all assets
32. Displacement Mapping – Cons
• Care must be taken to avoid:
– Mesh swimming when LOD changes
– Cracks between patches
• Artist involvement means money being spent
33. Continuity
• Games have had continuity errors forever
• Normal/lighting discontinuities break C1.
• Tessellation, particularly displacement
mapping, makes breaking C0 easy.
• This is undesirable
34. Discontinuity
• What causes discontinuities?
– Vertices with same position pre-tessellation, but
different position post-tessellation
– Math Errors (Edge LOD calculation for triangles)
– Displacing along normals when normals are
disjoint
– Sampling Errors
36. Sampling Errors?!
• Impossible to ensure bit-accurate samples
across texture discontinuities
• With normal maps, this causes a lighting seam
• With tessellation, this causes a surface
discontinuity
37. Discontinuity Solution
• For each patch, store dominant
edge/dominant vertex information
• Detect that you’re at an edge or corner
• If so, sample UVs from dominant information
instead of self.
• Everyone agrees, cracks are gone!
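The selection logic above might look roughly like this on the CPU. The `DominantUVs` layout (edge `i` lies opposite corner `i`) and the exact comparisons against 0 and 1 are assumptions for this sketch; a real shader would pull the dominant data from per-patch constants and may need a different edge test.

```cpp
#include <cassert>

struct Vec2 { float u, v; };

// Hypothetical per-patch data: the UVs the dominant owner uses at each corner
// and at the two endpoints of each edge. Edge i lies opposite corner i, and
// its endpoints are corners (i + 1) % 3 and (i + 2) % 3, in that order.
struct DominantUVs {
    Vec2 corner[3];
    Vec2 edge[3][2];
};

// Picks the UV used to sample the displacement map at a generated vertex.
Vec2 selectDisplacementUV(const float bary[3], const Vec2 own[3],
                          const DominantUVs& dom) {
    // Corner: one barycentric weight is exactly 1.
    for (int i = 0; i < 3; ++i) {
        if (bary[i] == 1.0f) return dom.corner[i];
    }
    // Edge: one barycentric weight is exactly 0.
    for (int i = 0; i < 3; ++i) {
        if (bary[i] == 0.0f) {
            int j = (i + 1) % 3;
            int k = (i + 2) % 3;
            return {dom.edge[i][0].u * bary[j] + dom.edge[i][1].u * bary[k],
                    dom.edge[i][0].v * bary[j] + dom.edge[i][1].v * bary[k]};
        }
    }
    // Interior: interpolate the patch's own UVs.
    return {own[0].u * bary[0] + own[1].u * bary[1] + own[2].u * bary[2],
            own[0].v * bary[0] + own[1].v * bary[1] + own[2].v * bary[2]};
}
```

Because every patch that touches a shared edge or corner reads the same dominant UVs there, all of them fetch the same texel and displace to the same position.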
38. Discontinuity Solution cont’d
• Orthogonal to choice of tessellation (works
everywhere!)
Figure: Dominant Edges and Dominant Verts (vertices labeled a through l)
42. Other Tessellation Techniques
• Phong Tessellation
– Works with existing assets
– No artist intervention required
– Suffers same issue as PN (C1 input discontinuities
result in C0 output discontinuities)
43. Other Tessellation Techniques
• Catmull-Clark subdivision surfaces
– Great for the future
– Not great for fitting into existing engines
• NURBS
– Not a great fit for the GPU
– Definitely not a good fit for existing engines
44. Summary / Questions
Technique | Production Cost | Runtime Cost | Value Add
Flat Dicing | Free | ~Free | May improve perf, good basis for other techniques
PN | May require art fixes | Small runtime overhead | Improved silhouettes/lighting
PN-AEN | Free | Additional indices pulled versus PN | Crack free, better silhouettes/lighting, preserve hard edges
Displacement | Requires art, but pay as you go | 1 texture lookup | Works with other techniques, allows very fine detail
45. Debugging Techniques
• Verify your conventions
– Output Barycentric coordinates as
diffuse color
• Reduce shader to flat
tessellation, add pieces back
• Remove clipping, clever optimizations
Barycentric Coordinates as colors
46. Debugging Techniques cont’d
• Edge LOD specification for triangles is
surprising
• Parallel Nsight – Version 1.51 is
available for free
47. Optimization Strategies
• Work at the lowest frequency appropriate
• Be aware that with poor LOD computation, DS
could run more than the PS.
• Shade Control Points in Vertex Shader to
leverage V$ (the vertex cache)
• Clipping saves significant workload
48. Optimization Strategies cont’d
• Code should be written to maximize SIMD
parallelism
• Prefer shorter Patch Constant shaders (only
one thread per patch)
49. Optimization Strategies cont’d
• Avoid tessellation factors <2 if possible
– Paying for GPU to tessellate when expansion is
very low is just cost—no benefit
• Avoid tessellation factors that would result in
triangles smaller than 3 pix/side (PS will still
operate at quad-granularity)
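One way to turn the two guidelines above into an edge LOD function is sketched below. It assumes the projected edge length in pixels is already known; the function name, the snapping of factors below 2 down to 1, and the `targetSegmentPixels` parameter are choices made for this example, while 64 is the Direct3D 11 maximum tessellation factor.

```cpp
#include <algorithm>
#include <cassert>

constexpr float kMaxTessFactor = 64.0f;    // Direct3D 11 upper limit
constexpr float kMinSegmentPixels = 3.0f;  // smallest useful triangle side, per the slide

// Returns an edge tessellation factor for an edge that covers
// 'edgeLengthPixels' on screen, aiming for segments of roughly
// 'targetSegmentPixels' each.
float edgeTessFactor(float edgeLengthPixels, float targetSegmentPixels) {
    assert(edgeLengthPixels >= 0.0f);
    // Never aim for segments shorter than 3 pixels.
    float segment = std::max(targetSegmentPixels, kMinSegmentPixels);
    float factor = edgeLengthPixels / segment;
    // Very low expansion is cost without benefit: leave the edge undivided.
    if (factor < 2.0f) return 1.0f;
    return std::min(factor, kMaxTessFactor);
}
```

For example, a 240-pixel edge with an 8-pixel target yields a factor of 30, while a 10-pixel edge is left undivided.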
51. NVIDIA Confidential
NVIDIA @ GDC 2011
CAN’T GET ENOUGH? MORE WAYS TO LEARN:
NVIDIA GAME TECHNOLOGY THEATER
Wed, March 2nd
and Fri, March 4th
@ NVIDIA Booth
Open to all attendees. Featuring talks and demos from leading developers at game studios and
more, covering a wide range of topics on the latest in GPU game technology.
NVIDIA DEVELOPER SESSIONS
All Day Thurs, March 3rd
@ Room 110, North Hall E
Open to all attendees. Full schedule on www.nvidia.com/gdc2011
MORE DEVELOPER TOOLS & RESOURCES
Available online 24/7 @ developer.nvidia.com
WE’RE HIRING
More info @ careers.nvidia.com
NVIDIA Booth
South Hall #1802
Details on schedule and to
download copies of
presentations visit
www.nvidia.com/gdc2011
53. Appendix
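float ComputeClipping(float4x4 projMatrix, float3 cpA, float3 cpB, float3 cpC)
{
    // Project each control point into clip space.
    // ApplyProjection and IsClipped are helpers not shown on this slide.
    float4 projPosA = ApplyProjection(projMatrix, cpA),
           projPosB = ApplyProjection(projMatrix, cpB),
           projPosC = ApplyProjection(projMatrix, cpC);

    // Assuming IsClipped returns a nonzero value for an out-of-bounds point,
    // the min is nonzero only when all three control points are out of bounds.
    return min(min(IsClipped(projPosA), IsClipped(projPosB)), IsClipped(projPosC));
}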
Note: This isn’t quite correct for clipping: it will clip primitives that are so close to the camera that
the control points are all out of bounds. The correct clipping would actually store in-bounds/out-of-bounds
for each plane, then determine if all points failed any one plane.
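The per-plane test described in the note can be sketched on the CPU as follows. It takes control points that are already in clip space and assumes the Direct3D clip volume (-w <= x <= w, -w <= y <= w, 0 <= z <= w); the `Vec4` type and function names are hypothetical and are not the helpers used in the slide's code.

```cpp
#include <cassert>

struct Vec4 { float x, y, z, w; };

// One bit per clip plane; a bit is set when the point lies outside that plane.
inline unsigned clipOutcode(Vec4 p) {
    unsigned code = 0;
    if (p.x < -p.w) code |= 1u;   // left
    if (p.x >  p.w) code |= 2u;   // right
    if (p.y < -p.w) code |= 4u;   // bottom
    if (p.y >  p.w) code |= 8u;   // top
    if (p.z < 0.0f) code |= 16u;  // near
    if (p.z >  p.w) code |= 32u;  // far
    return code;
}

// A patch can be culled only when all of its control points fail the same plane.
inline bool isPatchCulled(Vec4 a, Vec4 b, Vec4 c) {
    return (clipOutcode(a) & clipOutcode(b) & clipOutcode(c)) != 0u;
}
```

A large triangle whose corners fall outside different planes still overlaps the view volume, so it is kept; only a triangle that lies entirely beyond one plane is rejected.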