presented at SIGGRAPH 2014 in Vancouver during NVIDIA's "Best of GTC" sponsored sessions
http://www.nvidia.com/object/siggraph2014-best-gtc.html
Watch the replay that includes a demo of GPU-accelerated Illustrator and several OpenGL 4 demos running on NVIDIA's Tegra Shield tablet.
http://www.ustream.tv/recorded/51255959
Find out more about the OpenGL examples for GameWorks:
https://developer.nvidia.com/gameworks-opengl-samples
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadTristan Lorach
This presentation introduces a new NVIDIA extension called Command-list.
The purpose of this presentation is to explain the basic concepts on how to use it and show what are the benefits.
The sample I used for the talk is here: https://github.com/nvpro-samples/gl_commandlist_bk3d_models
The driver for trying should be PreRelease 347.09
http://www.nvidia.com/download/driverResults.aspx/80913/en-us
OpenGL 4.4 provides new features for accelerating scenes with many objects, which are typically found in professional visualization markets. This talk will provide details on the usage of the features and their effect on real-life models. Furthermore we will showcase how more work for rendering a scene can be off-loaded to the GPU, such as efficient occlusion culling or matrix calculations.
Video presentation here: http://on-demand.gputechconf.com/gtc/2014/video/S4379-opengl-44-scene-rendering-techniques.mp4
Ever wondered how to use modern OpenGL in a way that radically reduces driver overhead? Then this talk is for you.
John McDonald and Cass Everitt gave this talk at Steam Dev Days in Seattle on Jan 16, 2014.
A technical deep dive into the DX11 rendering in Battlefield 3, the first title to use the new Frostbite 2 Engine. Topics covered include DX11 optimization techniques, efficient deferred shading, high-quality rendering and resource streaming for creating large and highly-detailed dynamic environments on modern PCs.
Presented as a pre-conference tutorial at the GPU Technology Conference in San Jose on September 20, 2010.
Learn about NVIDIA's OpenGL 4.1 functionality available now on Fermi-based GPUs.
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadTristan Lorach
This presentation introduces a new NVIDIA extension called Command-list.
The purpose of this presentation is to explain the basic concepts on how to use it and show what are the benefits.
The sample I used for the talk is here: https://github.com/nvpro-samples/gl_commandlist_bk3d_models
The driver for trying should be PreRelease 347.09
http://www.nvidia.com/download/driverResults.aspx/80913/en-us
OpenGL 4.4 provides new features for accelerating scenes with many objects, which are typically found in professional visualization markets. This talk will provide details on the usage of the features and their effect on real-life models. Furthermore we will showcase how more work for rendering a scene can be off-loaded to the GPU, such as efficient occlusion culling or matrix calculations.
Video presentation here: http://on-demand.gputechconf.com/gtc/2014/video/S4379-opengl-44-scene-rendering-techniques.mp4
Ever wondered how to use modern OpenGL in a way that radically reduces driver overhead? Then this talk is for you.
John McDonald and Cass Everitt gave this talk at Steam Dev Days in Seattle on Jan 16, 2014.
A technical deep dive into the DX11 rendering in Battlefield 3, the first title to use the new Frostbite 2 Engine. Topics covered include DX11 optimization techniques, efficient deferred shading, high-quality rendering and resource streaming for creating large and highly-detailed dynamic environments on modern PCs.
Presented as a pre-conference tutorial at the GPU Technology Conference in San Jose on September 20, 2010.
Learn about NVIDIA's OpenGL 4.1 functionality available now on Fermi-based GPUs.
Presented September 30, 2009 in San Jose, California at GPU Technology Conference.
Describes the new features of OpenGL 3.2 and NVIDIA's extensions beyond 3.2 such as bindless graphics, direct state access, separate shader objects, copy image, texture barrier, and Cg 2.2.
Bill explains some of the ways that the Vertex Shader can be used to improve performance by taking a fast path through the Vertex Shader rather than generating vertices with other parts of the pipeline in this AMD technology presentation from the 2014 Game Developers Conference in San Francisco March 17-21. Check out more technical presentations at http://developer.amd.com/resources/documentation-articles/conference-presentations/
The goal of this session is to demonstrate techniques that improve GPU scalability when rendering complex scenes. This is achieved through a modular design that separates the scene graph representation from the rendering backend. We will explain how the modules in this pipeline are designed and give insights to implementation details, which leverage GPU''s compute capabilities for scene graph processing. Our modules cover topics such as shader generation for improved parameter management, synchronizing updates between scenegraph and rendering backend, as well as efficient data structures inside the renderer.
Video here: http://on-demand.gputechconf.com/gtc/2013/video/S3032-Advanced-Scenegraph-Rendering-Pipeline.mp4
This session presents a detailed programmer oriented overview of our SPU based shading system implemented in DICE's Frostbite 2 engine and how it enables more visually rich environments in BATTLEFIELD 3 and better performance over traditional GPU-only based renderers. We explain in detail how our SPU Tile-based deferred shading system is implemented, and how it supports rich material variety, High Dynamic Range Lighting, and large amounts of light sources of different types through an extensive set of culling, occlusion and optimization techniques.
Optimizing the Graphics Pipeline with Compute, GDC 2016Graham Wihlidal
With further advancement in the current console cycle, new tricks are being learned to squeeze the maximum performance out of the hardware. This talk will present how the compute power of the console and PC GPUs can be used to improve the triangle throughput beyond the limits of the fixed function hardware. The discussed method shows a way to perform efficient "just-in-time" optimization of geometry, and opens the way for per-primitive filtering kernels and procedural geometry processing.
Takeaway:
Attendees will learn how to preprocess geometry on-the-fly per frame to improve rendering performance and efficiency.
Intended Audience:
This presentation is targeting seasoned graphics developers. Experience with DirectX 12 and GCN is recommended, but not required.
Game engines have long been in the forefront of taking advantage of the ever increasing parallel compute power of both CPUs and GPUs. This talk is about how the parallel compute is utilized in practice on multiple platforms today in the Frostbite game engine and how we think the parallel programming models, hardware and software in the industry should look like in the next 5 years to help us make the best games possible
NVIDIA OpenGL and Vulkan Support for 2017Mark Kilgard
Learn how NVIDIA continues improving both Vulkan and OpenGL for cross-platform graphics and compute development. This high-level talk is intended for anyone wanting to understand the state of Vulkan and OpenGL in 2017 on NVIDIA GPUs. For OpenGL, the latest standard update maintains the compatibility and feature-richness you expect. For Vulkan, NVIDIA has enabled the latest NVIDIA GPU hardware features and now provides explicit support for multiple GPUs. And for either API, NVIDIA's SDKs and Nsight tools help you develop and debug your application faster.
NVIDIA booth theater presentation at SIGGRAPH in Los Angeles, August 1, 2017.
http://www.nvidia.com/object/siggraph2017-schedule.html?id=sig1732
Get your SIGGRAPH driver release with OpenGL 4.6 and the latest Vulkan functionality from
https://developer.nvidia.com/opengl-driver
Talk by Graham Wihlidal (Frostbite Labs) at GDC 2017.
Checkerboard rendering is a relatively new technique, popularized recently by the introduction of the PlayStation 4 Pro. Many modern game engines are adding support for it right now, and in this talk, Graham will present an in-depth look at the new implementation in Frostbite, which is used in shipping titles like 'Battlefield 1' and 'Mass Effect Andromeda'. Despite being conceptually simple, checkerboard rendering requires a deep integration into the post-processing chain, in particular temporal anti-aliasing, dynamic resolution scaling, and poses various challenges to existing effects. This presentation will cover the basics of checkerboard rendering, explain the impact on a game engine that powers a wide range of titles, and provide a detailed look at how the current implementation in Frostbite works, including topics like object id, alpha unrolling, gradient adjust, and a highly efficient depth resolve.
A description of the next-gen rendering technique called Triangle Visibility Buffer. It offers up to 10x - 20x geometry compared to Deferred rendering and much higher resolution. Generally it aligns better with memory access patterns in modern GPUs compared to Deferred Lighting like Clustered Deferred Lighting etc.
Bindless Deferred Decals in The Surge 2Philip Hammer
These are the slides for my talk at Digital Dragons 2019 in Krakow.
Update: The recordings are online on youtube now:
https://www.youtube.com/watch?v=e2wPMqWETj8
Rendering Technologies from Crysis 3 (GDC 2013)Tiago Sousa
This talk covers changes in CryENGINE 3 technology during 2012, with DX11 related topics such as moving to deferred rendering while maintaining backward compatibility on a multiplatform engine, massive vegetation rendering, MSAA support and how to deal with its common visual artifacts, among other topics.
Secrets of CryENGINE 3 Graphics TechnologyTiago Sousa
In this talk, the authors will describe an overview of a different method for deferred lighting approach used in CryENGINE 3, along with an in-depth description of the many techniques used. Original file and videos at http://crytek.com/cryengine/presentations
Video replay: http://nvidia.fullviewmedia.com/siggraph2012/ondemand/SS104.html
Date: Wednesday, August 8, 2012
Time: 11:50 AM - 12:50 PM
Location: SIGGRAPH 2012, Los Angeles
Attend this session to get the most out of OpenGL on NVIDIA Quadro and GeForce GPUs. Learn about the new features in OpenGL 4.3, particularly Compute Shaders. Other topics include bindless graphics; Linux improvements; and how to best use the modern OpenGL graphics pipeline. Learn how your application can benefit from NVIDIA's leadership driving OpenGL as a cross-platform, open industry standard.
Get OpenGL 4.3 beta drivers for NVIDIA GPUs from http://www.nvidia.com/content/devzone/opengl-driver-4.3.html
Presented September 30, 2009 in San Jose, California at GPU Technology Conference.
Describes the new features of OpenGL 3.2 and NVIDIA's extensions beyond 3.2 such as bindless graphics, direct state access, separate shader objects, copy image, texture barrier, and Cg 2.2.
Bill explains some of the ways that the Vertex Shader can be used to improve performance by taking a fast path through the Vertex Shader rather than generating vertices with other parts of the pipeline in this AMD technology presentation from the 2014 Game Developers Conference in San Francisco March 17-21. Check out more technical presentations at http://developer.amd.com/resources/documentation-articles/conference-presentations/
The goal of this session is to demonstrate techniques that improve GPU scalability when rendering complex scenes. This is achieved through a modular design that separates the scene graph representation from the rendering backend. We will explain how the modules in this pipeline are designed and give insights to implementation details, which leverage GPU''s compute capabilities for scene graph processing. Our modules cover topics such as shader generation for improved parameter management, synchronizing updates between scenegraph and rendering backend, as well as efficient data structures inside the renderer.
Video here: http://on-demand.gputechconf.com/gtc/2013/video/S3032-Advanced-Scenegraph-Rendering-Pipeline.mp4
This session presents a detailed programmer oriented overview of our SPU based shading system implemented in DICE's Frostbite 2 engine and how it enables more visually rich environments in BATTLEFIELD 3 and better performance over traditional GPU-only based renderers. We explain in detail how our SPU Tile-based deferred shading system is implemented, and how it supports rich material variety, High Dynamic Range Lighting, and large amounts of light sources of different types through an extensive set of culling, occlusion and optimization techniques.
Optimizing the Graphics Pipeline with Compute, GDC 2016Graham Wihlidal
With further advancement in the current console cycle, new tricks are being learned to squeeze the maximum performance out of the hardware. This talk will present how the compute power of the console and PC GPUs can be used to improve the triangle throughput beyond the limits of the fixed function hardware. The discussed method shows a way to perform efficient "just-in-time" optimization of geometry, and opens the way for per-primitive filtering kernels and procedural geometry processing.
Takeaway:
Attendees will learn how to preprocess geometry on-the-fly per frame to improve rendering performance and efficiency.
Intended Audience:
This presentation is targeting seasoned graphics developers. Experience with DirectX 12 and GCN is recommended, but not required.
Game engines have long been in the forefront of taking advantage of the ever increasing parallel compute power of both CPUs and GPUs. This talk is about how the parallel compute is utilized in practice on multiple platforms today in the Frostbite game engine and how we think the parallel programming models, hardware and software in the industry should look like in the next 5 years to help us make the best games possible
NVIDIA OpenGL and Vulkan Support for 2017Mark Kilgard
Learn how NVIDIA continues improving both Vulkan and OpenGL for cross-platform graphics and compute development. This high-level talk is intended for anyone wanting to understand the state of Vulkan and OpenGL in 2017 on NVIDIA GPUs. For OpenGL, the latest standard update maintains the compatibility and feature-richness you expect. For Vulkan, NVIDIA has enabled the latest NVIDIA GPU hardware features and now provides explicit support for multiple GPUs. And for either API, NVIDIA's SDKs and Nsight tools help you develop and debug your application faster.
NVIDIA booth theater presentation at SIGGRAPH in Los Angeles, August 1, 2017.
http://www.nvidia.com/object/siggraph2017-schedule.html?id=sig1732
Get your SIGGRAPH driver release with OpenGL 4.6 and the latest Vulkan functionality from
https://developer.nvidia.com/opengl-driver
Talk by Graham Wihlidal (Frostbite Labs) at GDC 2017.
Checkerboard rendering is a relatively new technique, popularized recently by the introduction of the PlayStation 4 Pro. Many modern game engines are adding support for it right now, and in this talk, Graham will present an in-depth look at the new implementation in Frostbite, which is used in shipping titles like 'Battlefield 1' and 'Mass Effect Andromeda'. Despite being conceptually simple, checkerboard rendering requires a deep integration into the post-processing chain, in particular temporal anti-aliasing, dynamic resolution scaling, and poses various challenges to existing effects. This presentation will cover the basics of checkerboard rendering, explain the impact on a game engine that powers a wide range of titles, and provide a detailed look at how the current implementation in Frostbite works, including topics like object id, alpha unrolling, gradient adjust, and a highly efficient depth resolve.
A description of the next-gen rendering technique called Triangle Visibility Buffer. It offers up to 10x - 20x geometry compared to Deferred rendering and much higher resolution. Generally it aligns better with memory access patterns in modern GPUs compared to Deferred Lighting like Clustered Deferred Lighting etc.
Bindless Deferred Decals in The Surge 2Philip Hammer
These are the slides for my talk at Digital Dragons 2019 in Krakow.
Update: The recordings are online on youtube now:
https://www.youtube.com/watch?v=e2wPMqWETj8
Rendering Technologies from Crysis 3 (GDC 2013)Tiago Sousa
This talk covers changes in CryENGINE 3 technology during 2012, with DX11 related topics such as moving to deferred rendering while maintaining backward compatibility on a multiplatform engine, massive vegetation rendering, MSAA support and how to deal with its common visual artifacts, among other topics.
Secrets of CryENGINE 3 Graphics TechnologyTiago Sousa
In this talk, the authors will describe an overview of a different method for deferred lighting approach used in CryENGINE 3, along with an in-depth description of the many techniques used. Original file and videos at http://crytek.com/cryengine/presentations
Video replay: http://nvidia.fullviewmedia.com/siggraph2012/ondemand/SS104.html
Date: Wednesday, August 8, 2012
Time: 11:50 AM - 12:50 PM
Location: SIGGRAPH 2012, Los Angeles
Attend this session to get the most out of OpenGL on NVIDIA Quadro and GeForce GPUs. Learn about the new features in OpenGL 4.3, particularly Compute Shaders. Other topics include bindless graphics; Linux improvements; and how to best use the modern OpenGL graphics pipeline. Learn how your application can benefit from NVIDIA's leadership driving OpenGL as a cross-platform, open industry standard.
Get OpenGL 4.3 beta drivers for NVIDIA GPUs from http://www.nvidia.com/content/devzone/opengl-driver-4.3.html
Getting Started with NV_path_renderingMark Kilgard
In this tutorial, you will learn how to GPU-accelerate path rendering within your OpenGL program with NVIDIA's NV_path_rendering extension. This tutorial assumes you are familiar with OpenGL programming in general and how to use OpenGL extensions.
Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...Mark Kilgard
Slides for SIGGRAPH paper presentation of "Accelerating Vector Graphics Rendering using the Graphics Hardware Pipeline".
Presented by Vineet Batra (Adobe) on Thursday, August 13, 2015 at 2:00 pm - 3:30 pm, Los Angeles Convention Center, Room 150/151.
SIGGRAPH Asia 2012 Exhibitor Talk: OpenGL 4.3 and BeyondMark Kilgard
Location: Conference Hall K, Singapore EXPO
Date: Thursday, November 29, 2012
Time: 11:00 AM - 11:50 PM
Presenter: Mark Kilgard (Principal Software Engineer, NVIDIA, Austin, Texas)
Abstract: Attend this session to get the most out of OpenGL on NVIDIA Quadro and GeForce GPUs. Learn about the new features in OpenGL 4.3, particularly Compute Shaders. Other topics include bindless graphics; Linux improvements; and how to best use the modern OpenGL graphics pipeline. Learn how your application can benefit from NVIDIA's leadership driving OpenGL as a cross-platform, open industry standard.
Topic Areas: Computer Graphics; Development Tools & Libraries; Visualization; Image and Video Processing
Level: Intermediate
Siggraph 2016 - Vulkan and nvidia : the essentialsTristan Lorach
This presentation introduces Vulkan components, what you must know to start using this new API. And what you must know when using it on NVIDIA hardware
Porting the Source Engine to Linux: Valve's Lessons Learnedbasisspace
These slides discuss the techniques applied to porting a large, commercial AAA engine from Windows to Linux. It includes the lessons learned along the way, and pitfalls we ran into to help serve as a warning to other developers.
State of GeoServer provides an update on our community and reviews the new and noteworthy features for the Project. The community keeps an aggressive six month release cycle with GeoServer 2.8 and 2.9 being released this year.
Each releases bring together exciting new features. This year a lot of work has been done on the user interface, clustering, security and compatibility with the latest Java platform. We will also take a look at community research into vector tiles, multi-resolution raster support and more.
Attend this talk for a cheerful update on what is happening with this popular OSGeo project. Whether you are an expert user, a developer, or simply curious what these projects can do for you, this talk is for you.
A long-time implementer of OpenGL (Mark Kilgard, NVIDIA) and the system's original architect (Kurt Akeley, Microsoft) explain OpenGL's design and evolution. OpenGL's state machine is now a complex data-flow with multiple programmable stages. OpenGL practitioners can expect candid design explanations, advice for programming modern GPUs, and insight into OpenGL's future.
These slides were presented at SIGGRAPH Asia 2008 for the "Modern OpenGL: Its Design and Evolution" course.
Course abstract: OpenGL was conceived in 1991 to provide an industry standard for programming the hardware graphics pipeline. The original design has evolved considerably over the last 17 years. Whereas capabilities mandated by OpenGL such as texture mapping and a stencil buffer were present only on the world's most expensive graphics hardware back in 1991, now these features are completely pervasive in PCs and now even available in several hand-held devices. Over that time, OpenGL's original fixed-function state machine has evolved into a complex data-flow including several application-programmable stages. And the performance of OpenGL has increased from 100x to over 1,000x in many important raw graphics operations.
In this course, a long-time implementer of OpenGL and the system's original architect explain OpenGL's design and evolution.
You will learn how the modern (post-2006) graphics hardware pipeline is exposed through OpenGL. You will hear Kurt Akeley's personal retrospective on OpenGL's development. You will learn nine ways to write better OpenGL programs. You will learn how modern OpenGL implementations operate. Finally we discuss OpenGL's future evolution.
Whether you program with OpenGL or program with another API such as Direct3D, this course will give you new insights into graphics hardware architecture, programmable shading, and how to best take advantage of modern GPUs.
While the Language Server Protocol (LSP) has quickly become an industry standard in the devtools domain and Eclipse IDE promptly got support for it with the Eclipse LSP4J and LSP4E projects, LSP is only targetting the code edition activity. However, code edition is just one activity amongst others for a developer, and some would argue that it's not the main use-case that justifies usage of an IDE over a simple text editor.
One of the most important activity (where IDEs are usually better than other tools) for a developer is debugging: watching a program run, digging hints of what could be wrong, experimenting things against the running application... Similarly to the Language Server Protocol, as part of Visual Studio Code, a JSON-based language and tool agnostic protocol was created to support typical debug interactions and facilitate the binding of a devtool with a debugger. Eclipse LSP4J and LSP4E enabled in early 2018 support for this Debug Adapter Protocol in Eclipse IDE.
Eclipse aCute, providing an IDE for C# and .NET Core, managed to use this Debug Adapter Protocol and the existing integrations with Eclipse IDE to relatively easily and quickly integrate support for netcoredbg, an open-source debugger for .NET Core.
In this presentation, we'll explain how the Debug Adapter Protocol works, how LSP4J can be used to support it in any Java application (either a debugger or a client), how LSP4E can be used to support it out-of-the-box in Eclipse IDE, and we'll use aCute example to show how plugin providers can extend LSP4E and provide the final steps of a good and simple UX.
Computer Graphics - Lecture 01 - 3D Programming I💻 Anton Gerdelan
Slides from when I was teaching CS4052 Computer Graphics at Trinity College Dublin in Ireland.
These slides aren't used any more so they may as well be available to the public!
There are some mistakes in the slides, I'll try to comment below these.
This is the second lecture, and introduces programming with OpenGL 4 and shaders.
D11: a high-performance, protocol-optional, transport-optional, window system...Mark Kilgard
Consider the dual pressures toward a more tightly integrated workstation window system: 1) the need to efficiently handle high bandwidth services such as video, audio, and three-dimensional graphics; and 2) the desire to achieve the under-realized potential for local window system performance in X11.
This paper proposes a new window system architecture called D11 that seeks higher performance while preserving compatibility with the industry-standard X11 window system. D11 reinvents the X11 client/server architecture using a new operating system facility similar in concept to the Unix kernel's traditional implementation but designed for user-level execution. This new architecture allows local D11 programs to execute within the D11 window system kernel without compromising the window sytem's integrity. This scheme minimizes context switching, eliminates protocol packing and unpacking, and greatly reduces data copying. D11 programs fall back to the X11 protocol when running remote or connecting to an X11 server. A special D11 program acts as an X11 protocol translator to allow X11 programs to utilize a D11 window system.
[The described system was never implemented.]
EXT_window_rectangles extends OpenGL with a new per-fragment test called the "window rectangles test" for use with FBOs that provides 8 or more inclusive or exclusive rectangles for rasterized fragments. Applications of this functionality include web browsers and virtual reality.
Accelerating Vector Graphics Rendering using the Graphics Hardware PipelineMark Kilgard
SIGGRAPH 2015 paper.
We describe our successful initiative to accelerate Adobe Illustrator with the graphics hardware pipeline of modern GPUs. Relying on OpenGL 4.4 plus recent OpenGL extensions for advanced blend modes and first-class GPU-accelerated path rendering, we accelerate the Adobe Graphics Model (AGM) layer responsible for rendering sophisticated Illustrator scenes. Illustrator documents render in either an RGB or CMYK color mode. While GPUs are designed and optimized for RGB rendering, we orchestrate OpenGL rendering of vector content in the proper CMYK color space and accommodate the 5+ color components required. We support both non-isolated and isolated transparency groups, knockout, patterns, and arbitrary path clipping. We harness GPU tessellation to shade paths smoothly with gradient meshes. We do all this and render complex Illustrator scenes 2 to 6x faster than CPU rendering at Full HD resolutions; and 5 to 16x faster at Ultra HD resolutions.
NV_path_rendering is an OpenGL extension for GPU-accelerated path rendering. Recent functionality improvements provide better performance, better typography, rounded rectangles, conics, and OpenGL ES support. This functionality is available today with NVIDIA's 337.88 drivers.
The latest NV_path_rendering specification documents these new functional improvements:
https://www.opengl.org/registry/specs/NV/path_rendering.txt
You can find sample code here:
https://github.com/markkilgard/NVprSDK
SIGGRAPH Asia 2012: GPU-accelerated Path RenderingMark Kilgard
Presented at SIGGRAPH Asia 2012 in Singapore on Friday, 30 November 14:15 - 16:00 during the "Points and Vectors" session.
Find the paper at http://developer.nvidia.com/game/gpu-accelerated-path-rendering or on Slideshare.
For thirty years, resolution-independent 2D standards (e.g. PostScript, SVG) have relied largely on CPU-based algorithms for the filling and stroking of paths. Learn about our approach to accelerate path rendering with our GPU-based "Stencil, then Cover" programming interface. We've built and productized our OpenGL-based system.
Preprint for SIGGRAPH Asia 2012
Copyright ACM, 2012
For thirty years, resolution-independent 2D standards (e.g. PostScript,
SVG) have depended on CPU-based algorithms for the filling and
stroking of paths. However advances in graphics hardware have largely
ignored the problem of accelerating resolution-independent 2D graphics
rendered from paths.
Our work builds on prior work to re-factor the path rendering task
to leverage existing capabilities of modern pipelined and massively
parallel GPUs. We introduce a two-step “Stencil, then Cover” (StC)
paradigm that explicitly decouples path rendering into one GPU
step to determine a path’s filled or stenciled coverage and a second
step to rasterize conservative geometry intended to test and reset the
coverage determinations of the first step while shading color samples
within the path. Our goals are completeness, correctness, quality, and
performance—but we go further to unify path rendering with OpenGL’s
established 3D rendering pipeline. We have built and productized our
approach to accelerate path rendering as an OpenGL extension.
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingMark Kilgard
Video replay: http://nvidia.fullviewmedia.com/siggraph2012/ondemand/SS106.html
Location: West Hall Meeting Room 503, Los Angeles Convention Center
Date: Wednesday, August 8, 2012
Time: 2:40 PM – 3:40 PM
The future of GPU-based visual computing integrates the web, resolution-independent 2D graphics, and 3D to maximize interactivity and quality while minimizing consumed power. See what NVIDIA is doing today to accelerate resolution-independent 2D graphics for web content. This presentation explains NVIDIA's unique "stencil, then cover" approach to accelerating path rendering with OpenGL and demonstrates the wide variety of web content that can be accelerated with this approach.
More information: http://developer.nvidia.com/nv-path-rendering
Presented at the GPU Technology Conference 2012 in San Jose, California.
Tuesday, May 15, 2012.
Standards such as Scalable Vector Graphics (SVG), PostScript, TrueType outline fonts, and immersive web content such as Flash depend on a resolution-independent 2D rendering paradigm that GPUs have not traditionally accelerated. This tutorial explains a new opportunity to greatly accelerate vector graphics, path rendering, and immersive web standards using the GPU. By attending, you will learn how to write OpenGL applications that accelerate the full range of path rendering functionality. Not only will you learn how to render sophisticated 2D graphics with OpenGL, you will learn to mix such resolution-independent 2D rendering with 3D rendering and do so at dynamic, real-time rates.
Presented at the GPU Technology Conference 2012 in San Jose, California.
Monday, May 14, 2012.
Attend this session to get the most out of OpenGL on NVIDIA Quadro and GeForce GPUs. Topics covered include the latest advances available for Cg 3.1, the OpenGL Shading Language (GLSL); programmable tessellation; improved support for Direct3D conventions; integration with Direct3D and CUDA resources; bindless graphics; and more. When you utilize the latest OpenGL innovations from NVIDIA in your graphics applications, you benefit from NVIDIA's leadership driving OpenGL as a cross-platform, open industry standard.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Neuro-symbolic is not enough, we need neuro-*semantic*
OpenGL 4.5 Update for NVIDIA GPUs
1. SG4121: OPENGL 4.5 UPDATE FOR
NVIDIA GPUS
Mark Kilgard
Principal System Software Engineer, NVIDIA
Piers Daniell
Senior Graphics Software Engineer, NVIDIA
2. Mark Kilgard
• Principal System Software Engineer
– OpenGL driver and API evolution
– Cg (“C for graphics”) shading language
– GPU-accelerated path rendering
• OpenGL Utility Toolkit (GLUT) implementer
• Author of OpenGL for the X Window System
• Co-author of Cg Tutorial
• Worked on OpenGL for 20+ years
3. Piers Daniell
• Senior Graphics Software Engineer
• NVIDIA’s Khronos OpenGL representative
– Since 2010
– Authored numerous OpenGL
extension specifications now core
• Leads OpenGL version updates
– Since OpenGL 4.1
• 10+ years with NVIDIA
5. Single 3D API for Every Platform
OS X
Linux
FreeBSD
Solaris
Android
Windows
6. Adobe Creative Cloud:
GPU-accelerated Illustrator
• 27 year old application
– World’s leading graphics
design application
• 6 million users
– Never used the GPU
• Until this June 2014
• Adobe and NVIDIA worked to
integrate NV_path_rendering
into Illustrator CC 2014
7.
8. OpenGL 4.x Evolution
Major revision of OpenGL every year since OpenGL 3.0, 2008
Maintained full backwards compatibility
2010 2011 2012 2013 2014
OpenGL 4.0: Tessellation
OpenGL 4.1: Shader mix-and-match, ES2 compatibility
OpenGL 4.2: GLSL upgrades and shader image load store
OpenGL 4.3: Compute shaders, SSBO, ES3 compatibility
OpenGL 4.4: Persistently mapped buffers, multi bind
???
9. Big News: OpenGL 4.5 Released Today!
Direct State Access (DSA) finally!
Robustness
OpenGL ES 3.1 compatibility
Faster MakeCurrent
DirectX 11 features for porting and emulation
SubImage variant of GetTexImage
Texture barriers
Sparse buffers (ARB extension)
10. So OpenGL Evolution Through 4.5
Major revision of OpenGL every year since 2008
Maintained full backwards compatibility
2010 2011 2012 2013 2014
OpenGL 4.0: Tessellation
OpenGL 4.1: Shader mix-and-match, ES2 compatibility
OpenGL 4.2: GLSL upgrades and shader image load store
OpenGL 4.3: Compute shaders, SSBO, ES3 compatibility
OpenGL 4.4: Persistently mapped buffers, multi bind
OpenGL 4.5: Direct state access, robustness, ES3.1
11. OpenGL Evolves Modularly
• Each core revision is specified as a set of extensions
– Example: ARB_ES3_1_compatibility
• Puts together all the functionality for ES 3.1 compatibility
• Describe in its own text file
– May have dependencies on other extensions
• Dependencies are stated explicitly
• A core OpenGL revision (such as OpenGL 4.5) “bundles” a set of agreed
extensions — and mandates their mutual support
– Note: implementations can also “unbundle” ARB extensions for hardware unable
to support the latest core revision
• So easiest to describe OpenGL 4.5 based on its bundled extensions…
4.5
ARB_direct_state_access
ARB_clip_control
many more …
12. OpenGL 4.5 as extensions
All new features to OpenGL 4.5 can be used with GL contexts
4.0 through 4.4 via extensions:
— ARB_clip_control
— ARB_conditional_render_inverted
— ARB_cull_distance
— ARB_shader_texture_image_samples
— ARB_ES3_1_compatibility
— ARB_direct_state_access
— KHR_context_flush_control
— ARB_get_texture_subimage
— KHR_robustness
— ARB_texture_barrier
API Compatibility
(Direct3D, OpenGL ES)
API Improvements
Browser security (WebGL)
Texture & framebuffer
memory consistency
13. Additional ARB extensions
Along with OpenGL 4.5, Khronos has released ARB extensions
ARB_sparse_buffer
DirectX 11 features
— ARB_pipeline_statistics_query
— ARB_transform_feedback_overflow_query
NVIDIA supports the above on all OpenGL 4.x hardware
— Fermi, Kepler and Maxwell
— GeForce, Quadro and Tegra K1
14. NVIDIA OpenGL 4.5 beta Driver
Available today!
https://developer.nvidia.com/opengl-driver
— Or just Google “opengl driver” – it’s the first hit!
— Windows and Linux
Supports all OpenGL 4.5 features and all ARB/KHR extensions
Available on Fermi, Kepler and Maxwell GPUs
— GeForce and Quadro
— Desktop and Laptop
15. Using OpenGL 4.5
OpenGL 4.5 has 118 New functions. Eek.
How do you deal with all that? The easy way…
Use the OpenGL Extension Wrangler (GLEW)
— Release 1.11.0 already has OpenGL 4.5 support
— http://glew.sourceforge.net/
16. Direct State Access (DSA)
Read and modify object state directly without bind-to-edit
Performance benefit in many cases
Context binding state unmodified
— Convenient for tools and middleware
— Avoids redundant state changes
Derived from EXT_direct_state_access
27. EXT_direct_state_access Differences
Only OpenGL 4.5 core functionality supported
Some minor name changes to some functions
— Mostly the same, but drops EXT suffix
TextureParameterfEXT -> TextureParameterf
— VAO function names shortened
glVertexArrayVertexBindingDivisorEXT -> glVertexArrayBindingDivisor
— Texture functions no longer require a target parameter
Target comes from glCreateTextures(<target>,)
Use “3D” functions with CUBE_MAP where z specifies the face
DSA functions can no longer create objects
— Use glCreate* functions to create name and object at once
28. Robustness
ARB_robustness functionality now part of OpenGL 4.5
— Called KHR_robustness for use with OpenGL ES too
— Does not include compatibility functions
Adds “safe” APIs for queries that return data to user pointers
Adds mechanism for app to learn about GPU resets
— Due to my app or some other misbehaving app
Stronger out-of-bounds behavior
— No more undefined behavior
Used by WebGL implementations to deal with Denial of
Service (DOS) attacks
29. Robustness API
Before Robustness
GLubyte tooSmall[NOT_BIG_ENOUGH];
glReadPixels(0, 0, H, W, GL_RGBA, GL_UNSIGNED_BYTE, tooSmall);
// CRASH!!
After Robustness
GLubyte tooSmall[NOT_BIG_ENOUGH];
glReadnPixels(0, 0, H, W, GL_RGBA, GL_UNSIGNED_BYTE, sizeof tooSmall, tooSmall);
// No CRASH, glGetError() returns INVALID_OPERATION
30. Robustness Reset Notification
Typical render loop with reset check
while (!quit) {
DrawStuff();
SwapBuffers();
if (glGetGraphicsResetStatus() != GL_NO_ERROR) {
quit = true;
}
}
DestroyContext(glrc);
Reset is asynchronous
— GL will behave as normal after a reset event but rendering commands
may not produce the right results
— The GL context should be destroyed
— Notify the user
31. OpenGL ES 3.1 Compatibility
Adds new ES 3.1 features not already in GL
Also adds #version 310 es GLSL shader support
Compatibility profile required for full superset
— ES 3.1 allows client-side vertex arrays
— Allows application generated object names
— Has default Vertex Array Object (VAO)
Desktop provides great development platform for ES 3.1
content
32. Desktop features in an ES profile
NVIDA GPUs provide all ANDROID_extension_pack_es31a
features in an ES profile
— Geometry, Tessellation, Advanced blending, etc.
Scene from Epic’s “Rivarly” OpenGL ES 3.1 + AEP demo running on Tegra K1
33. Using OpenGL ES 3.1 on Desktop
The Windows WGL way
int attribList[] = {
WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
WGL_CONTEXT_MINOR_VERSION_ARB, 1,
WGL_CONTEXT_PROFILE_MASK_ARB, WGL_CONTEXT_ES_PROFILE_BIT_EXT,
0
};
HGLRC hglrc = wglCreateContextAttribsARB(wglGetCurrentDC(), NULL, attribList);
wglMakeCurrent(wglGetCurrentDC(), hglrc);
On NVIDIA GPUs this is a fully conformant OpenGL ES 3.1
implementation
— http://www.khronos.org/conformance/adopters/conformant-products
34. New OpenGL ES 3.1 features
glMemoryBarrierByRegion
— Like glMemoryBarrier, but potentially more efficient on tillers
GLSL functionality
— imageAtomicExchange() support for float32
— gl_HelperInvocation fragment shader input
Know which pixels won’t get output
Skip useless cycles or unwanted side-effects
— mix() function now supports int, uint and bool
— gl_MaxSamples
Implementation maximum sample count
35. Faster MakeCurrent
An implicit glFlush is called on MakeCurrent
— Makes switching contexts slow
New WGL and GLX extensions allow glFlush to be skipped
— Commands wait in context queue
— App has more control over flush
Provides 2x MakeCurrent performance boost
StartTimer();
for (int i = 0; i < iterations; ++i) {
DrawSimpleTriangle();
wglMakeCurrent(context[i % 2]);
}
StopTimer();
36. Disable Implicit glFlush on MakeCurrent
The Windows way with WGL
int attribList[] = {
WGL_CONTEXT_MAJOR_VERSION_ARB, 4,
WGL_CONTEXT_MINOR_VERSION_ARB, 5,
WGL_CONTEXT_RELEASE_BEHAVIOR_ARB, WGL_CONTEXT_RELEASE_BEHAVIOR_NONE_ARB,
0
};
HGLRC hglrc = wglCreateContextAttribsARB(wglGetCurrentDC(), NULL, attribList);
wglMakeCurrent(wglGetCurrentDC(), hglrc);
38. ARB_clip_control
glClipControl(origin, depthMode);
— y-origin can be flipped during viewport transformation
— Depth clip range can be [0,1] instead of [-1,1]
depthMode = GL_NEGATIVE_ONE_TO_ONE: Zw = ((f-n)/2) * Zd + (n+f)/2
depthMode = GL_ZERO_TO_ONE: Zw = (f-n) * Zd + n
— Provides direct mapping of [0,1] depth clip coordinates to [0,1] depth
buffer values when f=1 and n=0
No precision loss
origin=GL_LOWER_LEFT origin=GL_UPPER_LEFT
39. ARB_conditional_render_inverted
Allow conditional render to use the negated query result
Matches the DX11 ::SetPredication(, PredicateValue) option
Query result negation only happens to landed result
— Otherwise rendering takes place
GLuint predicate;
glCreateQueries(GL_SAMPLES_PASSED, 1, & predicate);
glBeginQuery(GL_SAMPLES_PASSED, predicate);
DrawNothing(); // Draws nothing
glEndQuery(GL_SAMPLES_PASSED);
glBeginConditionalRender(predicate, GL_QUERY_WAIT_INVERTED);
DrawStuff(); // Scene is rendered since SAMPLES_PASSED==0
glEndConditionalRender();
More useful with other query targets like
GL_TRANSFORM_FEEDBACK_OVERFLOW
40. ARB_cull_distance
Adds new gl_CullDistance[n] to Vertex, Tessellation, and
Geometry shaders (VS, TCS, TES and GS)
Like gl_ClipDistance except when any vertex has negative
distance whole primitive is culled
Matches DX11 SV_CullDistance[n]
Clipping
Plane
Negative
gl_ClipDistance
Positive
gl_ClipDistance
Clipped
Clipping
Plane
Negative
gl_CullDistance
Positive
gl_CullDistance
Culled
41. ARB_derivative_control
Adds “coarse” and “fine” variant of GLSL derivative functions
dFdxCoarse, dFdyCoarse
— Potentially faster performance
dFdxFine, dFdyFine
— More correct
— Default behavior of old dFdx and dFdy functions
fwidthCoarse and fwidthFine are also added
2x2 Quad Fragment
dFdxCoarse
=
=
2x2 Quad Fragment
dFdxFine=
= dFdxFine
42. ARB_shader_texture_image_samples
New GLSL built-ins to query the sample count of multi-sample
texture and image resources
— textureSamples
— imageSamples
Equivalent to the NumberOfSamples return with the
GetDimensions query in HLSL
#version 450 core
uniform sample2DMS tex;
out vec4 color;
void main() {
if (textureSamples(tex) > 2) {
color = DoFancyDownsample(tex);
} else {
color = DoSimpleDownsample(tex);
}
}
43. ARB_pipeline_statistics_query
New queries for profiling and DX11 compatibility
— GL_VERTICES_SUBMITTED
Number of vertices submitted to the GL
— GL_PRIMITIVES_SUBMITTED
Number of primitives submitted to the GL
— GL_VERTEX_SHADER_INVOCATIONS
Number of times the vertex shader has been invoked
— GL_TESS_CONTROL_SHADER_PATCHES
Number of patches processed by the tessellation control shader
— GL_TESS_EVALUATION_SHADER_INVOCATIONS
Number of times the tessellation control shader has been invoked
44. ARB_pipeline_statistics_query cont.
More queries
— GL_GEOMETRY_SHADER_INVOCATIONS
Number of times the geometry shader has been invoked
— GL_GEOMETRY_SHEDER_PRIMITIVES_EMITTED
Total number of primitives emitted by geometry shader
— GL_FRAGMENT_SHADER_INVOCATIONS
Number of times the fragment shader has been invoked
— GL_COMPUTE_SHADER_INVOCATIONS
Number of time the compute shader has been invoked
— GL_CLIPPING_INPUT_PRIMITIVES
— GL_CLIPPINT_OUTPUT_PRIMITIVES
Input and output primitives of the clipping stage
45. ARB_transform_feedback_overflow_query
Target queries to indicate Transform Feedback Buffer overflow
— GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB
— GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB
Use glBeginQueryIndex to specify specific stream
The result of which can be used with conditional render
GLuint predicate;
glCreateQueries(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB, 1, & predicate);
glBeginQuery(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB, predicate);
glBeginTransformFeedback(GL_TRIANGLES);
DrawLotsOfStuff();
glEndTransformFeedback();
glEndQuery(GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB);
glBeginConditionalRender(predicate, GL_QUERY_NO_WAIT_INVERTED);
DrawStuff(); // Scene not rendered if XFB overflowed buffers
glEndConditionalRender();
47. Texture Barrier
Allows rendering to a bound texture
— Use glTextureBarrier() to safely read previously written texels
— Behavior is now defined with use of texture barriers
Allows render-to-texture algorithms to ping-pong without
expensive Framebuffer Object (FBO) changes
— Bind 2D texture array for texturing and as a layered FBO attachment
Draw gl_Layer=0
glTextureBarrier()
texture
Draw gl_Layer=1
texture
48. Programmable Blending
Limited form of programmable blending with non-self-
overlapping draw calls
— Bind texture as a render target and for texturing
glBindTexture(GL_TEXTURE_2D, tex);
glFramebufferTexture(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex, 0);
dirtybbox.empty();
foreach (object in scene) {
if (dirtybbox.overlaps(object.bbox())) {
glTextureBarrier();
dirtybbox.empty();
}
object.draw();
dirtybbox = bound(dirtybbox, object.bbox());
}
49. Advanced Blending
KHR_blend_equation_advanced created from
NV_blend_equation_advanced
Supported by NVIDIA since r340 – June, 2014
— GL and ES profiles
Supported natively on Maxwell and Tegra K1 GPUs
— Otherwise implemented seamlessly with shaders on Fermi and Kepler
Implements a subset of NV_blend_equation_advanced modes
Maxwell and Tegra K1 also provide
KHR_blend_equation_advanced_coherent
— Doesn’t require glBlendBarrierKHR between primitives that double-hit
color samples
51. Get Texture Sub Image
Like glGetTexImage, but now you can read a sub-region
glGetTextureSubImage
— DSA only variant
void GetTextureSubImage(uint texture, int level,
int xoffset, int yoffset, int zoffset, sizei width,
sizei height, sizei depth, enum format, enum type,
sizei bufSize, void * pixels);
Direct State Access
Robustness
pixels
yoffset
xoffset
width
height
For GL_TEXTURE_CUBE_MAP targets zoffset specifies face
52. ARB_sparse_buffer
Ability to have large buffer objects without the whole buffer
being resident
— Analogous to ARB_sparse_texture for buffer objects
Application controls page residency
1) Create uncommitted buffer: glBufferStorage(,SPARSE_STORAGE_BIT_ARB)
2) Make pages resident: glBufferPageCommitmentARB(, offset, size, GL_TRUE);
GL_SPARSE_BUFFER_PAGE_SIZE_ARB
offset size
53. Summary of GLSL 450 additions
dFdxFine, dFdxCoarse, dFxyFine, dFdyCoarse
textureSamples, imageSamples
gl_CullDistance[gl_MaxCullDistances];
#version 310 es
imageAtomicExchange on float
gl_HelperInvocation
gl_MaxSamples
mix() on int, uint and bool
54. OpenGL Demos on K1
Shield Tablet
• Tegra K1 runs Android
• Kepler GPU hardware in K1 supports the full OpenGL 4.5
feature set
– Today 4.4, expect 4.5 support
– OpenGL 4.5 is all the new stuff, plus tons of proven features
• Tessellation, compute, instancing
– Plus latest features: bindless, path rendering, blend modes
• Demos use GameWorks framework
– Write Android-ready OpenGL code that runs on Windows and Linux too
60. Mega Geometry with Instancing
glDrawElementsInstanced +
glVertexAttribDivisor
61. Getting GameWorks
• Get Tegra Android Development Pack (TADP)
– All the tools you need for Android development
• Windows or Linux
– Includes GameWorks samples
• Samples also available on Github
https://github.com/NVIDIAGameWorks/OpenGLSamples
62. OpenGL Debug Features
KHR_debug added to OpenGL 4.3
App has access to driver “stderr” message stream
— Via Callback function or
— Query of message queue
Any object can have a meaningful “label”
Driver can tell app about
— Errors
— Performance warnings
— Hazards
— Usage hints
App can insert own events into stream for marking
64. Enable Debug
Can be done on-the-fly
void GLAPIENTRY DebugCallback(GLenum source, GLenum type, GLuint id, GLenum
severity, GLsizei length, const GLchar* message, const void* userParam)
{
printf(“0x%X: %sn", id, message);
}
void DebugDrawTexture()
{
glDebugMessageCallback(DebugCallback, NULL);
glDebugMessageControl(GL_DONT_CARE, GL_DONT_CARE, GL_DONT_CARE, 0, 0, GL_TRUE);
glEnable(GL_DEBUG_OUTPUT);
DrawTexture();
}
The callback function outputs:
0x20084: Texture state usage warning: Texture 1 has no mipmaps, while its min
filter requires mipmap.
Works in non-debug context!
65. Give the texture a name
Instead of “texture 1” – give it a name
void DrawTexture()
{
GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
GLchar texName[] = "Sky";
glObjectLabel(GL_TEXTURE, tex, sizeof texName, texName);
...
}
The callback function outputs:
0x20084: Texture state usage warning: Texture Sky has no mipmaps, while its min
filter requires mipmap.
66. Organize your debug trace
Lots of text can get unwieldy
— What parts of my code does this error apply?
Use synchronous debug output:
— glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
— Effectively disables dual-core driver
— So your callback goes to your calling application thread
— Instead of a driver internal thread
Use groups and markers
— App injects markers to notate debug output
— Push/pop groups to easily control volume
67. Notating debug with groups
Use a group
void DebugDrawTexture()
{
...
GLchar groupName[] = "DrawTexture";
glPushDebugGroup(GL_DEBUG_SOURCE_APPLICATION, 0x1234,
sizeof groupName, groupName);
glDebugOutputControl(...); // Can change volume if needed
DrawTexture();
glPopDebugGroup(); // Old debug volume restored
}
Improved output
0x1234: DrawTexture PUSH
0x20084: Texture state usage warning: Texture Sky has no mipmaps, while its
min filter requires mipmap.
0x1234: DrawTexture POP
69. Nsight: Interactive OpenGL debugging
Frame Debugging and Profiling
Shader Debugging and Pixel History
Frame Debugging and Dynamic Shader Editing
OpenGL API & Hardware Trace
Supports up to OpenGL 4.2 Core
— And a bunch of useful extensions
https://developer.nvidia.com/nvidia-nsight-visual-studio-edition
70. OpenGL related Linux improvements
Support for EGL on desktop Linux within X11 (r331)
OpenGL-based Framebuffer Capture (NvFBC), for remote
graphics (r331)
Support for Quad-Buffered stereo + Composite X extension
(GLX_EXT_stereo_tree) (r337)
Support for G-SYNC (Variable Refresh Rate) (r340)
Support for Tegra K1: NVIDIA SOC with Kepler graphics core
— Linux Tegra K1 (Jetson) support leverages same X driver, OpenGL
implementation as desktop NVIDIA GPUs
— NVIDIA also contributing to Nouveau for K1 support
Coming soon: Framebuffer Object creation dramatically
faster!
71. Beyond OpenGL 4.5 Path Rendering
Path rendering and blend modes
Resolution-independent 2D rendering
Not your classic 3D hardware rendering
Earlier Illustrator demo showed this
NV_path_rendering +
NV_blend_equation_advanced
72. PostScript Tiger with Perspective Warping
No textures!
Paths rendered from
resolution-independent
2D paths (outlines)
77. Beyond OpenGL 4.5
Advanced scene rendering with ARB_multi_draw_indirect
— Added to OpenGL 4.3
Bring even more processing onto the GPU with
NV_bindless_multi_draw_indirect
— Even less work for the CPU – no Vertex Buffer Object (VBO) binds
between draws
Covered in depth by Christoph Kubisch yesterday
— SG4117: OpenGL Scene Rendering Techniques
78. NV_bindless_multi_draw_indirect
DrawIndirect combined with Bindless
struct DrawElementsIndirect {
GLuint count;
GLuint instanceCount;
GLuint firstIndex;
GLint baseVertex;
GLuint baseInstance;
}
struct BindlessPtr {
Gluint index;
Gluint reserved;
GLuint64 address;
GLuint64 length;
}
struct DrawElementsIndirectBindlessCommandNV {
DrawElementsIndirect cmd;
GLuint reserved;
BindlessPtr index;
BindlessPtr vertex[];
}
Change vertex buffers per draw iteration!
Change index buffer per draw iteration!
MultiDrawElementsIndirectBindlessNV(enum mode, enum type, const void *indirect,
sizei drawCount, sizei stride, int vertexBufferCount);
Caveat: Does the CPU know the drawCount?
The GL_BUFFER_GPU_ADDRESS_NV of the buffer object
79. NV_bindless_multi_draw_indirect_count
Source the drawCount from a buffer object
void MultiDrawElementsIndirectBindlessCountNV(
enum mode,
enum type,
const void * indirect,
intptr drawCount,
sizei maxDrawCount,
sizei stride,
int vertexBufferCount
);
drawCount now an offset into the bound GL_PARAMETER_BUFFER_ARB buffer range.
80. Khronos OpenGL BOF at SIGGRAPH
Date: Wednesday, August 13 2014
Venue: Marriott Pinnacle Hotel, next to the Convention
Center
Website: http://s2014.siggraph.org
Times: 5pm-7pm OpenGL and OpenGL ES Track
BOF After-Party: 7:30pm until late
— Rumor: Free beer and door prizes
2009 2 versions OpenGL 3.1 and 3.2
2010 3 versions OpenGL 3.3, 4.0 and 4.1
Can incrementally use new features without breaking your app
What about 2014?
I’ll go through all of these in more detail
This is how the picture looks now… move on.
Still
OpenGL is made up of extensions…
Here are all the extensions – grouped.
Some OpenGL 4.x hardware can’t support these, so they aren’t in 4.5 core spec.
Can I play with it today? Yes you can.
GeForce 400 Series and up.
GLEW with OpenGL 4.5 released today.
First and biggest new feature of OpenGL 4.5 is DSA. Finally.
Some binding is slow, especially things like framebuffer object.
GetInteger may cause a stall – a client/server sync.
The Bind has cost.
Shorter, more concise code.
No longer need to use an active texture selector.
Just easier and nicer programmer.
The new DSA commands don’t create objects.
They only work on created objects.
Convenient way to generate name and create object in one shot.
The new texture functions don’t require a target parameter.
All covered; texture creation, compressed textures, texture update and readback, state sets and gets, texbo and mipmap generation. Can all be done without binding.
New to DSA.
Can also manipulate framebuffer objects. Modify attachments. You can even clear the framebuffer and copy from one to another.
All the buffer objects functions have DSA equivalents.
All functions that operate on a vertex array object are now available in DSA.
You can even modify the element array binding.
Functions that dealt with compatibility like the ARB_image and matrix functions were left out.
EXT_DSA functions can create objects. 4.5 DSA functions can’t.
Accesses to out-of-bounds data is no longer undefined. GL termination is no longer and option.
Writes are discarded and reads contained undefined data.
WEB BROWER
Example of a safe API.
Example of polling for a reset notification.
Provide some user feedback if the GPU got reset.
OpenGL ES 3.1 released in March.
OpenGL 4.5 includes the new functionality.
NVIDIA GPUs support OpenGL ES 3.1 and all the recent extensions.
OpenGL 4.5 is a super-set of OpenGL ES 3.1.
But if you want to only use OpenGL ES 3.1 and the ES extensions, then create an ES profile.Our OpenGL ES 3.1 is fully conformance and listed at Khronos.
We measured a 2x performance boost in MakeCurrent time with this simple implementation.
Disabling implicit flush requires a new context.
DX zero y-coordinate is left left – GL is bottom left.
If you can’t modify the transform matrices in the shader simply flip the y-origin with glClipControl.
Most useful when rendering to a texture when you want to keep the texcoords DX like.
The depth scale when transforming device normalized coordinates to window coordinates is modified from [-1,-1] to [0,1]. Can provide direct mapping of you near/far planes. So there is no potential for numerical loss.
Doesn’t make much sense for SAMPLES_PASSED. But good for some of the new query targets like GL_TRANSFORM_FEEDBACK_OVERFLOW.
Using coarse may take less cycles than fine on some implementations.
In shader query of the number of samples in a texture or image.
New ARB extension.
Compute invocations includes both dispatch and indirect dispatching.
New ARB extension.
REHEARSE
Texture can be bound to both FBO and for texturing. Before this was an error.
Rehearse
Object must not self-intersect.
Keep drawing objects until you hit an overlap.
Issue the texture barrier and continue.
Supported for a couple of months.
Subset of NV_blend_equation_advanced which can be supported by more vendors.
Maxwell and TK1 allow intersecting draws without having to use glBlendBarrier since it’s implemented native.
We expose the _coherent extension for this.
Not sure how this escaped GL for so long.
Now you can finally read a sub image of a texture without having to read the whole thing.
Sparse buffer, similar to sparse texture.
Not whole buffer has to be resident.
Writes to uncommitted pages are discarded and reads return undefined data.
Flash slide. For your reference.
Mark
How do you debug GL in a platform agnostic way? KHR_debug
Shipped for a couple of years now, but some folks haven’t heard of this. Got a question yesterday that can be answered by KHR_debug.
All kinds of useful data. Like buffer placement and eviction, performance warnings.
THIS ONE ANIMATES. – What’s wrong with this code?
Uses compatibility is not the right answer!
You don’t need a debug context to do this.
Hard to tie debug output with the app.
Use labels, markers and groups.
Step frame by frame. Inspect pixels – where did the come from, what shaders were involved.
Breakpoints and inspection of shaders. Interactive debugging.
Linux improvements over the last year.
EGL, Framebuffer capture, quad-buffered stereo, G-Sync monitors and Linux for Tegra K1.
Order your Jetson development kit today!
Finally one last feature we released recently.
First – we already support this.
Code on next page.
What if you don’t know the drawCount – doing culling in the GPU for example?