1. The document discusses NVIDIA's NV_command_list OpenGL extension, which aims to reduce driver overhead by batching OpenGL commands and state changes into a command list.
2. It provides examples showing how command lists can reduce the number of draw calls and CPU workload for rendering complex 3D scenes with thousands of draw calls. Benchmark results show a large performance improvement over traditional OpenGL rendering.
3. Key concepts of the command list approach include using bindless resources accessed via pointers, state objects to encapsulate OpenGL state, and encoding rendering commands and state changes into a tokenized command buffer for efficient GPU execution.
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadTristan Lorach
This presentation introduces a new NVIDIA extension called Command-list.
The purpose of this presentation is to explain the basic concepts on how to use it and show what are the benefits.
The sample I used for the talk is here: https://github.com/nvpro-samples/gl_commandlist_bk3d_models
The driver for trying should be PreRelease 347.09
http://www.nvidia.com/download/driverResults.aspx/80913/en-us
OpenGL 4.4 provides new features for accelerating scenes with many objects, which are typically found in professional visualization markets. This talk will provide details on the usage of the features and their effect on real-life models. Furthermore we will showcase how more work for rendering a scene can be off-loaded to the GPU, such as efficient occlusion culling or matrix calculations.
Video presentation here: http://on-demand.gputechconf.com/gtc/2014/video/S4379-opengl-44-scene-rendering-techniques.mp4
Presentation of NvFX: an effect layer that allows encapsulation of GLSL and/or D3D shading language.
The basic concept follows the footprints of NVIDIA CgFX
https://github.com/tlorach/nvFX
presented at SIGGRAPH 2014 in Vancouver during NVIDIA's "Best of GTC" sponsored sessions
http://www.nvidia.com/object/siggraph2014-best-gtc.html
Watch the replay that includes a demo of GPU-accelerated Illustrator and several OpenGL 4 demos running on NVIDIA's Tegra Shield tablet.
http://www.ustream.tv/recorded/51255959
Find out more about the OpenGL examples for GameWorks:
https://developer.nvidia.com/gameworks-opengl-samples
Efficient occlusion culling in dynamic scenes is a very important topic to the game and real-time graphics community in order to accelerate rendering. We present a novel algorithm inspired by recent advances in depth culling for graphics hardware, but adapted and optimized for SIMD-capable CPUs. Our algorithm has very low memory overhead and is three times faster than previous work, while culling 98% of all triangles by a full resolution depth buffer approach. It supports interleaving occluder rasterization and occlusion queries without penalty, making it easy
Ever wondered how to use modern OpenGL in a way that radically reduces driver overhead? Then this talk is for you.
John McDonald and Cass Everitt gave this talk at Steam Dev Days in Seattle on Jan 16, 2014.
Bill explains some of the ways that the Vertex Shader can be used to improve performance by taking a fast path through the Vertex Shader rather than generating vertices with other parts of the pipeline in this AMD technology presentation from the 2014 Game Developers Conference in San Francisco March 17-21. Check out more technical presentations at http://developer.amd.com/resources/documentation-articles/conference-presentations/
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadTristan Lorach
This presentation introduces a new NVIDIA extension called Command-list.
The purpose of this presentation is to explain the basic concepts on how to use it and show what are the benefits.
The sample I used for the talk is here: https://github.com/nvpro-samples/gl_commandlist_bk3d_models
The driver for trying should be PreRelease 347.09
http://www.nvidia.com/download/driverResults.aspx/80913/en-us
OpenGL 4.4 provides new features for accelerating scenes with many objects, which are typically found in professional visualization markets. This talk will provide details on the usage of the features and their effect on real-life models. Furthermore we will showcase how more work for rendering a scene can be off-loaded to the GPU, such as efficient occlusion culling or matrix calculations.
Video presentation here: http://on-demand.gputechconf.com/gtc/2014/video/S4379-opengl-44-scene-rendering-techniques.mp4
Presentation of NvFX: an effect layer that allows encapsulation of GLSL and/or D3D shading language.
The basic concept follows the footprints of NVIDIA CgFX
https://github.com/tlorach/nvFX
presented at SIGGRAPH 2014 in Vancouver during NVIDIA's "Best of GTC" sponsored sessions
http://www.nvidia.com/object/siggraph2014-best-gtc.html
Watch the replay that includes a demo of GPU-accelerated Illustrator and several OpenGL 4 demos running on NVIDIA's Tegra Shield tablet.
http://www.ustream.tv/recorded/51255959
Find out more about the OpenGL examples for GameWorks:
https://developer.nvidia.com/gameworks-opengl-samples
Efficient occlusion culling in dynamic scenes is a very important topic to the game and real-time graphics community in order to accelerate rendering. We present a novel algorithm inspired by recent advances in depth culling for graphics hardware, but adapted and optimized for SIMD-capable CPUs. Our algorithm has very low memory overhead and is three times faster than previous work, while culling 98% of all triangles by a full resolution depth buffer approach. It supports interleaving occluder rasterization and occlusion queries without penalty, making it easy
Ever wondered how to use modern OpenGL in a way that radically reduces driver overhead? Then this talk is for you.
John McDonald and Cass Everitt gave this talk at Steam Dev Days in Seattle on Jan 16, 2014.
Bill explains some of the ways that the Vertex Shader can be used to improve performance by taking a fast path through the Vertex Shader rather than generating vertices with other parts of the pipeline in this AMD technology presentation from the 2014 Game Developers Conference in San Francisco March 17-21. Check out more technical presentations at http://developer.amd.com/resources/documentation-articles/conference-presentations/
Presented as a pre-conference tutorial at the GPU Technology Conference in San Jose on September 20, 2010.
Learn about NVIDIA's OpenGL 4.1 functionality available now on Fermi-based GPUs.
GDC talk on how to render Per-Face Texture Mapping (PTEX) datasets on commodity GPUs in Realtime in a much simpler fashion than the earlier proposed techniques. This method is heavily suitable for real time consumption, and can be altered to support other texture techniques.
OTOY founder and CEO, Jules Urbach, announces a colossal update to the OctaneRender ecosystem, including the pricing and availability of its highly anticipated OctaneRender 3 software and OctaneRender Cloud rendering service, and a detailed roadmap outlining the future of Octane’s development towards a 4.0 release in 2017 with full integration of OTOY’s advanced real-time path tracing engine, Brigade.
The goal of this session is to demonstrate techniques that improve GPU scalability when rendering complex scenes. This is achieved through a modular design that separates the scene graph representation from the rendering backend. We will explain how the modules in this pipeline are designed and give insights to implementation details, which leverage GPU''s compute capabilities for scene graph processing. Our modules cover topics such as shader generation for improved parameter management, synchronizing updates between scenegraph and rendering backend, as well as efficient data structures inside the renderer.
Video here: http://on-demand.gputechconf.com/gtc/2013/video/S3032-Advanced-Scenegraph-Rendering-Pipeline.mp4
Realtime Per Face Texture Mapping (PTEX)basisspace
This presentation shows the original method for implementing Per-Face Texture Mapping (PTEX) in real-time on commodity hardware. PTEX is used throughout the film industry to handle texture seams robustly while simultaneously easing artist workflow.
Game engines have long been in the forefront of taking advantage of the ever increasing parallel compute power of both CPUs and GPUs. This talk is about how the parallel compute is utilized in practice on multiple platforms today in the Frostbite game engine and how we think the parallel programming models, hardware and software in the industry should look like in the next 5 years to help us make the best games possible
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...AMD Developer Central
Presentation WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by Leo Meyerovich and Matthew Torok at the AMD Developer Summit (APU13) Nov. 11-13, 2013.
Your Game Needs Direct3D 11, So Get Started Now!Johan Andersson
Direct3D 11 will have tessellation for smoother curves and finer details. The new compute shader will make postprocessing faster and easier. You'll need Direct3D 11 to have the best graphics, and this talk will show you how you can get started using current generation hardware.
A set of mobile game optimization best practices. This presentation extensively covers PowerVR series of GPUs from Imagination Technologies and iOS, however the majority of recommendations can be applied to other GPUs and mobile operating systems.
Siggraph 2016 - Vulkan and nvidia : the essentialsTristan Lorach
This presentation introduces Vulkan components, what you must know to start using this new API. And what you must know when using it on NVIDIA hardware
Presented as a pre-conference tutorial at the GPU Technology Conference in San Jose on September 20, 2010.
Learn about NVIDIA's OpenGL 4.1 functionality available now on Fermi-based GPUs.
GDC talk on how to render Per-Face Texture Mapping (PTEX) datasets on commodity GPUs in Realtime in a much simpler fashion than the earlier proposed techniques. This method is heavily suitable for real time consumption, and can be altered to support other texture techniques.
OTOY founder and CEO, Jules Urbach, announces a colossal update to the OctaneRender ecosystem, including the pricing and availability of its highly anticipated OctaneRender 3 software and OctaneRender Cloud rendering service, and a detailed roadmap outlining the future of Octane’s development towards a 4.0 release in 2017 with full integration of OTOY’s advanced real-time path tracing engine, Brigade.
The goal of this session is to demonstrate techniques that improve GPU scalability when rendering complex scenes. This is achieved through a modular design that separates the scene graph representation from the rendering backend. We will explain how the modules in this pipeline are designed and give insights to implementation details, which leverage GPU''s compute capabilities for scene graph processing. Our modules cover topics such as shader generation for improved parameter management, synchronizing updates between scenegraph and rendering backend, as well as efficient data structures inside the renderer.
Video here: http://on-demand.gputechconf.com/gtc/2013/video/S3032-Advanced-Scenegraph-Rendering-Pipeline.mp4
Realtime Per Face Texture Mapping (PTEX)basisspace
This presentation shows the original method for implementing Per-Face Texture Mapping (PTEX) in real-time on commodity hardware. PTEX is used throughout the film industry to handle texture seams robustly while simultaneously easing artist workflow.
Game engines have long been in the forefront of taking advantage of the ever increasing parallel compute power of both CPUs and GPUs. This talk is about how the parallel compute is utilized in practice on multiple platforms today in the Frostbite game engine and how we think the parallel programming models, hardware and software in the industry should look like in the next 5 years to help us make the best games possible
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...AMD Developer Central
Presentation WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by Leo Meyerovich and Matthew Torok at the AMD Developer Summit (APU13) Nov. 11-13, 2013.
Your Game Needs Direct3D 11, So Get Started Now!Johan Andersson
Direct3D 11 will have tessellation for smoother curves and finer details. The new compute shader will make postprocessing faster and easier. You'll need Direct3D 11 to have the best graphics, and this talk will show you how you can get started using current generation hardware.
A set of mobile game optimization best practices. This presentation extensively covers PowerVR series of GPUs from Imagination Technologies and iOS, however the majority of recommendations can be applied to other GPUs and mobile operating systems.
Siggraph 2016 - Vulkan and nvidia : the essentialsTristan Lorach
This presentation introduces Vulkan components, what you must know to start using this new API. And what you must know when using it on NVIDIA hardware
Talk by Yuriy O’Donnell at GDC 2017.
This talk describes how Frostbite handles rendering architecture challenges that come with having to support a wide variety of games on a single engine. Yuriy describes their new rendering abstraction design, which is based on a graph of all render passes and resources. This approach allows implementation of rendering features in a decoupled and modular way, while still maintaining efficiency.
A graph of all rendering operations for the entire frame is a useful abstraction. The industry can move away from “immediate mode” DX11 style APIs to a higher level system that allows simpler code and efficient GPU utilization. Attendees will learn how it worked out for Frostbite.
GFX Part 1 - Introduction to GPU HW and OpenGL ES specificationsPrabindh Sundareson
Introduction to OpenGL ES and GPU Programming portion of the 7 part session on GFX workshops. Introduces the OpenGL ES specifications from Khronos and provides a perspective of current GPU architectures.
A technical deep dive into the DX11 rendering in Battlefield 3, the first title to use the new Frostbite 2 Engine. Topics covered include DX11 optimization techniques, efficient deferred shading, high-quality rendering and resource streaming for creating large and highly-detailed dynamic environments on modern PCs.
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...Akihiro Hayashi
Third Workshop on Accelerator Programming Using Directives (WACCPD2016, co-located with SC16)
While GPUs are increasingly popular for high-performance
computing, optimizing the performance of GPU programs is a time-consuming and non-trivial process in general. This complexity stems from the low abstraction level of standard
GPU programming models such as CUDA and OpenCL:
programmers are required to orchestrate low-level operations
in order to exploit the full capability of GPUs. In terms of
software productivity and portability, a more attractive approach
would be to facilitate GPU programming by providing high-level
abstractions for expressing parallel algorithms.
OpenMP is a directive-based shared memory parallel programming model and has been widely used for many years.
From OpenMP 4.0 onwards, GPU platforms are supported
by extending OpenMP’s high-level parallel abstractions with
accelerator programming. This extension allows programmers to
write GPU programs in standard C/C++ or Fortran languages,
without exposing too many details of GPU architectures.
However, such high-level parallel programming strategies generally impose additional program optimizations on compilers,
which could result in lower performance than fully hand-tuned
code with low-level programming models.To study potential
performance improvements by compiling and optimizing high-level GPU programs, in this paper, we 1) evaluate a set of
OpenMP 4.x benchmarks on an IBM POWER8 and NVIDIA
Tesla GPU platform and 2) conduct a comparable performance
analysis among hand-written CUDA and automatically-generated
GPU programs by the IBM XL and clang/LLVM compilers.
Presented September 30, 2009 in San Jose, California at GPU Technology Conference.
Describes the new features of OpenGL 3.2 and NVIDIA's extensions beyond 3.2 such as bindless graphics, direct state access, separate shader objects, copy image, texture barrier, and Cg 2.2.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
GridMate - End to end testing is a critical piece to ensure quality and avoid...ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
1. Siggraph Asia 2014
Tristan Lorach
Manager of Devtech for Professional Visualization group
OPENGL NVIDIA "COMMAND-LIST":
"APPROACHING ZERO DRIVER OVERHEAD"
2. 2
Siggraph Asia 2014
GPU Scalability
GPUs are extremely powerful
But only if used correctly
Mix of Application and Graphic API (Driver) responsibility
Increase amount of work per batch (Job)
Minimize CPUGPU interactions
Lower Memory traffic
Lower API calls: Batch things together
Factorize Data
Re-use data uploaded to Video memory (instancing…)
3. 3
Siggraph Asia 2014
Use of the GPU
Games
~1500 to 3000 Drawcalls for a scene
Intensive use of Multi-layer image processing over the scene
Heavy shaders
CAD/FCC/professional applications
Hard to batch user’s works (CAD applications)
~10,000 to… 300,000 Drawcalls for a scene: heavy CPU workload for our driver
Shaders simpler than games (but catching-up these days...)
Post-Processing more and more used (SSAO)
4. 4
Siggraph Asia 2014
Use of the GPU
New Graphic APIs are trying to address these concerns
Khronos (Announced it at Siggraph Vancouver 2014)
Metal
Mantle
DX12
All propose better ways to issue commands and render-states
But can we improve OpenGL to go the same path ?
Yes: NV_command_list
5. 5
Siggraph Asia 2014
OpenGL
OpenGL is ~25 years old (!!)
1.x: fixed function
2.x: Shading language (GLSL)
3.x/4.x: more as a massive compute system
More graphic stages (Geom.Shader; Tessellation)
High flexibility & efficiency on memory accesses
High programability
Design "flaws": still “State Machine”;
ServerClient; poor multi-threading management
1992
2012
6. 6
Siggraph Asia 2014
Challenge of Issuing Commands
Issuing drawcalls and state changes can be a real bottleneck
650,000 Triangles
68,000 Parts
~ 10 Triangles per part
3,700,000 Triangles
98 000 Parts
~ 37 Triangles per part
14,338,275 Triangles/lines
300,528 drawcalls (parts)
~ 48 Triangles per part
App+driver
GPU
GPU
idle
CPU
Excessive
Work from
App & Driver
On CPU
!
!
courtesy of PTC
9. 9
Siggraph Asia 2014
Application
Command-list
Token-buffer
OpenGL
Driver
Big Picture
Vertex Puller (IA)
Vertex Shader
TCS (Tessellation)
TES (Tessellation)
Tessellator
Geometry Shader
Transform Feedback
Rasterization
Fragment Shader
Per-Fragment Ops
Framebuffer
Tr. Feedback buffer
Uniform Block
Texture Fetch
Image Load/Store
Atomic Counter
Shader Storage
Element buffer (EBO)
Draw Indirect Buffer
Vertex Buffer (VBO)
Front-End
(decoder)
Cmd bundles
Push-Buffer
(FIFO)
cmds
FBO resources
(Textures / RB)
OpenGL
Commands
Token-buffers
+ state objects
resources
64 bits
Pointers
(bindless)
State Object
More work for
FE – but fast !
10. 10
Siggraph Asia 2014
Demo – Quadro K5000
CAD Car model
14,338,275 Primitives
300,528 drawcalls
348,862 attribute update
12,004 uniform update
Set of CAD models together
29,344,075 Primitives
26,144 drawcalls
18,371 attribute update
13,632 uniform update
11. 11
Siggraph Asia 2014
DemoS – Quadro K5000
CAD Car model
K5000:
920,363,200 primitives/S
19,290,668 drawcalls/S
12. 12
Siggraph Asia 2014
DemoS – Quadro K5000
Set of CAD models together
K5000
1,211,684,096 primitives/S
1,079,545 drawcalls/S
13. 13
Siggraph Asia 2014
DemoS – Maxwell
CAD Car model
K5000:
920,363,200 primitives/S
19,290,668 drawcalls/S
Maxwell GM204 (GeForce):
1,782,257,920 primitives/S
37,355,848 drawcalls/S
Drawcall time : 8 Micro-
seconds on CPU (!)
Set of CAD models together
K5000
1,211,684,096 primitives/S
1,079,545 drawcalls/S
Maxwell GM204 (GeForce)
3,012,795,392 primitives/S
2,684,239 drawcalls/S
Drawcall time: 42 Micro-
seconds on CPU
!
14. 14
Siggraph Asia 2014
More Performances
Another test: always using bindless UBO and VBO
CPU-Emulation: Token stream executed as CPU "switch-case"
Strategies
Un-grouped: ~ 1 MB, 79,000 Tokens, ~68,000 drawcalls
Material-grouped: ~300 KB, 22,000 Tokens, ~11,000 drawcalls
Strategy Draw time Material-grouped
0.73
HW TOKEN-Buffers 0.56
Technique GPU (ms)
1.4 1.14 x
~0 BIG x
CPU (ms)
Draw time un-grouped
2.4
1.1
GPU (ms)
4.5 1.6 x
~0 BIG x
CPU (ms)
CPU-emulated TOKEN-buffers
glBindBufferRange + drawcalls 0.73 1.6 3.5 7.2
Preliminary results K6000
15. 15
Siggraph Asia 2014
More Performances
5 000 shader changes: toggling between two shaders in „shaded & edges“
5 000 fbo changes: similar as above but with fbo toggle instead of shader
Almost no additional cost compared to rendering without fbo changes
Timing GPU (ms) CPU (ms)
CPU-emulated TOKEN-buffers 12.7 15.1
2.9 4.3 x 0.4 37 xHW TOKEN-buffers
2.8 4.5 x 0.005 BIG xCommand-Lists
Timer GPU (ms) CPU (ms)
CPU-emulated TOKEN-buffers 60.0 60.0
1.8 33 x 0.9 66 xHW TOKEN-buffers
1.7 35 x 0.022 BIG xCommand-Lists
Preliminary results on K5000
16. 16
Siggraph Asia 2014
GPU
Virtual
Memory
Bindless Technology
What is it about?
Work from native GPU pointers/handles (NVIDIA
pioneered this technology)
A lot less CPU work (memory hopping, validation...)
Allow GPU to use flexible data structures
Bindless Buffers
Vertex & Global memory since Tesla Generation (CUDA
capable)
Bindless Textures
Since Kepler
Bindless Constants (UBO)
New driver feature, support for Fermi and above
Bindless plays a central role for Command-List
Vertex Puller (IA)
Vertex Shader
TCS (Tessellation)
TES (Tessellation)
Tessellator
Geometry Shader
Transform Feedback
Rasterization
Fragment Shader
Per-Fragment Ops
Framebuffer
Uniform Block
Texture Fetch
Element buffer (EBO)
Vertex Buffer (VBO)
Front-End
(decoder)
Push buffer
64bitsaddress
Attr.s
Idx
Uniform Block
Send Ptrs
17. 17
Siggraph Asia 2014
#define UBA UNIFORM_BUFFER_ADDRESS_NV
UpdateBuffers();
glEnableClientState(UNIFORM_BUFFER_UNIFIED_NV);
glBufferAddressRangeNV (UBA, 0, addrView, viewSize);
foreach (obj in scene) {
...
// glBindBufferRange (UBO, 1, uboMatrices, obj.matrixOffset, maSize);
glBufferAddressRangeNV(UBA, 1, addrMatrices + obj.matrixOffset, maSize);
foreach ( batch in obj.primitive_group_material) {
// glBindBufferRange (UBO, 2, uboMaterial, batch.materialOffset, mtlSize);
glBufferAddressRangeNV(UBA, 2, addrMaterial + batch.materialOffset, mtlSize);
...
}
}
Example On Using Bindless UBO
New pointer for UBO#1 updated per Object
New pointer for UBO#2 updated per
Primitive group
pointer for UBO#0 updated once for all objects
regular UBO binds are now
ignored!
18. 18
Siggraph Asia 2014
Pointers must be aligned
So do Ptr Offsets
glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT,
&offsetAlignment);
Normally: 256 bytes
gaps between each item
Try to fit as much data as possible…
Not the case if passing an index for
array access (MDI + “Base-instance”)
But requires special GLSL code (indexing)
Bindless And Memory Alignement
GPU
Virtual
Memory
64bitsaddress
Uniform Material 0
Uniform Material 1
Uniform Material 2
…
glBufferAddressRangeNV
(UBA, 2, addrMaterial
+ batch.materialOffset
, mtlSize);
256b
Uniform array materials[]
Material 0
Material 1
Material 2
…
Uniform …
19. 19
Siggraph Asia 2014
NV_command_list Key Concepts
1. Tokenized Rendering:
some state changes and draw commands are encoded into binary data stream
Depends on bindless technology
2. State Objects
Whole OpenGL States (program, blending...) captured as an object
Allows pre-validation of state combinations, later reuse of objects is very fast
3. Command List Object
„Display-list“ paradigm but more flexible: buffer are outside (referenced by
address), so content can still be modified (matrics, vertices...)
20. 20
Siggraph Asia 2014
UpdateBuffers();
glBindBufferBase (UBO, 0, uboView);
foreach (obj in scene) {
// redundancy filter for these (if (used != last)...
glBindVertexBuffer (0, obj.geometry->vbo, 0, vtxSize);
glBindBuffer (ELEMENT, obj.geometry->ibo);
glBindBufferRange (UBO, 1, uboMatrices, obj.matrixOffset, maSize);
// iterate over cached material groups
foreach ( batch in obj.materialGroups) {
glBindBufferRange (UBO, 2, uboMaterial, batch.materialOffset, mtlSize);
glMultiDrawElements (...);
}
}
Typical Scene Drawing Example
~ 2 500 api drawcalls
~11 000 drawcalls
~55 triangles per call
21. 21
Siggraph Asia 2014
UpdateBuffers();
glBindBufferBase (UBO, 0, uboView);
foreach (obj in scene) {
// redundancy filter for these (if (used != last)...
glBindVertexBuffer (0, obj.geometry->vbo, 0, vtxSize);
glBindBuffer (ELEMENT, obj.geometry->ibo);
glBindBufferRange (UBO, 1, uboMatrices, obj.matrixOffset, maSize);
// iterate over all parts individually
foreach ( part in obj.parts) {
if (part.material != lastMaterial){
glBindBufferRange (UBO, 2, uboMaterial, part.materialOffset, mtlSize);
}
glDrawElements (...);
}
}
Typical Scene Drawing Example
~68 000 drawcalls
~10 triangles per call
22. 22
Siggraph Asia 2014
Scene Drawing Converted To Token Buffer
glDrawCommandsNV (GL_TRIANGLES, tokenBuffer
, offsets[], sizes[], count);
// {0}, {bufferSize}, 1
foreach (obj in scene) {
glBufferAddressRangeNV(VERTEX.., 0, obj.geometry->addrVBO, ...);
glBufferAddressRangeNV(ELEMENT..., 0, obj.geometry->addrIBO, ...);
glBufferAddressRangeNV(UNIFORM..., 1, addrMatrices ...);
foreach ( batch in obj.materialGroups) {
glBufferAddressRangeNV(UNIFORM, 2, addrMaterial ...);
glMultiDrawElements(...)
}
}
becomes a single drawcall
replaces 80k calls to GL
(for graphic-card model)
Token buffer
Set Attr#0 on
VBO address …
Set Attr#1 on
VBO address …
Set Elements on
EBO address …
Uniform Matrix
on UBO address …
Uniform Material
on UBO address …
DrawElements
Uniform Material
on UBO address …
DrawElements
Obj#1
Obj#2
…
For all
objects
…
24. 24
Siggraph Asia 2014
Tokenized Rendering
Token Headers are GPU commands, followed by arguments
FRONTFACE_COMMAND_NV
BLEND_COLOR_COMMAND_NV
STENCIL_REF_COMMAND_NV
LINE_WIDTH_COMMAND_NV
POLYGON_OFFSET_COMMAND_NV
ALPHA_REF_COMMAND_NV
VIEWPORT_COMMAND_NV
SCISSOR_COMMAND_NV
ELEMENT_ADDRESS_COMMAND_NV
ATTRIBUTE_ADDRESS_COMMAND_NV
UNIFORM_ADDRESS_COMMAND_NV
DRAW_ELEMENTS_COMMAND_NV
DRAW_ARRAYS_COMMAND_NV
DRAW_ELEMENTS_STRIP_COMMAND_NV
DRAW_ARRAYS_STRIP_COMMAND_NV
DRAW_ELEMENTS_INSTANCED_COMMAND_NV
DRAW_ARRAYS_INSTANCED_COMMAND_NV
TERMINATE_SEQUENCE_COMMAND_NV
NOP_COMMAND_NV
Final list might change
Render States
Index Buffer;
Position Attribute; Normals; Texcoords...
Matrices; Material data...
Draw Commands !
Special Commands (see later)
25. 25
Siggraph Asia 2014
Tokenized Rendering
Example: UBO Address assignment structure
FRONTFACE_COMMAND_NV
BLEND_COLOR_COMMAND_NV
STENCIL_REF_COMMAND_NV
LINE_WIDTH_COMMAND_NV
POLYGON_OFFSET_COMMAND_NV
ALPHA_REF_COMMAND_NV
VIEWPORT_COMMAND_NV
SCISSOR_COMMAND_NV
ELEMENT_ADDRESS_COMMAND_NV
ATTRIBUTE_ADDRESS_COMMAND_NV
UNIFORM_ADDRESS_COMMAND_NV
DRAW_ELEMENTS_COMMAND_NV
DRAW_ARRAYS_COMMAND_NV
DRAW_ELEMENTS_STRIP_COMMAND_NV
DRAW_ARRAYS_STRIP_COMMAND_NV
DRAW_ELEMENTS_INSTANCED_COMMAND_NV
DRAW_ARRAYS_INSTANCED_COMMAND_NV
TERMINATE_SEQUENCE_COMMAND_NV
NOP_COMMAND_NV
TokenUbo
{
GLuint header;
UniformAddressCommandNV
{
GLushort index;
GLushort stage;
GLuint64 address;
} cmd;
}
Example
Real Token to be queried
header = glGetCommandHeaderNV(
UNIFORM_ADDRESS_COMMAND_NV, Sizeof(TokenUbo) )
One Shading stage to target (not ‘program’)
Stage = glGetStageIndexNV(S)
(S: STAGE_VERTEX|GEOMETRY|FRAGMENT|TESS_CONTROL/EVAL)
Hardware Index
*not* Uniform Program index
26. 26
Siggraph Asia 2014
TOKENIZED RENDERING
Allows rendering of a mix of LIST, STRIP, FAN, LOOP modes in one go
Pass „basic type“ as Token, use „mode“ for more in "Instanced" drawing
TokenDrawElements
{
Gluint header;
DrawElementsCommandNV
{
GLuint count;
GLuint firstIndex;
GLuint baseVertex;
} cmd;
}
TokenDrawElementsInstanced
{
Gluint header;
DrawElementsInstancedCommandNV
{
GLuint mode;
GLuint count;
GLuint instanceCount;
GLuint firstIndex;
GLuint baseVertex;
GLuint baseInstance;
} cmd;
}
LIST, STRIP,
FAN, LOOP types
(GL_LINE_LOOP...)
DRAW_ELEMENTS_COMMAND_NV
DRAW_ARRAYS_COMMAND_NV
DRAW_ELEMENTS_STRIP_COMMAND_NV
DRAW_ARRAYS_STRIP_COMMAND_NV
glGetCommandHeaderNV()
DRAW_ELEMENTS_INSTANCED_COMMAND_NV
DRAW_ARRAYS_INSTANCED_COMMAND_NV
glGetCommandHeaderNV()
27. 27
Siggraph Asia 2014
TERMINATE_SEQUENCE: Occlusion Culling
Now overcomes deficit of previous methods
http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-
techniques.pdf
Enqueue all tokens for entire object
Shader to Check Visibility; Shader for SCAN operation;
Add TERMINATE_SEQUENCE_COMMAND_NV to end
Technique Draw time GROUPED
Timing GPU (ms) CPU (ms)
6.3 2.6 x 0.9 6 xCULL Old Bindless MDI (*)
CULL NEW TOKEN native 0.2 27 x
glBindBufferRange + drawcalls 16.8 5.4
6.2 2.7 x 6.3 0.85 xCULL Old Readback (stalls pipe) (*)
K5000, Preliminary results, (*) taken from slightly differnt framework
HW TOKEN-buffers ~0 BIG x
No instancing!
(synthetic test, data replication)
SCENE:
materials: 138
objects: 13,032
geometries: 3,312
parts: 789,464
triangles: 29,527,840
vertices: 27,584,376
16.8
5.3 3.1 x
28. 28
Siggraph Asia 2014
Just a little...
“commandBindableNV” turns binding IDs
to Hardware IDs
Texturing must go through bindless
ARB_bindless_texture
Regular Texture binding Turned Off
Changing UBO Ptr Addresses
== texture changes (here ‘myBuf’)
Does Shaders Need Special Code ?
#extension GL_ARB_bindless_texture : require
#extension GL_NV_command_list : enable
#if GL_NV_command_list
layout(commandBindableNV) uniform;
#endif
layout(std140,binding=2) uniform myBuf {
sampler2D mySampler;
//or:
in int whichSampler;
};
layout(std140,binding=1) uniform samplers {
sampler2D allSamplers[NUM_TEXTURES];
};
…
c = texture(mySampler, tc);
//or
c = texture(allSamplers [ whichSampler ] , tc);
…
Example:
29. 29
Siggraph Asia 2014
State Objects
StateObject
Encapsulates majority of state (fbo format, active shader, blend,
depth ...)
State Object immutable: gives more control over validation cost
no bindings captured: bindless used instead; textures passed via UBO
Primitive type is a state: participates to important validation work
Cannot be put in the token buffer
glCaptureStateNV(stateobject, GL_TRIANGLES );
Primitive type as argument:
because it is part of states
30. 30
Siggraph Asia 2014
Render entire scenes with different
shaders/fbos... in one go
Driver caches state transitions !
Draw With State Objects
glDrawCommandsStatesNV(
tokenBuffer,
offsets[], sizes[],
states[], fbos[],
count);
Token buffer
VBO address …
VBO address …
EBO address …
Uniform Matrix
Uniform Material
DrawElements
Uniform Material
DrawElements
…
EBO address …
Uniform Matrix
Uniform Material
DrawElements
Uniform Material
DrawElements
FBO 1
FBO 2
fbos[]
FBO 1
FBO 1
State Obj 1
State Obj 2
State Obj 1
State Obj 4
Count = 4
States[]
offsets[]
sizes[0]sizes[1]sizes[2]sizes[3]
31. 31
Siggraph Asia 2014
Driver bakes „Diff“ between states and apply only the difference
Pseudo Code of glDrawCommandsStatesNV
for i < count {
if (i == 0) set state from states[i];
else set state transition states[i-1] to states[i]
if (fbo[i]) {
// fbo[i] must be compatible with states[i].fbo
glBindFramebuffer( fbo[i] )
} else glBindFramebuffer( states[i].fbo )
ProcessCommandSequence(...tokenBuffer, offsets[i], sizes[i])
}
32. 32
Siggraph Asia 2014
Rendering must always happen in a FBO
Backbuffer Ptr. Address is un-reliable (dynamic re-alloc when resizing…)
Use glBlitFramebuffer() for FBO final Backbuffer result
FBO Resources Must be Resident (glMakeTextureHandleResidentARB())
Resources can be replaced/swapped if they keep the same base
configuration (attachment formats, # drawbuffers...)
I.e. Same configuration as captured FBO settings from stateobject
This is what happens when resizing:
Detach & Destroy resource create new ones attach them to FBO(s)
Frame-Buffer Objects And State Objects
33. 33
Siggraph Asia 2014
Combine multiple token buffers &
states into a CommandList object
glCreateCommandListsNV(N, &list)
glCommandListSegmentsNV(list, segs)
Convenient for concatenation
glListDrawCommandsStatesClientNV(list,
segment, token-buffer-ptrs, token-buffer-
sizes, states, FBOs, count)
glCompileCommandListNV(list);
glCallCommandListNV(list)
Command-List Object
System mem.
VBO address …
VBO address …
EBO address …
Uniform Matrix
Uniform Material
DrawElements
…
Uniform Material
DrawElements
FBO
State Object
System mem.
VBO address …
VBO address …
EBO address …
Uniform Matrix
Uniform Material
DrawElements
…
FBO
State Object
System mem.
VBO address …
VBO address …
Uniform Matrix
Uniform Material
DrawArrays
…
Uniform Material
DrawArrays
FBO
State Object
…
Command-List Segment #0 Segment #1
34. 34
Siggraph Asia 2014
Command-List Object
Less flexibilty compared to token buffer
Token buffers taken from client pointers (CPU memory)
Driver will optimize these token buffers and related state-objects
Optimize State transitions; token cmds for the GPU Front-End
Resulting command-list is immutable
If any resource pointer changed, the command-list must be recompiled
But still rather flexible
possible to change content of referenced buffers! (matrices, materials,
vertices...)
Animation; skinning... No need for command-list recompilation
35. 35
Siggraph Asia 2014
Command-List doesn’t pretend to solve any OpenGL multi-threading
Multi-threading can help if complex CPU work going-on
update of Vertex / Element buffers: transform. Matrices; skinning…
Update of Token Buffers: adding/removing characters in a game scene…
OpenGL & Multithreading issues:
OpenGL is bad at sharing its access from various threads
Make-Current or Multiple Contexts aren’t good solutions
Prefer to keep context ownership to one main thread
OpenGL And Multi-threading ?
36. 36
Siggraph Asia 2014
Token Buffers and Vertex/Elements/Uniform Buffers (VBO/UBO)
Token Buffer Objects; VBOs and UBOs owned by OpenGL
Multi-threading can be used for data update if:
Work on CPU system memory; then push data back to OpenGL (glBufferSubData)
Or Request for a pointer (glMapBuffers); then work with it from the thread
State Objects:
State Object and their 'Capture' must be handled in OpenGL context
No way to do it on a separate threads that don’t have OpenGL context
Multi-threading And Command-List
37. 37
Siggraph Asia 2014
Multi-threading Use-Case Example #1
Thread #0
OpenGL Ctxt
Thread #1
Thread #1
Submit
States
PushAttrib()
set states & and capture
PopAttribs()
Get State ID
PushAttrib()
set states & and capture
PopAttribs()
Submit
States
unmap
Token buf.
ask for
Token buf.
Ptr
Build state
list
ask for
Token buf.
Ptr
unmap
Token buf.
Build token cmds
Get State ID
Build token
cmds
Build token
cmds
ask for
Token buf.
Ptr unmap
Token
buf.
ask for
Token buf.
Ptr ...
ask for
Token buf.
Ptr
Build token
cmds
unmap
Token
buf.
Build token cmds
Build state
list
38. 38
Siggraph Asia 2014
Multi-threading Use-Case Example #2
Split state capture from Token-buffer creation
Thread 1 Thread 2 Thread 3
Build Token buffers
in CPU memory
Thread 0
OpenGL
Ctxt
Walk through whole scene
and build State-Objects
Receive Token-Buffers
and issue them
as Token Buffer Objects
to OpenGL
39. 39
Siggraph Asia 2014
Migration Steps
All rendering via FBO (statecapture doesn‘t support default backbuffer)
No legacy states in GLSL
use custom uniforms no built-in uniforms (gl_ModelView...)
use glVertexAttribPointer (no glTexCoordPointer etc.)
No classic uniforms, pack them all in UBO
ARB_bindless_texture for texturing
Bindless Buffers
ARB_vertex_attrib_binding combined with NV_vertex_buffer_unified_memory (VBUM)...
Organize for StateObject reuse
state capture expensive: Capture them at init. time; Factorize their use in the scene
40. 40
Siggraph Asia 2014
Migration Tips
Vertex Attributes and bindless VBO
http://on-demand.gputechconf.com/siggraph/2014/presentation/SG4117-
OpenGL-Scene-Rendering-Techniques.pdf (slide 11-16)
GLSL
// classic attributes
...
normal = gl_Normal;
gl_Position = gl_Vertex;
// generic attributes
// ideally share this definition across C and GLSL
#define VERTEX_POS 0
#define VERTEX_NORMAL 1
in layout(location= VERTEX_POS) vec4 attr_Pos;
in layout(location= VERTEX_NORMAL) vec3 attr_Normal;
...
normal = attr_Normal;
gl_Position = attr_Pos;
41. 41
Siggraph Asia 2014
Migration Tips
UBO Parameter management
http://on-demand.gputechconf.com/siggraph/2014/presentation/SG4117-
OpenGL-Scene-Rendering-Techniques.pdf ( 18-27, 44-47)
Ideally group by frequency of change
// classic uniforms
uniform samplerCube viewEnvTex;
uniform vec4 materialColor;
uniform sampler2D materialTex;
...
// UBO usage, bindless texture inside UBO, grouped by change
layout(std140, binding=0, commandBindableNV) uniform view
{
samplerCube texViewEnv;
};
layout(std140, binding=1, commandBindableNV) uniform material
{
vec4 materialColor;
sampler2D texMaterialColor;
};
...
42. 42
Siggraph Asia 2014
OpenGL Devtech 'Proviz' samples soon available on GitHub !
check https://github.com/nvpro-samples on a regular basis
working on ramping-up https://developer.nvidia.com/samples &
https://developer.nvidia.com/professional-graphics
command-lists examples to come here this Dec. 2014
Many other samples will be pushed here as standalone repositories
Beta Driver for Command-Lists: Release 346 for Dec.2014
Special Thanks to Pierre Boudier and Christoph Kubisch
Questions/Feedback: tlorach@nvidia.com / ckubisch@nvidia.com
Thanks!