Your SlideShare is downloading. ×
0
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Khronos OpenGL Efficiency - GDC 2014
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Khronos OpenGL Efficiency - GDC 2014

704

Published on

OpenGL Efficiency: AZDO. Slide presentation by Cass Everitt, OpenGL Engineer at NVIDIA.

OpenGL Efficiency: AZDO. Slide presentation by Cass Everitt, OpenGL Engineer at NVIDIA.

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
704
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
34
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. © Copyright Khronos Group 2014 - Page 1 OpenGL Efficiency: AZDO Cass Everitt OpenGL Engineer, NVIDIA GDC, San Francisco, March 2014
  • 2. © Copyright Khronos Group 2014 - Page 2 AZDO? • Approaching Zero Driver Overhead
  • 3. © Copyright Khronos Group 2014 - Page 3 Why do you care about driver overhead? •Because driver overhead == cost •Costs - CPU cycles from app - CPU cache from app - power / battery - GPU throughput
  • 4. © Copyright Khronos Group 2014 - Page 4 OpenGL Fallacy: Old and Inefficient Immediate Mode Fixed Function Ancient crufty stuff Feedback Selection Evaluators Display Lists Selectors
  • 5. © Copyright Khronos Group 2014 - Page 5 OpenGL Reality: Modern & Efficient Bindless ARB SSBO GL4.3 Multi-Draw Indirect GL4.3 UBO GL3.1 Texture Arrays GL3.0 Buffer Storage GL4.4
  • 6. © Copyright Khronos Group 2014 - Page 6 Plus, OpenGL has all the features Compute Tessellation Geometry Shaders Sparse Textures Image Load/Store
  • 7. © Copyright Khronos Group 2014 - Page 7 indirect draw buffer object buffer object texture object buffer object buffer object texture object buffer object buffer object buffer object render target buffer object Classic OpenGL Model CPU GPU … Memory cmd cmd cmdcmd Direct Drawing Commands (via the command fifo)
  • 8. © Copyright Khronos Group 2014 - Page 8 Classic Model Pros / Cons • Pro - Very stable – 20+ year old code still “just works” - Simple - driver handles hazards, sync, allocation - Empowered the GPU revolution - Many classes of applications well served • Cons - Demanding apps are not so well served - Intense games, VR - Doesn’t scale with high scene complexity - Threading model - Hardware abstraction showing age
  • 9. © Copyright Khronos Group 2014 - Page 9 Aspirational Goal • Can we address the cons within the framework of the existing API? - That is, can we fix the cons without tossing the pros? • Good question! - As it turns out, Smart People in Khronos have actually been working on this question for a while now - And they’ve developed an efficient, modern OpenGL that - Gives amazing perf improvements, and lives within the existing framework • And here’s what it looks like…
  • 10. © Copyright Khronos Group 2014 - Page 10 indirect draw buffer object indirect draw buffer object texture object buffer object indirect draw buffer object texture object buffer object buffer object buffer object render target buffer object Efficient OpenGL Model CPU CPU CPU CPU GPU … Memory
  • 11. © Copyright Khronos Group 2014 - Page 11 CPU and GPU decoupled CPU CPU CPU CPU GPU … Memory
  • 12. © Copyright Khronos Group 2014 - Page 12 indirect draw buffer object indirect draw buffer object texture object buffer object indirect draw buffer object texture object buffer object buffer object buffer object render target buffer object CPU Writes Memory – multi-threaded (no API)! CPU CPU CPU CPU GPU … Memory
  • 13. © Copyright Khronos Group 2014 - Page 13 indirect draw buffer object indirect draw buffer object texture object buffer object indirect draw buffer object texture object buffer object buffer object buffer object render target buffer object And/Or GPU Writes Memory CPU CPU CPU CPU GPU … Memory GPU Work Creation Still no API – the magic of communicating through memory…
  • 14. © Copyright Khronos Group 2014 - Page 14 indirect draw buffer object indirect draw buffer object texture object buffer object indirect draw buffer object texture object buffer object buffer object buffer object render target buffer object GPU Reads Commands from Memory CPU CPU CPU CPU GPU … Memory Minimal CPU / driver involvement…
  • 15. © Copyright Khronos Group 2014 - Page 15 Results •Integer multiple speedups ~5x – ~15x - This is not a typo - On driver limited cases, obviously •Works TODAY on existing drivers! - Mostly GL4.2+ - Extensions are at least EXT
  • 16. © Copyright Khronos Group 2014 - Page 16 Bonuses • Enables scalable multi-threading with no new API - Cores just write to memory • Enables GPU Work Creation - Compute job or similar - Builds buffers, constructs MDI commands • Does not require a new object model • Does not require breaking existing applications
  • 17. © Copyright Khronos Group 2014 - Page 17 Results • - This is not a typo - On driver limited cases, obviously • - Mostly GL4.2+ - Extensions are at least EXT
  • 18. © Copyright Khronos Group 2014 - Page 18 Results •Integer multiple speedups ~5x – ~15x - This is not a typo - On driver limited cases, obviously •Works TODAY on existing drivers! - Mostly GL4.2+ - Extensions are at least EXT
  • 19. © Copyright Khronos Group 2014 - Page 19 Results •Integer multiple speedups ~5x – ~15x - This is not a typo - On driver limited cases, obviously •Works TODAY on existing drivers! - Mostly GL4.2+ - Extensions are at least EXT

×