SlideShare a Scribd company logo
1 of 43
Ivan Nevraev
Microsoft
Introduction to Direct3D 12
Goals & Assumptions
• Preview of Direct3D 12
• More API details in future talks
• Assuming familiarity with Direct3D 11
Direct3D 12 API – Goals
• Console API efficiency and performance
• Reduce CPU overhead
• Increase scalability across multiple CPU cores
• Greater developer control
• Superset of D3D 11 rendering functionality
ID3D11DeviceContext
Render Context: Direct3D 11
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Other State
CPU Overhead: Changing Pipeline State
• Direct3D 10 reduced number of state objects
• Still mismatched from hardware state
• Drivers resolve state at Draw
Direct3D 11 – Pipeline State Overhead
Small state objects  Hardware mismatch overhead
HW State 1
HW State 2
D3D Vertex Shader
D3D Rasterizer
D3D Pixel Shader
D3D Blend State
HW State 3
Direct3D 12 – Pipeline State Optimization
Group pipeline into single object
Copy from PSO to Hardware State
HW State 1
HW State 2
Pipeline
State
Object
HW State 3
ID3D11DeviceContext
Render Context: Direct3D 11
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Non-PSO State
Render Context: Pipeline State Object (PSO)
Pipeline State Object
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Non-PSO State
CPU Overhead: Resource Binding
• System needs to do lots of binding inspection
• Resource hazards
• Resource lifetime
• Resource residency management
• Mirrored copies of state used to implement Get*
• Ease of use for middleware
Resource Hazard Resolution
• Hazard tracking and resolution
• Runtime
• Driver
• Resource hazards
• Render Target/Depth <> Texture
• Tile Resource Aliasing
• etc…
Direct3D 12 – Explicit Hazard Resolution
ResourceBarrier: generalization of Direct3D 11’s TiledResourceBarrier
D3D12_RESOURCE_BARRIER_DESC Desc;
Desc.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
Desc.Transition.pResource = pRTTexture;
Desc.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
Desc.Transition.StateBefore = D3D12_RESOURCE_USAGE_RENDER_TARGET;
Desc.Transition.StateAfter = D3D12_RESOURCE_USAGE_PIXEL_SHADER_RESOURCE;
pContext->ResourceBarrier( 1, &Desc );
Resource Lifetime and Residency
• Explicit application control over resource lifetime
• Resource destruction is immediate
• Application must ensure no queued GPU work
• Use Fence API to track GPU progress
• One fence per-frame is well amortized
• Explicit application control over resource residency
• Application declares resources currently in use on GPU
Remove State Mirroring
• Application responsibility to communicate current state to
middleware
Render Context: Pipeline State Object (PSO)
Pipeline State Object
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Non-PSO State
Render Context: Remove State Reflection
Pipeline State Object
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Non-PSO State
CPU Overhead: Redundant Resource Binding
• Streaming identical resource bindings frame over frame
• Partial changes require copying all bindings
Direct3D 12: Descriptor Heaps & Tables
• Scales across extremes of HW capability
• Unified approach serves breadth of app binding flows
• Streaming changes to bindings
• Reuse of static bindings
• And everything between
• Dynamic indexing of shader resources
Descriptor
• Small chunk of data defining resource parameters
• Just opaque data – no OS lifetime management
• Hardware representation of Direct3D “View”
Descriptor
{
Type
Format
Mip Count
pData
}
Descriptor Heaps
• Storage for descriptors
• App owns the layout
• Low overhead to manipulate
• Multiple heaps allowed
GPU Memory
DescriptorHeap
Descriptor Tables
• Context points to active heap
• A table is an index and a size in the heap
• Not an API object
• Single view type per table
• Multiple tables per type
Pipeline State Object
…
Vertex Shader
…
Pixel Shader
…
Start Index
Size
Render Context: Remove State Reflection
Pipeline State Object
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Non-PSO State
Render Context: Descriptor Tables & Heaps
Pipeline State Object
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Non-PSO State
Render Context: Direct3D 12
Pipeline State Object
Input Assembler
Vertex Shader
Hull Shader
Tessellator
Rasterizer
Domain Shader
Geometry Shader
Pixel Shader
Output Merger
GPU Memory
Non-PSO State
CPU Overhead: Redundant Render Commands
• Typical applications send identical sequences of commands frame-
over-frame
• Measured 90-95% coherence on typical modern games
Bundles
• Small command list
• Recorded once
• Reused multiple times
• Free threaded creation
• Inherits from execute site
• Non-PSO State
• Descriptor Table Bindings
• Restrictions to ensure efficient driver implementation
Bundles
Context
Clear
Draw
SetTable
Execute Bundle
SetTable
Execute Bundle
SetPSO
…
Example code without Bundles
// Setup
pContext->SetPipelineState(pPSO);
pContext->SetRenderTargetViewTable(0, 1, FALSE, 0);
pContext->SetVertexBufferTable(0, 1);
pContext->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
// Draw 1
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);
// Draw 2
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);
Set object #1 specific tables and draw
Setup pipeline state and common
descriptor tables
Set object #2 specific tables and draw
Bundles – Creating a Bundle
// Create bundle
pDevice->CreateCommandList(D3D12_COMMAND_LIST_TYPE_BUNDLE, pBundleAllocator, pPSO, pDescriptorHeap, &pBundle);
// Record commands
pBundle->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
pBundle->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pBundle->DrawInstanced(6, 1, 0, 0);
pBundle->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pBundle->DrawInstanced(6, 1, 6, 0);
pBundle->Close();
No Bundles
// Setup
pContext->SetPipelineState(pPSO);
pContext->SetRenderTargetViewTable(0, 1, FALSE, 0);
pContext->SetVertexBufferTable(0, 1);
pContext->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
// Draw 1
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);
// Draw 2
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);
// Setup
pContext->SetRenderTargetViewTable(0, 1, FALSE, 0);
pContext->SetVertexBufferTable(0, 1);
// Draw 1 and 2
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->ExecuteBundle(pBundle);
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->ExecuteBundle(pBundle);
Bundles
Bundles: CPU performance improvements
• PC – 0.7ms to 0.2ms in a simple test (GPU bound)
• Xbox
• 1/3 CPU consumption for rendering submission in one game
• 100s of thousand DrawBundle executions are possible per 60FPS frame
• Even one draw per draw bundle helps
• Saves engine overhead
Direct3D 12 – Command Creation Parallelism
• About that context…
• No Immediate Context
• All rendering via Command Lists
• Command Lists are submitted on a Command Queue
Command Lists and Command Queue
• Application responsible for
• Hazard tracking
• Declaring maximum number of recording command lists
• Resource renaming with GPU signaled fence
• Resources lifetime referenced by command lists
• Fence operations on the Command Queue
• Not on Command List or Bundle
• Signals occur on Command List completion
• Command List submission cost reduced by WDDM 2.0
Command Queue
Command Queue
Execute Command List 1
Execute Command List 2
Signal Fence
Command List 1
Clear
SetTable
Execute Bundle A
SetTable
Draw
SetPSO
Draw
Command List 2
Clear
Dispatch
SetTable
Execute Bundle A
SetTable
Execute Bundle B
Command Queue
Command Queue
Execute Command List 1
Execute Command List 2
Signal Fence
Command List 1
Clear
SetTable
Execute Bundle A
SetTable
Draw
SetPSO
Draw
Command List 2
Clear
Dispatch
SetTable
Execute Bundle A
SetTable
Execute Bundle B
Dynamic Heaps
• Resource Renaming Overhead
• Significant CPU overhead on ExecuteCommandList
• Significant driver complexity
• Solution: Efficient Application Suballocation
• Application creates large buffer resource and suballocates
• Data type determined by application
• Standardized alignment requirements
• Persistently mapped memory
Allocation vs. Suballocation
GPU Memory
Resource 2Resource 1Heap
CB IB VB …
GPU Memory
Resource 2Resource 1
CB IB VB
Direct3D 12 – CPU Parallelism
• Direct3D 12 has several parallel tasks
• Command List Generation
• Bundle Generation
• PSO Creation
• Resource Creation
• Dynamic Data Generation
• Runtime and driver designed for parallelism
• Developer chooses what to make parallel
D3D11 Profiling
PresentApp Logic D3D11 UMD KMDDXGK
App Logic
D3D
11
App Logic D3D
11
App Logic D3D
11
Thread 0
Thread 1
Thread 2
Thread 3
0 ms 2.50 ms 5.00 ms 7.50 ms
App Logic D3D Runtime User-mode Driver DXGKernel Kernel-mode Driver
Present
D3D12 Profiling
App Logic UMD
D3D12
Present
DXGK/KMD
App Logic UMD
D3D12
App Logic UMD
D3D12
App Logic UMD
D3D12
Thread 0
Thread 1
Thread 2
Thread 3
0 ms 2.50 ms 5.00 ms 7.50 ms
App Logic D3D Runtime User-mode Driver DXGKernel Kernel-mode Driver
Present
D3D11 v D3D12 numbers
App Logic UMD
D3D12
Present
DXGK/KMD
App Logic UMD
D3D12
App Logic UMD
D3D12
App Logic UMD
D3D12
Thread 0
Thread 1
Thread 2
Thread 3
0 ms 2.50 ms 5.00 ms 7.50 ms
PresentApp Logic D3D11 UMD KMDDXGK
App Logic
D3D
11
App Logic
D3D
11
App Logic
D3D1
1
Thread 0
Thread 1
Thread 2
Thread 3
0 ms 2.50 ms 5.00 ms 7.50 ms
App+GFX (ms) GFX-only (ms)
D3D11 D3D12 D3D11 D3D12
Thread 0 7.88 3.80 5.73 1.17
Thread 1 3.08 2.50 0.35 0.81
Thread 2 2.84 2.46 0.34 0.69
Thread 3 2.63 2.45 0.23 0.65
Total 16.42 11.21 6.65 3.32
Summary
• Greater CPU Efficiency
• Greater CPU Scalability
• Greater Developer Control
• CPU Parallelism
• Resource Lifetime
• Memory Usage
The End

More Related Content

What's hot

OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesNarann29
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Johan Andersson
 
FrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteFrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteElectronic Arts / DICE
 
Parallel Futures of a Game Engine
Parallel Futures of a Game EngineParallel Futures of a Game Engine
Parallel Futures of a Game EngineJohan Andersson
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility bufferWolfgang Engel
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsElectronic Arts / DICE
 
Screen Space Decals in Warhammer 40,000: Space Marine
Screen Space Decals in Warhammer 40,000: Space MarineScreen Space Decals in Warhammer 40,000: Space Marine
Screen Space Decals in Warhammer 40,000: Space MarinePope Kim
 
Terrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable SystemTerrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable SystemElectronic Arts / DICE
 
Approaching zero driver overhead
Approaching zero driver overheadApproaching zero driver overhead
Approaching zero driver overheadCass Everitt
 
The Next Generation of PhyreEngine
The Next Generation of PhyreEngineThe Next Generation of PhyreEngine
The Next Generation of PhyreEngineSlide_N
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14AMD Developer Central
 
Direct x 11 입문
Direct x 11 입문Direct x 11 입문
Direct x 11 입문Jin Woo Lee
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Tiago Sousa
 
NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017Mark Kilgard
 
The Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next StepsThe Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next StepsJohan Andersson
 
Game Engine Architecture
Game Engine ArchitectureGame Engine Architecture
Game Engine ArchitectureAttila Jenei
 
Crysis Next-Gen Effects (GDC 2008)
Crysis Next-Gen Effects (GDC 2008)Crysis Next-Gen Effects (GDC 2008)
Crysis Next-Gen Effects (GDC 2008)Tiago Sousa
 

What's hot (20)

OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering Techniques
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
 
FrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteFrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in Frostbite
 
Beyond porting
Beyond portingBeyond porting
Beyond porting
 
Parallel Futures of a Game Engine
Parallel Futures of a Game EngineParallel Futures of a Game Engine
Parallel Futures of a Game Engine
 
Scope Stack Allocation
Scope Stack AllocationScope Stack Allocation
Scope Stack Allocation
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility buffer
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-Graphics
 
Screen Space Decals in Warhammer 40,000: Space Marine
Screen Space Decals in Warhammer 40,000: Space MarineScreen Space Decals in Warhammer 40,000: Space Marine
Screen Space Decals in Warhammer 40,000: Space Marine
 
SEED - Halcyon Architecture
SEED - Halcyon ArchitectureSEED - Halcyon Architecture
SEED - Halcyon Architecture
 
Terrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable SystemTerrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable System
 
Approaching zero driver overhead
Approaching zero driver overheadApproaching zero driver overhead
Approaching zero driver overhead
 
The Next Generation of PhyreEngine
The Next Generation of PhyreEngineThe Next Generation of PhyreEngine
The Next Generation of PhyreEngine
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
 
Direct x 11 입문
Direct x 11 입문Direct x 11 입문
Direct x 11 입문
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)
 
NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017
 
The Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next StepsThe Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next Steps
 
Game Engine Architecture
Game Engine ArchitectureGame Engine Architecture
Game Engine Architecture
 
Crysis Next-Gen Effects (GDC 2008)
Crysis Next-Gen Effects (GDC 2008)Crysis Next-Gen Effects (GDC 2008)
Crysis Next-Gen Effects (GDC 2008)
 

Viewers also liked

Efficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® GraphicsEfficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® GraphicsGael Hofemeier
 
Leverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesLeverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesAMD Developer Central
 
DirectX12 Graphics and Performance
DirectX12 Graphics and PerformanceDirectX12 Graphics and Performance
DirectX12 Graphics and PerformanceDevGAMM Conference
 
Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12mistercteam
 
Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3
Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3
Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3Umbra
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceAMD Developer Central
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellAMD Developer Central
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...AMD Developer Central
 
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14AMD Developer Central
 
TressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozTressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozAMD Developer Central
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...AMD Developer Central
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...AMD Developer Central
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasAMD Developer Central
 
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornDirect3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornAMD Developer Central
 

Viewers also liked (20)

Efficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® GraphicsEfficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® Graphics
 
Leverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesLeverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math Libraries
 
DirectX12 Graphics and Performance
DirectX12 Graphics and PerformanceDirectX12 Graphics and Performance
DirectX12 Graphics and Performance
 
Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12
 
Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3
Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3
Solving Visibility and Streaming in the The Witcher 3: Wild Hunt with Umbra 3
 
Inside XBox- One, by Martin Fuller
Inside XBox- One, by Martin FullerInside XBox- One, by Martin Fuller
Inside XBox- One, by Martin Fuller
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop Intelligence
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
 
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
 
TressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozTressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas Thibieroz
 
DirectGMA on AMD’S FirePro™ GPUS
DirectGMA on AMD’S  FirePro™ GPUSDirectGMA on AMD’S  FirePro™ GPUS
DirectGMA on AMD’S FirePro™ GPUS
 
Gcn performance ftw by stephan hodes
Gcn performance ftw by stephan hodesGcn performance ftw by stephan hodes
Gcn performance ftw by stephan hodes
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
Introduction to Node.js
Introduction to Node.jsIntroduction to Node.js
Introduction to Node.js
 
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
 
Media SDK Webinar 2014
Media SDK Webinar 2014Media SDK Webinar 2014
Media SDK Webinar 2014
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
 
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornDirect3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
 

Similar to Introduction to Direct 3D 12 by Ivan Nevraev

3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)mistercteam
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloadedmistercteam
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And EffectsThomas Goddard
 
Dragonflow Austin Summit Talk
Dragonflow Austin Summit Talk Dragonflow Austin Summit Talk
Dragonflow Austin Summit Talk Eran Gampel
 
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray TracingSyysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray TracingElectronic Arts / DICE
 
isca22-feng-menda_for sparse transposition and dataflow.pptx
isca22-feng-menda_for sparse transposition and dataflow.pptxisca22-feng-menda_for sparse transposition and dataflow.pptx
isca22-feng-menda_for sparse transposition and dataflow.pptxssuser30e7d2
 
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio Owen Wu
 
OpenStack Dragonflow shenzhen and Hangzhou meetups
OpenStack Dragonflow shenzhen and Hangzhou  meetupsOpenStack Dragonflow shenzhen and Hangzhou  meetups
OpenStack Dragonflow shenzhen and Hangzhou meetupsEran Gampel
 
Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Johan Andersson
 
The next generation of GPU APIs for Game Engines
The next generation of GPU APIs for Game EnginesThe next generation of GPU APIs for Game Engines
The next generation of GPU APIs for Game EnginesPooya Eimandar
 
Hpg2011 papers kazakov
Hpg2011 papers kazakovHpg2011 papers kazakov
Hpg2011 papers kazakovmistercteam
 
Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016
Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016
Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016Cloud Native Day Tel Aviv
 
Dragonflow 01 2016 TLV meetup
Dragonflow 01 2016 TLV meetup  Dragonflow 01 2016 TLV meetup
Dragonflow 01 2016 TLV meetup Eran Gampel
 
Storage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten RachfahlStorage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten RachfahlITCamp
 
Performance Evaluation and Comparison of Service-based Image Processing based...
Performance Evaluation and Comparison of Service-based Image Processing based...Performance Evaluation and Comparison of Service-based Image Processing based...
Performance Evaluation and Comparison of Service-based Image Processing based...Matthias Trapp
 
Dragon flow neutron lightning talk
Dragon flow neutron lightning talkDragon flow neutron lightning talk
Dragon flow neutron lightning talkEran Gampel
 
Inspecting Block Closures To Generate Shaders for GPU Execution
Inspecting Block Closures To Generate Shaders for GPU ExecutionInspecting Block Closures To Generate Shaders for GPU Execution
Inspecting Block Closures To Generate Shaders for GPU ExecutionESUG
 
02 direct3 d_pipeline
02 direct3 d_pipeline02 direct3 d_pipeline
02 direct3 d_pipelineGirish Ghate
 

Similar to Introduction to Direct 3D 12 by Ivan Nevraev (20)

3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloaded
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And Effects
 
GR740 User day
GR740 User dayGR740 User day
GR740 User day
 
Dragonflow Austin Summit Talk
Dragonflow Austin Summit Talk Dragonflow Austin Summit Talk
Dragonflow Austin Summit Talk
 
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray TracingSyysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
 
isca22-feng-menda_for sparse transposition and dataflow.pptx
isca22-feng-menda_for sparse transposition and dataflow.pptxisca22-feng-menda_for sparse transposition and dataflow.pptx
isca22-feng-menda_for sparse transposition and dataflow.pptx
 
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
 
ADCSS 2022
ADCSS 2022ADCSS 2022
ADCSS 2022
 
OpenStack Dragonflow shenzhen and Hangzhou meetups
OpenStack Dragonflow shenzhen and Hangzhou  meetupsOpenStack Dragonflow shenzhen and Hangzhou  meetups
OpenStack Dragonflow shenzhen and Hangzhou meetups
 
Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!
 
The next generation of GPU APIs for Game Engines
The next generation of GPU APIs for Game EnginesThe next generation of GPU APIs for Game Engines
The next generation of GPU APIs for Game Engines
 
Hpg2011 papers kazakov
Hpg2011 papers kazakovHpg2011 papers kazakov
Hpg2011 papers kazakov
 
Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016
Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016
Running Neutron at Scale - Gal Sagie & Eran Gampel - OpenStack Day Israel 2016
 
Dragonflow 01 2016 TLV meetup
Dragonflow 01 2016 TLV meetup  Dragonflow 01 2016 TLV meetup
Dragonflow 01 2016 TLV meetup
 
Storage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten RachfahlStorage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
 
Performance Evaluation and Comparison of Service-based Image Processing based...
Performance Evaluation and Comparison of Service-based Image Processing based...Performance Evaluation and Comparison of Service-based Image Processing based...
Performance Evaluation and Comparison of Service-based Image Processing based...
 
Dragon flow neutron lightning talk
Dragon flow neutron lightning talkDragon flow neutron lightning talk
Dragon flow neutron lightning talk
 
Inspecting Block Closures To Generate Shaders for GPU Execution
Inspecting Block Closures To Generate Shaders for GPU ExecutionInspecting Block Closures To Generate Shaders for GPU Execution
Inspecting Block Closures To Generate Shaders for GPU Execution
 
02 direct3 d_pipeline
02 direct3 d_pipeline02 direct3 d_pipeline
02 direct3 d_pipeline
 

More from AMD Developer Central

An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAMD Developer Central
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14AMD Developer Central
 
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...AMD Developer Central
 
Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14AMD Developer Central
 
Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14AMD Developer Central
 
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahAMD Developer Central
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...AMD Developer Central
 
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...AMD Developer Central
 
Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...
Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...
Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...AMD Developer Central
 
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...AMD Developer Central
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...AMD Developer Central
 

More from AMD Developer Central (12)

An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
 
Inside XBOX ONE by Martin Fuller
Inside XBOX ONE by Martin FullerInside XBOX ONE by Martin Fuller
Inside XBOX ONE by Martin Fuller
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
 
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
 
Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14
 
Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14
 
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
 
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
 
Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...
Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...
Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by...
 
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
 

Recently uploaded

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governanceWSO2
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseWSO2
 
Choreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringChoreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringWSO2
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAnitaRaj43
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 

Recently uploaded (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern Enterprise
 
Choreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringChoreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software Engineering
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 

Introduction to Direct 3D 12 by Ivan Nevraev

  • 2. Goals & Assumptions • Preview of Direct3D 12 • More API details in future talks • Assuming familiarity with Direct3D 11
  • 3. Direct3D 12 API – Goals • Console API efficiency and performance • Reduce CPU overhead • Increase scalability across multiple CPU cores • Greater developer control • Superset of D3D 11 rendering functionality
  • 4. ID3D11DeviceContext Render Context: Direct3D 11 Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Other State
  • 5. CPU Overhead: Changing Pipeline State • Direct3D 10 reduced number of state objects • Still mismatched from hardware state • Drivers resolve state at Draw
  • 6. Direct3D 11 – Pipeline State Overhead Small state objects  Hardware mismatch overhead HW State 1 HW State 2 D3D Vertex Shader D3D Rasterizer D3D Pixel Shader D3D Blend State HW State 3
  • 7. Direct3D 12 – Pipeline State Optimization Group pipeline into single object Copy from PSO to Hardware State HW State 1 HW State 2 Pipeline State Object HW State 3
  • 8. ID3D11DeviceContext Render Context: Direct3D 11 Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Non-PSO State
  • 9. Render Context: Pipeline State Object (PSO) Pipeline State Object Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Non-PSO State
  • 10. CPU Overhead: Resource Binding • System needs to do lots of binding inspection • Resource hazards • Resource lifetime • Resource residency management • Mirrored copies of state used to implement Get* • Ease of use for middleware
  • 11. Resource Hazard Resolution • Hazard tracking and resolution • Runtime • Driver • Resource hazards • Render Target/Depth <> Texture • Tile Resource Aliasing • etc…
  • 12. Direct3D 12 – Explicit Hazard Resolution ResourceBarrier: generalization of Direct3D 11’s TiledResourceBarrier D3D12_RESOURCE_BARRIER_DESC Desc; Desc.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION; Desc.Transition.pResource = pRTTexture; Desc.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES; Desc.Transition.StateBefore = D3D12_RESOURCE_USAGE_RENDER_TARGET; Desc.Transition.StateAfter = D3D12_RESOURCE_USAGE_PIXEL_SHADER_RESOURCE; pContext->ResourceBarrier( 1, &Desc );
  • 13. Resource Lifetime and Residency • Explicit application control over resource lifetime • Resource destruction is immediate • Application must ensure no queued GPU work • Use Fence API to track GPU progress • One fence per-frame is well amortized • Explicit application control over resource residency • Application declares resources currently in use on GPU
  • 14. Remove State Mirroring • Application responsibility to communicate current state to middleware
  • 15. Render Context: Pipeline State Object (PSO) Pipeline State Object Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Non-PSO State
  • 16. Render Context: Remove State Reflection Pipeline State Object Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Non-PSO State
  • 17. CPU Overhead: Redundant Resource Binding • Streaming identical resource bindings frame over frame • Partial changes require copying all bindings
  • 18. Direct3D 12: Descriptor Heaps & Tables • Scales across extremes of HW capability • Unified approach serves breadth of app binding flows • Streaming changes to bindings • Reuse of static bindings • And everything between • Dynamic indexing of shader resources
  • 19. Descriptor • Small chunk of data defining resource parameters • Just opaque data – no OS lifetime management • Hardware representation of Direct3D “View” Descriptor { Type Format Mip Count pData }
  • 20. Descriptor Heaps • Storage for descriptors • App owns the layout • Low overhead to manipulate • Multiple heaps allowed GPU Memory DescriptorHeap
  • 21. Descriptor Tables • Context points to active heap • A table is an index and a size in the heap • Not an API object • Single view type per table • Multiple tables per type Pipeline State Object … Vertex Shader … Pixel Shader … Start Index Size
  • 22. Render Context: Remove State Reflection Pipeline State Object Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Non-PSO State
  • 23. Render Context: Descriptor Tables & Heaps Pipeline State Object Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Non-PSO State
  • 24. Render Context: Direct3D 12 Pipeline State Object Input Assembler Vertex Shader Hull Shader Tessellator Rasterizer Domain Shader Geometry Shader Pixel Shader Output Merger GPU Memory Non-PSO State
  • 25. CPU Overhead: Redundant Render Commands • Typical applications send identical sequences of commands frame- over-frame • Measured 90-95% coherence on typical modern games
  • 26. Bundles • Small command list • Recorded once • Reused multiple times • Free threaded creation • Inherits from execute site • Non-PSO State • Descriptor Table Bindings • Restrictions to ensure efficient driver implementation
  • 28. Example code without Bundles // Setup pContext->SetPipelineState(pPSO); pContext->SetRenderTargetViewTable(0, 1, FALSE, 0); pContext->SetVertexBufferTable(0, 1); pContext->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST); // Draw 1 pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pContext->DrawInstanced(6, 1, 0, 0); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pContext->DrawInstanced(6, 1, 6, 0); // Draw 2 pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pContext->DrawInstanced(6, 1, 0, 0); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pContext->DrawInstanced(6, 1, 6, 0); Set object #1 specific tables and draw Setup pipeline state and common descriptor tables Set object #2 specific tables and draw
  • 29. Bundles – Creating a Bundle // Create bundle pDevice->CreateCommandList(D3D12_COMMAND_LIST_TYPE_BUNDLE, pBundleAllocator, pPSO, pDescriptorHeap, &pBundle); // Record commands pBundle->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST); pBundle->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pBundle->DrawInstanced(6, 1, 0, 0); pBundle->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pBundle->DrawInstanced(6, 1, 6, 0); pBundle->Close();
  • 30. No Bundles // Setup pContext->SetPipelineState(pPSO); pContext->SetRenderTargetViewTable(0, 1, FALSE, 0); pContext->SetVertexBufferTable(0, 1); pContext->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST); // Draw 1 pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pContext->DrawInstanced(6, 1, 0, 0); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pContext->DrawInstanced(6, 1, 6, 0); // Draw 2 pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pContext->DrawInstanced(6, 1, 0, 0); pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pContext->DrawInstanced(6, 1, 6, 0); // Setup pContext->SetRenderTargetViewTable(0, 1, FALSE, 0); pContext->SetVertexBufferTable(0, 1); // Draw 1 and 2 pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1); pContext->ExecuteBundle(pBundle); pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1); pContext->ExecuteBundle(pBundle); Bundles
  • 31. Bundles: CPU performance improvements • PC – 0.7ms to 0.2ms in a simple test (GPU bound) • Xbox • 1/3 CPU consumption for rendering submission in one game • 100s of thousand DrawBundle executions are possible per 60FPS frame • Even one draw per draw bundle helps • Saves engine overhead
  • 32. Direct3D 12 – Command Creation Parallelism • About that context… • No Immediate Context • All rendering via Command Lists • Command Lists are submitted on a Command Queue
  • 33. Command Lists and Command Queue • Application responsible for • Hazard tracking • Declaring maximum number of recording command lists • Resource renaming with GPU signaled fence • Resources lifetime referenced by command lists • Fence operations on the Command Queue • Not on Command List or Bundle • Signals occur on Command List completion • Command List submission cost reduced by WDDM 2.0
  • 34. Command Queue Command Queue Execute Command List 1 Execute Command List 2 Signal Fence Command List 1 Clear SetTable Execute Bundle A SetTable Draw SetPSO Draw Command List 2 Clear Dispatch SetTable Execute Bundle A SetTable Execute Bundle B
  • 35. Command Queue Command Queue Execute Command List 1 Execute Command List 2 Signal Fence Command List 1 Clear SetTable Execute Bundle A SetTable Draw SetPSO Draw Command List 2 Clear Dispatch SetTable Execute Bundle A SetTable Execute Bundle B
  • 36. Dynamic Heaps • Resource Renaming Overhead • Significant CPU overhead on ExecuteCommandList • Significant driver complexity • Solution: Efficient Application Suballocation • Application creates large buffer resource and suballocates • Data type determined by application • Standardized alignment requirements • Persistently mapped memory
  • 37. Allocation vs. Suballocation GPU Memory Resource 2Resource 1Heap CB IB VB … GPU Memory Resource 2Resource 1 CB IB VB
  • 38. Direct3D 12 – CPU Parallelism • Direct3D 12 has several parallel tasks • Command List Generation • Bundle Generation • PSO Creation • Resource Creation • Dynamic Data Generation • Runtime and driver designed for parallelism • Developer chooses what to make parallel
  • 39. D3D11 Profiling PresentApp Logic D3D11 UMD KMDDXGK App Logic D3D 11 App Logic D3D 11 App Logic D3D 11 Thread 0 Thread 1 Thread 2 Thread 3 0 ms 2.50 ms 5.00 ms 7.50 ms App Logic D3D Runtime User-mode Driver DXGKernel Kernel-mode Driver Present
  • 40. D3D12 Profiling App Logic UMD D3D12 Present DXGK/KMD App Logic UMD D3D12 App Logic UMD D3D12 App Logic UMD D3D12 Thread 0 Thread 1 Thread 2 Thread 3 0 ms 2.50 ms 5.00 ms 7.50 ms App Logic D3D Runtime User-mode Driver DXGKernel Kernel-mode Driver Present
  • 41. D3D11 v D3D12 numbers App Logic UMD D3D12 Present DXGK/KMD App Logic UMD D3D12 App Logic UMD D3D12 App Logic UMD D3D12 Thread 0 Thread 1 Thread 2 Thread 3 0 ms 2.50 ms 5.00 ms 7.50 ms PresentApp Logic D3D11 UMD KMDDXGK App Logic D3D 11 App Logic D3D 11 App Logic D3D1 1 Thread 0 Thread 1 Thread 2 Thread 3 0 ms 2.50 ms 5.00 ms 7.50 ms App+GFX (ms) GFX-only (ms) D3D11 D3D12 D3D11 D3D12 Thread 0 7.88 3.80 5.73 1.17 Thread 1 3.08 2.50 0.35 0.81 Thread 2 2.84 2.46 0.34 0.69 Thread 3 2.63 2.45 0.23 0.65 Total 16.42 11.21 6.65 3.32
  • 42. Summary • Greater CPU Efficiency • Greater CPU Scalability • Greater Developer Control • CPU Parallelism • Resource Lifetime • Memory Usage