Direct3D 11 will have tessellation for smoother curves and finer details. The new compute shader will make postprocessing faster and easier. You'll need Direct3D 11 to have the best graphics, and this talk will show you how you can get started using current generation hardware.
Want the full header-file to save on typing? Drop me an email
Switchable DX10/DX11 support examples
// using D3D10 requires dxgi.lib and D3D11 beta requires dxgi_beta.lib and if we
// link with only one through the common method then it crashes when creating
// the D3D device. so instead conditionally link with the
// correct dxgi library here for now --johan
#ifdef DICE_D3D11_ENABLE
#pragma comment(lib, "dxgi_beta.lib")
#else
#pragma comment(lib, "dxgi.lib")
#endif

// Setting a shader takes an extra parameter on D3D11: ID3D11ClassLinkage
// which is used for the D3D11 subroutine support (which we don’t use)
#ifdef DICE_D3D11_ENABLE
    m_deviceContext->PSSetShader(solution.pixelPermutation->shader, nullptr, 0);
#else
    m_deviceContext->PSSetShader(solution.pixelPermutation->shader);
#endif
Mapping buffers on DX10 vs DX11
#ifdef DICE_D3D11_ENABLE
    D3D11_MAPPED_SUBRESOURCE mappedResource;
    DICE_SAFE_DX(m_deviceContext->Map(
        m_functionConstantBuffers[type],    // cbuffer
        0,                                  // subresource
        D3D11_MAP_WRITE_DISCARD,            // map type
        0,                                  // map flags
        &mappedResource));                  // map resource
    data = reinterpret_cast<Vec*>(mappedResource.pData);
    // fill in data
    m_deviceContext->Unmap(m_functionConstantBuffers[type], 0);
#else
    DICE_SAFE_DX(m_functionConstantBuffers[type]->Map(
        D3D10_MAP_WRITE_DISCARD,            // map type
        0,                                  // map flags
        (void**)&data));                    // data
    // fill in data
    m_functionConstantBuffers[type]->Unmap();
#endif
Frostbite DX11 parallel dispatch
The Killer Feature for reducing CPU rendering overhead!
~90% of our rendering dispatch job is in D3D/driver
Have a DX11 deferred device context per core
Together with dynamic resources (cbuffer/vbuffer) for usage on that deferred context
Renderer has a list of all draw calls we want to do for each rendering “layer” of the frame
Split draw calls for each layer into chunks of ~256 and dispatch in parallel to the deferred contexts
Each chunk generates a command list
Render to immediate context & execute command lists
Profit!
Goal: close to linear perf. scaling up to octa-core when we get DX11 driver support (hint hint to the IHVs)
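The chunk-and-dispatch steps above can be sketched in portable C++. This is a minimal model, not Frostbite's code: `DrawCall`, `CommandList`, `recordChunk`, `buildCommandLists` and `executeCommandLists` are hypothetical names, a `std::vector<int>` stands in for a recorded command list, and one `std::async` task per chunk stands in for a pool of per-core deferred contexts. In a real renderer the recording would go through a context from `ID3D11Device::CreateDeferredContext`, each chunk would be closed with `FinishCommandList`, and playback would use `ExecuteCommandList` on the immediate context.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-ins for renderer types (not part of any D3D header).
struct DrawCall { int id; };
using CommandList = std::vector<int>;   // recorded draw-call ids, in order

// Record one chunk of draw calls into its own command list.
// In D3D11 this would issue Draw*() calls on a deferred context.
CommandList recordChunk(const DrawCall* first, std::size_t count) {
    CommandList list;
    list.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        list.push_back(first[i].id);
    return list;
}

// Split one layer into chunks, record the chunks in parallel, and return
// the command lists in submission order.
std::vector<CommandList> buildCommandLists(const std::vector<DrawCall>& layer,
                                           std::size_t chunkSize = 256) {
    assert(chunkSize > 0);
    std::vector<std::future<CommandList>> jobs;
    for (std::size_t begin = 0; begin < layer.size(); begin += chunkSize) {
        const std::size_t count = std::min(chunkSize, layer.size() - begin);
        jobs.push_back(std::async(std::launch::async, recordChunk,
                                  layer.data() + begin, count));
    }
    std::vector<CommandList> lists;
    lists.reserve(jobs.size());
    for (auto& job : jobs)
        lists.push_back(job.get());     // collected in chunk order
    return lists;
}

// Play the command lists back in order, as the immediate context would.
std::vector<int> executeCommandLists(const std::vector<CommandList>& lists) {
    std::vector<int> submitted;
    for (const auto& list : lists)
        submitted.insert(submitted.end(), list.begin(), list.end());
    return submitted;
}
```

Because the lists are collected and executed in chunk order, the draw order within a layer is unchanged; only the recording cost is spread across threads.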
Frostbite DX11 - Other HW features of interest
Short term / easy:
Read-only depth buffers. Saves copy & memory.
BC6H compression for static HDR envmaps or lightmaps
BC7 compression for high-quality RGB[A] textures
Per-resource fractional MinLod. Properly fade in streamed textures.
Still works great for custom shadow map kernels and SSAO, since the depth buffer is a single component.
You can use Gather() on today’s hardware
Consider doing the port in stages
Use the HAL when you can
Software rendering isn’t fun
If you're starting with D3D 9, the D3D 10 feature level should be your first target
First, get the engine working with D3D 10.1 feature level before adding Direct3D 11 specific features
10.1 is the highest level that will work with the HAL
Next, add new features on downlevel hardware where available
Finally, some new features will need to use the reference rasterizer without D3D 11 hardware
Strategies for Transitioning to Direct3D 11
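The staged approach above amounts to asking for the most capable feature level first and falling back to what the hardware supports. The sketch below models that selection in portable C++. The enum values mirror `D3D_FEATURE_LEVEL` from the SDK headers, but `FeatureLevel`, `HardwareProbe`, `pickFeatureLevel` and `canUseHullShader` are hypothetical names; a real probe would try to create a hardware device with `D3D11CreateDevice`, which itself accepts an ordered array of feature levels.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Values mirror the D3D_FEATURE_LEVEL enum; the type itself is a stand-in.
enum class FeatureLevel : unsigned {
    Level10_0 = 0xa000,
    Level10_1 = 0xa100,
    Level11_0 = 0xb000,
};

// Hypothetical probe: returns true if a hardware (HAL) device can be
// created at the given feature level.
using HardwareProbe = std::function<bool(FeatureLevel)>;

// Walk the candidates from most to least preferred and store the first
// level the hardware accepts. Returns false if none is supported.
bool pickFeatureLevel(const std::vector<FeatureLevel>& preferred,
                      const HardwareProbe& probe,
                      FeatureLevel& chosen) {
    for (FeatureLevel level : preferred) {
        if (probe(level)) {
            chosen = level;
            return true;
        }
    }
    return false;
}

// Gate for the last porting stage: hull shaders need feature level 11.
bool canUseHullShader(FeatureLevel level) {
    return level >= FeatureLevel::Level11_0;
}
```

Code paths that depend on Direct3D 11 hardware can then be guarded by checks such as `canUseHullShader`, while everything else runs on the 10 or 10.1 feature level.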
A simple port from D3D 9 to D3D 11 will not perform well
Hopefully we’ve all learned this lesson from D3D 10
Going from D3D 9 to device feature level 10 will be a big chunk of the work
Very similar to the Direct3D 10 API
Direct3D 10 fundamentals are still important
You can still use SM 3.0 for this stage
Starting with Direct3D 9
Constant Buffers
Group constants into buffers by frequency of update
Remember: when one constant is updated, the whole buffer needs to get uploaded
State Changes
State objects are immutable for better performance
Initialize the state you need before you need it
Avoid creating lots of state objects on the fly
Resources
Resource creation and deletion is slow
Create most of your resources at the beginning
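The constant-buffer advice above (group by update frequency, because any change uploads the whole buffer) can be illustrated with a small CPU-side model. This is a sketch under assumptions: the struct layouts and the `ConstantBuffer` wrapper are hypothetical, and `flush()` only counts uploads where a real implementation would Map and Unmap the GPU buffer. The 16-byte size rule is the one Direct3D applies to constant buffer sizes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layouts, grouped by how often they change.
struct alignas(16) PerFrameConstants {    // updated once per frame
    float viewProjection[16];
    float cameraPosition[4];
};
struct alignas(16) PerObjectConstants {   // updated once per draw call
    float world[16];
    float tint[4];
};

// Constant buffer sizes must be a multiple of 16 bytes.
static_assert(sizeof(PerFrameConstants) % 16 == 0, "pad to 16 bytes");
static_assert(sizeof(PerObjectConstants) % 16 == 0, "pad to 16 bytes");

// CPU-side shadow copy with a dirty flag. flush() models the upload:
// changing any member re-sends the whole struct, never a single field.
template <typename T>
class ConstantBuffer {
public:
    void set(const T& value) { m_cpuCopy = value; m_dirty = true; }

    // Returns the number of bytes "uploaded" (0 if nothing changed).
    std::size_t flush() {
        if (!m_dirty) return 0;
        m_dirty = false;
        ++m_uploads;
        return sizeof(T);
    }

    int uploadCount() const { return m_uploads; }

private:
    T m_cpuCopy{};
    bool m_dirty = false;
    int m_uploads = 0;
};
```

With this split, drawing many objects re-uploads only the small per-object block; the per-frame block is sent once, however many draw calls follow.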
Direct3D 10 programming review
Texture Updates
Call Map() with the DO_NOT_WAIT flag to update staging textures, then CopyResource() to update the video memory texture
Do not use UpdateSubresource() – it is slow
Batch Counts
Keep batch counts low with instancing
Alpha test is now done with clip()/discard()
Don’t put this in every shader – it may disable early z!
Try to do the clip early to avoid unnecessary shader instructions
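The texture-update advice above (map a staging texture without waiting, then copy to the video memory texture) can be sketched with a mock. Everything here is a stand-in: `MockStagingTexture` and `tryUpdateTexture` are hypothetical, and the `busyFrames` counter is an invented way to model the GPU still reading the resource. In Direct3D 10 the non-blocking map is `Map()` with `D3D10_MAP_FLAG_DO_NOT_WAIT`, which returns `DXGI_ERROR_WAS_STILL_DRAWING` instead of stalling, and the copy is `CopyResource()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mock of a staging texture plus its video-memory target.
class MockStagingTexture {
public:
    explicit MockStagingTexture(std::size_t bytes)
        : m_staging(bytes, 0), m_video(bytes, 0) {}

    // Models Map() with the DO_NOT_WAIT flag: returns nullptr instead of
    // blocking while the GPU is still reading the staging data.
    unsigned char* tryMap() {
        return m_busyFrames > 0 ? nullptr : m_staging.data();
    }
    void unmap() {}

    // Models CopyResource(): the staging texture stays busy for a while.
    void copyToVideoMemory(int busyFrames) {
        m_video = m_staging;
        m_busyFrames = busyFrames;
    }
    void endFrame() { if (m_busyFrames > 0) --m_busyFrames; }

    std::size_t size() const { return m_staging.size(); }
    const std::vector<unsigned char>& videoMemory() const { return m_video; }

private:
    std::vector<unsigned char> m_staging;
    std::vector<unsigned char> m_video;
    int m_busyFrames = 0;
};

// Try to push new pixels. Returns false if the map would have blocked,
// in which case the caller does other work and retries on a later frame.
bool tryUpdateTexture(MockStagingTexture& staging,
                      const std::vector<unsigned char>& pixels,
                      int busyFrames = 2) {
    assert(pixels.size() == staging.size());
    unsigned char* dst = staging.tryMap();
    if (dst == nullptr) return false;
    std::copy(pixels.begin(), pixels.end(), dst);
    staging.unmap();
    staging.copyToVideoMemory(busyFrames);
    return true;
}
```

The point of the pattern is that a failed map costs nothing on the CPU side: the update is deferred instead of stalling the frame.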
Fairly easy port from Direct3D 10 to D3D 11 with 10 or 10.1 device feature level
You can still use the HAL
Modify the existing Direct3D 10 code to use a Rendering Context
You should only need the Immediate Context for now
Essentially just replacing API calls
Get the simple port working first
You can still use your SM 4.0 or 4.1 shaders at this point in the process
Going from Direct3D 10 to Direct3D 11
Multithreading
Requires changes to your rendering code
Add Windows multithreading support
Run deferred contexts in separate threads
Need to break up your rendering workload into logical chunks
Parallelize the command list building to improve performance
Fortunately the runtime will emulate this feature
Performance improvements may not be fully realized until new drivers and new hardware are released.
Adding in new Direct3D 11 features
Compute Shader
Post Processing
Replace your old pixel shader implementations with faster compute shader versions
Use CS 4.x on current hardware
Good for testing and backwards compatibility
Tessellation
Prototype tessellation algorithms using the ATI tessellator on Direct3D 9
Use instanced tessellation for Direct3D 11 on downlevel hardware
Consider how Tessellation will affect your art pipeline – better to prepare early
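When a pixel-shader post-process is moved to a compute shader as suggested above, the host side has to work out how many thread groups to dispatch for a given image. The sketch below shows that arithmetic. The helper names (`GroupCount`, `divideRoundUp`, `groupsForImage`, `fitsThreadGroupLimit`) and the tile sizes in the usage note are assumptions; the limits of 768 threads per group for CS 4.x and 1024 for CS 5.0 come from the Direct3D 11 headers. The resulting counts would be passed to `ID3D11DeviceContext::Dispatch`.

```cpp
#include <cassert>

// Number of thread groups to dispatch in X and Y (hypothetical helper type).
struct GroupCount {
    unsigned x;
    unsigned y;
};

// Integer division that rounds up, so partial tiles at the image edge
// still get a thread group.
constexpr unsigned divideRoundUp(unsigned value, unsigned divisor) {
    return (value + divisor - 1) / divisor;
}

// tileX and tileY must match the thread-group size declared in the shader.
GroupCount groupsForImage(unsigned width, unsigned height,
                          unsigned tileX, unsigned tileY) {
    assert(tileX > 0 && tileY > 0);
    return GroupCount{divideRoundUp(width, tileX),
                      divideRoundUp(height, tileY)};
}

// Threads-per-group limits: 768 for CS 4.x, 1024 for CS 5.0.
bool fitsThreadGroupLimit(unsigned tileX, unsigned tileY, bool cs5) {
    const unsigned limit = cs5 ? 1024u : 768u;
    return tileX * tileY <= limit;
}
```

For example, a 1280x720 target with 16x16 tiles needs 80 by 45 groups. A 32x32 tile fits CS 5.0 but exceeds the CS 4.x limit, so a shader meant to run on both needs a smaller group size on downlevel hardware.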
Add new features that require Direct3D 11 hardware
Not too difficult, since you’ve already done most of the work!
Tessellation
Simplify your algorithms by using the hull shader
Compute Shader
Start using CS 5.0
More local storage, write anywhere, can output to textures
Multithreading
Should automatically see improvements with new hardware and drivers
Full Direct3D 11 Implementation
Direct3D 11 features will improve your game
Multithreading, Compute Shader, Tessellation and more
Current Hardware will take you close to a full Direct3D 11 implementation
Downlevel support is good for prototyping and for backwards compatibility
Have your game ready to ship when Direct3D 11 ships
Windows 7 and powerful new hardware will help spotlight your game!
There’s nothing stopping you from starting now
Johan Andersson, DICE – advice on porting to D3D 11
Nicholas Thibieroz, AMD – Compute Shader
Holger Gruen, Efficient Tessellation on the GPU through Instancing, Journal of Game Development Volume 1, Issue 3, December 2005
Tatarchuk, Barczak, Bilodeau, Programming for Real-Time Tessellation on GPU, 2009 AMD whitepaper on tessellation
Microsoft Corporation, DirectX 11 Software Development Kit, November, 2008
Acknowledgements
Trademark Attribution
AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. Other names used in this presentation are for identification purposes only and may be trademarks of their respective owners.