D3D10 Unleashed: New Features and Effects

    D3D10 Unleashed: New Features and Effects - Presentation Transcript

    1. D3D10 Unleashed: New Features and Effects. David Tuft, Program Manager, Direct3D
    2. Outline
      • Direct3D 10
        • Design Imperatives
        • Features and Capabilities
        • Applications
    3. The Situation Today
      • No single graphics hardware target
      • CPU-bound games and applications
        • Bandwidth and CPU cycles are the bottleneck in multiple areas (physics, AI)
        • Large amount of CPU resources spent directing the GPU
      • GPU overly-specialized
    4. Direct3D 10 Unleashing the power of the GPU
      • Consistency – guarantee a common feature-set with strict requirements
      • Performance –
        • Render MORE: objects, materials, clutter, vegetation, shadows
        • With LESS: CPU cycles, stalls, and bandwidth cost
      • Visual Effects – unprecedented graphics
      • Capability – empower the GPU to handle a new series of applications
    5. Consistency
      • Completely Re-architected
      • NO CAPS!
      • No more fixed function
      • Strict rasterization and floating point rules
      • Logical, straightforward, but still powerful
      • Core layer
        • Validation moved from set and draw time to create time
      • Debug layer
        • No behavior changes between layers
        • No perf hit when disabled
      • Switch-to-reference layer
      • Thread-safe layer
      Consistency Layered Design
    6. Consistency
      • No half texel offset!
      • Texture coordinates match the pixel positions
      • State grouping
      • Validation
        • Minimal at draw time
        • Necessary at state creation time
        • Lots in debug layer
      • Lower API overhead means a smaller hit at draw time
      Performance Small Batch
      • The cores are yours
        • Create all needed depth stencil objects at init time
        • Set when needed
      • Direct3D 9: one call per state
        • SetRenderState( D3DRS_STENCILENABLE )
        • SetRenderState( D3DRS_STENCILMASK )
      • Direct3D 10: one object per state group
        • ID3D10Device::CreateDepthStencilState()
        • OMSetDepthStencilState()
      • State grouped to match hardware
      Performance State Grouping: Depth Stencil
      • Depth-stencil state object fields: DepthEnable, DepthFunc, DepthWriteMask, StencilEnable, StencilReadMask, StencilWriteMask, FrontFace, BackFace
      • Reduce state-change overhead by grouping state into immutable objects
      • Input layout
        • Format, Offset, InstanceDataStepRate, …
      • Rasterizer
        • Cull Mode, Multisample Enable, Fill Mode, …
      • DepthStencil
        • Depth Enable, Depth Func, Stencil Masks, …
      • Blend
        • SrcBlend, DestBlend, BlendOp, …
      • Sampler (No longer bound to a specific texture)
        • Filter Mode, MinLOD, MaxLOD,…
      Performance State Grouping
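The state-object idea above can be modeled in a few lines of portable C++. This is only a sketch of the pattern (validate once at creation, bind cheaply afterwards), not the Direct3D 10 runtime: `DepthStencilDesc`, `DepthStencilState`, and `Device` here are hypothetical stand-ins that borrow the slide's names, and the 0..7 range for `depthFunc` is an assumed encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a depth-stencil description (field names follow the slide).
struct DepthStencilDesc {
    bool         depthEnable      = true;
    int          depthFunc        = 1;      // assumed encoding: 0..7 selects a comparison
    bool         stencilEnable    = false;
    std::uint8_t stencilReadMask  = 0xFF;
    std::uint8_t stencilWriteMask = 0xFF;
};

// Immutable after creation: the description is const and only Device can construct it.
class DepthStencilState {
public:
    const DepthStencilDesc desc;
private:
    explicit DepthStencilState(const DepthStencilDesc& d) : desc(d) {}
    friend class Device;
};

class Device {
public:
    // All validation happens here, once per state object.
    std::shared_ptr<const DepthStencilState> CreateDepthStencilState(const DepthStencilDesc& d) {
        ++validations_;
        if (d.depthFunc < 0 || d.depthFunc > 7)
            throw std::invalid_argument("depthFunc out of range");
        return std::shared_ptr<const DepthStencilState>(new DepthStencilState(d));
    }
    // Binding is a pointer swap; nothing is re-validated.
    void OMSetDepthStencilState(std::shared_ptr<const DepthStencilState> s) {
        bound_ = std::move(s);
    }
    int validations() const { return validations_; }
    const DepthStencilState* bound() const { return bound_.get(); }
private:
    int validations_ = 0;
    std::shared_ptr<const DepthStencilState> bound_;
};
```

Binding the same object a thousand times in this model still costs one validation, which is the point the slide makes about grouping state into immutable objects.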
      • D3D10_USAGE_
        • IMMUTABLE = never updated
          • Contents supplied at create time
          • Bound as input
        • DEFAULT = updated less than once per frame
          • Update via UpdateSubresource
        • DYNAMIC = updated at least once per frame
          • Update with Map / Unmap
        • STAGING = fast read-back path
          • Can’t be bound to the pipeline or written by it
      Performance Resource Usage
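The usage rules above amount to a small decision table. The sketch below encodes it in plain C++; the `Usage` enum, `ChooseUsage`, and `UpdatePath` are hypothetical helpers written for illustration, and the thresholds are taken directly from the bullets rather than from any measured guidance.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the four usage categories named on the slide.
enum class Usage { Immutable, Default, Dynamic, Staging };

// updatesPerFrame: average number of CPU updates per frame (0 means never after creation).
// cpuReadback: true when the CPU needs to read the data back.
Usage ChooseUsage(double updatesPerFrame, bool cpuReadback) {
    if (cpuReadback)            return Usage::Staging;
    if (updatesPerFrame <= 0.0) return Usage::Immutable;
    if (updatesPerFrame < 1.0)  return Usage::Default;
    return Usage::Dynamic;
}

// The update mechanism the slide pairs with each category.
std::string UpdatePath(Usage u) {
    switch (u) {
        case Usage::Immutable: return "initial data at create time";
        case Usage::Default:   return "UpdateSubresource";
        case Usage::Dynamic:   return "Map / Unmap";
        case Usage::Staging:   return "copy, then CPU read back";
    }
    return "";
}
```

For example, a terrain height map that never changes would land in `Immutable`, while a per-frame particle vertex buffer would land in `Dynamic`.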
      • Texture Arrays
      • Format Reinterpretation
      • Stream Output
      • Resource Views
      • Input Assembler
      • Immediate offset on Memory Access
        • Integer/Bitwise Instructions
        • Comparison Filtering
      • Constant Buffers
      • State Objects
      • Shared-Exponent HDR Compression (RGBE)
      • Block-Compressed Formats for bump/normal maps
      • 128 texture slots
      • 8 Render targets
      • More interstage communication
      • Instance, Vertex, Primitive identifiers
      • Per-primitive Clip distance
      • Predicated Rendering
      • Alpha-to-Coverage
      • Multisample Readback
      • Better cubemap filtering
      • …
      Visual Effects GPU Features!
      [Pipeline diagram. Stages: Input Assembler, Vertex Shader, Geometry Shader, Stream Output, Rasterizer/Interpolator, Pixel Shader, Output Merger. Resources: Vertex Buffer, Index Buffer, Texture, Depth/Stencil, Render Target]
      • New Unified Shader Core
        • All shader stages have the same functionality
        • On some cards, all shader stages use the same cores
      • Comparison-Sample instruction
        • Percentage-Closer shadow Filtering
      • Immediate offset (up to +/-8) on Texture/Buffer load
        • Custom filter kernels
      • Resource info
        • Returns height, width, # of miplevels, arraysize for the resource view
      • More of everything
        • Inter-stage registers, samplers, textures
        • Unlimited instruction count
      Visual Effects The Shader Core
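To make the comparison-sample bullet concrete, here is a CPU-side sketch of percentage-closer filtering: each of the four neighboring shadow-map texels is compared against a reference depth first, and the pass/fail results are then bilinearly blended. It is an illustration of the technique, not the hardware's exact behavior; `ShadowMap`, `SampleCmpLessEqual`, the clamp addressing, and the "reference <= texel" convention are all assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-channel depth map with clamp-to-edge addressing.
struct ShadowMap {
    int width, height;
    std::vector<float> depth;
    float At(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return depth[static_cast<std::size_t>(y) * width + x];
    }
};

// Returns the lit fraction in [0, 1]: compare first, then filter the 0/1 results.
float SampleCmpLessEqual(const ShadowMap& m, float u, float v, float refDepth) {
    // Texel centers sit at (i + 0.5) / size, so shift by half a texel.
    const float x = u * m.width - 0.5f;
    const float y = v * m.height - 0.5f;
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    const float fx = x - x0;
    const float fy = y - y0;
    auto pass = [&](int tx, int ty) { return refDepth <= m.At(tx, ty) ? 1.0f : 0.0f; };
    const float top = pass(x0, y0) * (1.0f - fx) + pass(x0 + 1, y0) * fx;
    const float bot = pass(x0, y0 + 1) * (1.0f - fx) + pass(x0 + 1, y0 + 1) * fx;
    return top * (1.0f - fy) + bot * fy;
}
```

Filtering the comparison results, rather than the depths themselves, is what produces a soft edge at shadow boundaries.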
    7. Shader Model 4.0 A new level of programmability
      • Full integer/bitwise instruction set
        • Massively parallel image and data processing
        • Custom decompression schemes
      • Buffer load – CPU-like unfiltered memory access
      • Switch statements
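As an example of what integer and bitwise instructions make possible, the sketch below unpacks a shared-exponent color from a 32-bit word using only shifts and masks, the kind of custom decompression a Shader Model 4.0 shader could perform on raw buffer data. It is written in C++ for illustration. The bit layout (three 9-bit mantissas in the low bits, a 5-bit exponent on top, bias 15) is an assumption of this sketch, and `Pack` and `DecodeSharedExponent` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Rgb { float r, g, b; };

// Assumed layout: R in bits 0-8, G in bits 9-17, B in bits 18-26, exponent in bits 27-31.
std::uint32_t Pack(std::uint32_t r, std::uint32_t g, std::uint32_t b, std::uint32_t e) {
    return (r & 0x1FFu) | ((g & 0x1FFu) << 9) | ((b & 0x1FFu) << 18) | ((e & 0x1Fu) << 27);
}

// Each channel is mantissa * 2^(exponent - bias - mantissaBits), with bias 15 and 9 mantissa bits.
Rgb DecodeSharedExponent(std::uint32_t packed) {
    const std::uint32_t r = packed & 0x1FFu;
    const std::uint32_t g = (packed >> 9) & 0x1FFu;
    const std::uint32_t b = (packed >> 18) & 0x1FFu;
    const int e = static_cast<int>(packed >> 27);
    const float scale = std::ldexp(1.0f, e - 15 - 9);
    return { r * scale, g * scale, b * scale };
}
```

All three channels share one exponent, so the format trades per-channel range for a compact 32-bit footprint.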
    8. New Resource Types: Texture Arrays
      • Dynamically indexable in the shader
      • Whole array can be set as a render target or as a texture input
      • Views enable interpretation of resources at different bind locations
      Resource Views Resource views example: cubemap
    9. Resource Views
      • Resources in D3D10 are generally typeless
      • A resource must be interpreted as a specific type by obtaining a view of the resource
      • Allows you to reinterpret data in a different format
      • Forces type validation earlier in setup
        • Don’t have to re-validate on every draw
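A rough model of the view concept, in portable C++: the resource is a bag of bytes, a typed view is checked once when it is created, and loads through the view do no further checking. `TypelessBuffer` and `View<T>` are hypothetical types invented for this sketch, which only models 32-bit element formats and assumes IEEE 754 floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical typeless resource: just bytes, no element type attached.
struct TypelessBuffer { std::vector<std::uint8_t> bytes; };

// A typed interpretation of a typeless resource.
template <typename T>
class View {
    static_assert(sizeof(T) == 4, "this sketch only models 32-bit element formats");
public:
    // Type and size validation happens once, at view creation.
    explicit View(const TypelessBuffer& res) : res_(&res) {
        if (res.bytes.size() % sizeof(T) != 0)
            throw std::invalid_argument("buffer size is not a whole number of elements");
    }
    std::size_t Count() const { return res_->bytes.size() / sizeof(T); }
    // No per-access validation: the caller must keep i < Count().
    T Load(std::size_t i) const {
        T value;
        std::memcpy(&value, res_->bytes.data() + i * sizeof(T), sizeof(T));
        return value;
    }
private:
    const TypelessBuffer* res_;
};
```

Two views of different types over the same buffer read the same bits, which is the format reinterpretation the slide describes.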
    10. Geometry Shader Amplification and De-Amplification
      • Emits primitives of a specified output type (point, linestrip, trianglestrip)
        • Limited geometry amplification/de-amplification: Output 0-1024 values per invocation
      • No more 1-in / 1-out limit!
        • Shadow Volumes
        • Fur/Fins
        • Procedural Geometry/Detailing
        • All-GPU Particle Systems
        • Point Sprites
      Geometry Shader
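The point-sprite case from the list above is the simplest way to see amplification and de-amplification. The C++ sketch below plays the role of a geometry shader invocation on the CPU: one point comes in, and either a four-vertex triangle strip or nothing goes out. `ExpandPoint` and `FitsOutputBudget` are hypothetical helpers, and the budget check simply restates the slide's 1024-value limit as vertices times scalars per vertex.

```cpp
#include <cassert>
#include <vector>

struct Vec2 { float x, y; };

// One point in; a 4-vertex triangle strip (a quad) or nothing out.
std::vector<Vec2> ExpandPoint(const Vec2& p, float halfSize) {
    std::vector<Vec2> strip;
    if (halfSize <= 0.0f) return strip;                  // de-amplification: emit nothing
    strip.push_back({p.x - halfSize, p.y - halfSize});   // amplification: 1 vertex becomes 4
    strip.push_back({p.x - halfSize, p.y + halfSize});
    strip.push_back({p.x + halfSize, p.y - halfSize});
    strip.push_back({p.x + halfSize, p.y + halfSize});
    return strip;
}

// Checks a declared maximum output against the 1024-value limit quoted on the slide.
constexpr bool FitsOutputBudget(int maxVertexCount, int scalarsPerVertex) {
    return maxVertexCount * scalarsPerVertex <= 1024;
}
```

With 8 scalars per vertex, for instance, the budget allows up to 128 output vertices per invocation.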
    11. The New Pipeline Direct3D10 – Geometry Shader
      • Access to the whole primitive
        • Triangle
        • Line
        • Point
      • With adjacency
    12. Geometry Shader Example Shadow volume generation
    13. Geometry Shader Example Generalized displacement maps
      • Normal mapping (Direct3D 9)
    14. Geometry Shader Example Generalized displacement maps
      • Displacement Mapping (Direct3D 10)
    15. Render-To-Volume Geometry Shader
    16. Stream Out
      • Amplification from GS/VS can be directed into a buffer
      • Generated geometry easily redrawn using DrawAuto() command with no CPU intervention
      DrawAuto()
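The stream-out loop can be sketched on the CPU as two buffers that swap roles each frame. In the model below, `AdvanceAndStreamOut` stands in for a geometry shader pass with stream output, and `DrawAuto` stands in for the real call by taking its vertex count from the buffer itself instead of from the application. All names, the `Particle` layout, and the spawn rule are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { float y; float vy; int age; };

// Stand-in for a stream-output buffer: the vertex count travels with the data.
struct StreamBuffer { std::vector<Particle> verts; };

// Plays the role of a GS + stream-output pass: reads src, writes a new vertex stream to dst.
void AdvanceAndStreamOut(const StreamBuffer& src, StreamBuffer& dst, float dt, int maxAge) {
    dst.verts.clear();
    for (const Particle& p : src.verts) {
        if (p.age >= maxAge) continue;                      // de-amplification: drop dead particles
        const Particle q{p.y + p.vy * dt, p.vy, p.age + 1};
        dst.verts.push_back(q);
        if (q.age == 1) dst.verts.push_back({q.y, -q.vy, 1}); // amplification: spawn one child
    }
}

// Draws however many vertices the last pass produced; the caller never supplies a count.
std::size_t DrawAuto(const StreamBuffer& buf) { return buf.verts.size(); }
```

A frame in this model is `AdvanceAndStreamOut(a, b, ...)`, then `DrawAuto(b)`, then swapping `a` and `b`; the application never reads the particle count back.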
    17. FX10
      • D3D 10 runtime is optimized, and it is significantly faster and leaner
      • No hidden performance cliffs; Any slow paths will be reported in debug
      • Better reflection
      • You can retrieve almost anything from an Effect
      • Reflection metadata can be discarded, leaving no performance or memory cost at run time
    18. FX10 Pipeline Requirements
        • All State Commands:
        • IASetVertexBuffers/SetIndexBuffer
        • IASetPrimitiveTopology
        • {VS|GS|PS}SetShader
        • {VS|GS|PS}SetShaderResources
        • {VS|GS|PS}SetConstantBuffers
        • {VS|GS|PS}SetSamplers
        • SOSetTargets
        • RSSetState
        • RSSetViewports/ScissorRects
        • OMSetRenderTargets
        • OMSetBlendState
        • OMSetDepthStencilState
    20. Constant Buffers
      • Constants now managed like vertex/texture data
        • Updated efficiently via map/discard or UpdateSubresource
        • Set like any other resource
      • Up to 4096 4-channel × 32-bit elements per CB
      • Create as many CBs as you want; 16 can be bound to a shader at once
      [Diagram: Constant Buffers. Buffers labeled A, B, C, D; shaders labeled Shader A and Shader B]
    21. Constant Buffers
      • Example HLSL Syntax
          cbuffer myObject
          {
              float4x4 matWorld;
              float3   vObjectPosition;
              int      arrayIndex;
          }
          cbuffer myScene
          {
              float3   vSunPosition;
              float4x4 matView;
          }
      • Variables still exist in the global namespace
        • arrayIndex = 4; (valid)
        • myObject.arrayIndex = 4; (not valid: the cbuffer name is not a namespace)
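On the CPU side, the two cbuffer declarations above need matching structures. The sketch below shows one plausible pair of C++ mirrors, assuming the default HLSL packing rule that variables fill 16-byte registers and never straddle a register boundary, and assuming 4-byte `float` and `int`. The struct names and the `pad0` member are introduced here for illustration; matrix element order (row or column major) must also agree between the two sides and is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirror of cbuffer myObject.
struct MyObject {
    float        matWorld[16];        // registers c0-c3
    float        vObjectPosition[3];  // c4.xyz
    std::int32_t arrayIndex;          // c4.w: packs into the register left open by the float3
};

// Mirror of cbuffer myScene.
struct MyScene {
    float vSunPosition[3];  // c0.xyz
    float pad0;             // c0.w: explicit padding, the matrix cannot straddle a register
    float matView[16];      // c1-c4
};

static_assert(sizeof(MyObject) == 80, "5 registers of 16 bytes");
static_assert(offsetof(MyObject, vObjectPosition) == 64, "float3 starts register c4");
static_assert(offsetof(MyObject, arrayIndex) == 76, "int shares c4 with the float3");
static_assert(sizeof(MyScene) == 80, "5 registers of 16 bytes");
static_assert(offsetof(MyScene, matView) == 16, "matrix starts on a fresh register");

// Upper bound from the slide: 4096 elements of 4 channels x 32 bits.
constexpr std::size_t kMaxConstantBufferBytes = 4096 * 4 * sizeof(float);
```

Note that reordering `myScene` so the matrix comes first would remove the need for padding in the middle of the struct.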
    22. Additional Features
    23. Queries & Predicates
      • Many events and stats gathered by runtime
        • Command completion
        • Object Occlusion (in samples rendered)
        • Pipeline Stats
      • Commands can be queued depending on the result of the query
        • Called a Predicate
    24. Example: Predicated Rendering
      • Depending on the result of an occlusion query on bounding geometry (OCCLUSIONPREDICATE), queue the rendering of a more complex object
        • No CPU involvement required
      • Use PREDICATEHINT to avoid an accidental pipeline stall while waiting for the query result
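The decision the GPU makes for a predicated draw can be written out as a small state table. The C++ sketch below is a simplified model under stated assumptions, not the runtime's behavior in every case: `QueryResult`, `Outcome`, and `PredicatedDraw` are hypothetical names, and the model assumes that a pending result either stalls the pipeline or, when the hint is set, lets the draw go ahead.

```cpp
#include <cassert>

// State of the occlusion query on the bounding geometry.
enum class QueryResult { Pending, Occluded, Visible };

// What happens to the expensive draw that the predicate guards.
enum class Outcome { Drawn, Skipped, Stalled };

Outcome PredicatedDraw(QueryResult bounds, bool predicateHint, int& drawCalls) {
    if (bounds == QueryResult::Pending) {
        if (!predicateHint) return Outcome::Stalled;  // must wait for the query to finish
        ++drawCalls;                                  // hint: draw anyway instead of waiting
        return Outcome::Drawn;
    }
    if (bounds == QueryResult::Visible) {
        ++drawCalls;
        return Outcome::Drawn;
    }
    return Outcome::Skipped;                          // bounding geometry fully occluded
}
```

The trade-off with the hint is that an occasional hidden object gets drawn, in exchange for never blocking on a query that has not completed.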
    25.  
    26.  
    27. Direct3D 10 GPU material management
      • Render a multitude of unique materials without taxing the CPU
        • Unlimited instruction length
        • Switch statements
        • Texture arrays
        • Geometry shader
        • Constant buffers
        • Access to material descriptions
    28. Learn About DX10 Material Systems
    29. The Great Divide The PCI “Express”
      • Use the GPU when
        • You can’t pay for bandwidth
        • The cores are busy
      • Particle system animation
      • Collision Detection
      • DSP effects
        • Convolution
        • Bloom
      • Advanced Rendering
      CPU GPU
      • Multiple ways to implement
        • Geometry shader Stream Out
          • Amplification, processing done in GS or VS
        • Render to Texture, with vertex shader position lookup
          • Processing done in PS
      Capability Particle System
    30. Back to our Goals
      • Fewer calls needed
        • Geometry Shaders/ Constant Buffers/ Texture Arrays…
      • Remaining calls are fast
        • Massive reduction in state and validation overhead:
          • Validation on CREATION, not on binding
          • Views, State Objects
      • Avoid CPU intervention
        • Predicated Draw()
        • DrawAuto()
      • Lean’n’mean runtime, refactored for performance
      Small Batch Performance
    31. Strict Specification
      • Strictly-defined, consistent behavior throughout the pipeline
        • IEEE floating-point compliance
          • Includes IEEE754R NaN-quashing Min/Max instructions
          • Precise FP32 sampling/blending/math/conversion rules. Ex:
            • FP32 shader ops – precise to 1.0 ULP
            • FP32 to Integer – precise to 0.6 ULP per op
            • FP16 blending - precise to 0.6 ULP per op
        • FP32 blending required
        • Exact line/triangle/AA rasterization rules
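The NaN-quashing min/max rule is easy to demonstrate on the CPU, because the C++ standard library's `std::fmin` and `std::fmax` follow the same convention: when exactly one operand is NaN, the other operand is returned. The sketch below contrasts that with a naive comparison-based minimum, whose result depends on operand order. `QuashMin`, `QuashMax`, and `NaiveMin` are illustrative names, and the example assumes the compiler is not run with fast-math style flags that relax NaN handling.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// NaN-quashing: if exactly one operand is NaN, the other operand is returned.
float QuashMin(float a, float b) { return std::fmin(a, b); }
float QuashMax(float a, float b) { return std::fmax(a, b); }

// Comparison-based minimum: any comparison with NaN is false, so the result
// depends on which operand happens to be the NaN.
float NaiveMin(float a, float b) { return a < b ? a : b; }
```

A strictly specified rule like this is what lets the same shader produce the same result on every conforming GPU.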
    32. GPU Exploitation
      • API specifically designed to enable pushing any computations onto GPU
        • Of course, it’s up to you!
      • Extra pipeline stage: Geometry Shader
      • Minimized CPU interaction.
        • Get the pipeline flowing and leave it alone
    33. Call to Action
      • Try out the new API
      • Get the SDK!
      • http://msdn2.microsoft.com/en-us/xna/aa937788.aspx
    34. © 2007 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary. www.xna.com