Modern Graphics Pipeline Overview

An overview of a typical graphics pipeline for current GPU hardware

    Presentation Transcript

    • A Brief Overview of the Graphics Pipeline
      Cedric Lee
    • What is a graphics pipeline?
      [Diagram: 3D Scene → Stage → Stage → Stage → Raster Image]
      ● Hardware, real-time / interactive rendering
      ● Popular APIs: OpenGL and DirectX
    • Overview
      ● Basic Graphics Pipeline
      ● Modern Graphics Pipeline
      ● Beyond Pipelining
      ● The New Wave
    • Basic Graphics Pipeline
      ● Use case:
        ● Render a textured mesh with per-pixel lighting
        ● Ambient light, 1 directional light, 1 point light, no shadows
        ● Assume z-buffer based architecture
    • 3D Scene
      ● Surface
        ● Triangle mesh
          – Vertices and indices
          – Per-vertex position, normal
        ● Position + orientation (world matrix)
      ● Material
        ● Per-vertex uv, tangent, binormal
        ● Diffuse + normal maps
      ● Diffuse lighting (direction, colour)
      ● Camera (view + projection matrices)
    • Vertex Fetching
      [Diagram: Vertex Stream + Index Stream → Input Assembler → Per-Vertex data]
      ● Per-Vertex: Position-OS, Normal-OS, Tangent-OS, Binormal-OS, Texture UV
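The vertex fetching step above can be sketched in a few lines of Python. This is a minimal illustration, assuming a triangle-list topology; the `Vertex` type and the `assemble_triangles` helper are hypothetical names, and the tangent and binormal attributes are left out for brevity.

```python
from typing import NamedTuple, Sequence, Tuple


class Vertex(NamedTuple):
    position_os: Tuple[float, float, float]  # object-space position
    normal_os: Tuple[float, float, float]    # object-space normal
    uv: Tuple[float, float]                  # texture coordinates


def assemble_triangles(vertex_stream: Sequence[Vertex], index_stream: Sequence[int]):
    """Fetch vertices by index and group them in threes (triangle list assumed)."""
    if len(index_stream) % 3 != 0:
        raise ValueError("a triangle list needs a multiple of 3 indices")
    return [
        tuple(vertex_stream[i] for i in index_stream[start:start + 3])
        for start in range(0, len(index_stream), 3)
    ]


# A unit quad: 4 vertices, 6 indices. Vertices 0 and 2 are shared by both triangles.
quad_vertices = [
    Vertex((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0)),
    Vertex((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0)),
    Vertex((1.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0)),
    Vertex((0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0)),
]
quad_indices = [0, 1, 2, 0, 2, 3]
triangles = assemble_triangles(quad_vertices, quad_indices)
```

The index stream lets the two triangles reuse shared vertices instead of storing them twice.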
    • Vertex Processing
      [Diagram: Per-Vertex inputs + Uniform Constants → Vertex Shader → Per-Vertex outputs]
      ● Per-Vertex inputs: Position-OS, Normal-OS, Tangent-OS, Binormal-OS, Texture UV
      ● Uniform Constants: World Matrix, View Matrix, Projection Matrix
      ● Per-Vertex outputs: Position-WS, Position-SS, Normal-WS, Tangent-WS, Binormal-WS, Texture UV
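As a rough sketch of what the vertex shader on this slide computes, the Python below transforms one vertex with the three uniform matrices. It assumes column vectors and row-major 4x4 matrices stored as lists of rows; all function names are illustrative. It stops at clip space: reaching Position-SS would additionally need the perspective divide and viewport transform, which are not shown. Transforming the normal with the world matrix is only valid when that matrix contains rotation, translation and uniform scale.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def translation(tx, ty, tz):
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m


def vertex_shader(position_os, normal_os, world, view, projection):
    # Positions use w = 1 so that translation applies.
    position_ws = mat_vec(world, list(position_os) + [1.0])
    position_clip = mat_vec(projection, mat_vec(view, position_ws))
    # Directions use w = 0 so that translation is ignored.
    normal_ws = mat_vec(world, list(normal_os) + [0.0])
    return position_ws[:3], position_clip, normal_ws[:3]


# Hypothetical usage: move a vertex by (1, 2, 3) with identity view and projection.
p_ws, p_clip, n_ws = vertex_shader(
    (1.0, 0.0, 0.0), (0.0, 0.0, 1.0),
    translation(1.0, 2.0, 3.0), identity(), identity(),
)
```

Tangent and binormal would be transformed the same way as the normal.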
    • Scan Conversion
      [Diagram: Per-Vertex data → Trivial Reject → Rasterizer → Per-Pixel data]
      ● Per-Vertex: Position-WS, Position-SS, Normal-WS, Tangent-WS, Binormal-WS, Texture UV
      ● Per-Pixel: Position-WS, Position-SS, Normal-WS, Tangent-WS, Binormal-WS, Texture UV
      ● Operations: viewport clipping, back-face culling, early Z rejection, interpolation
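A minimal sketch of scan conversion follows, covering back-face culling, clamping to the viewport, and attribute interpolation. It assumes a y-up screen space in which counter-clockwise triangles are front-facing, samples at pixel centres, and uses plain (not perspective-correct) barycentric interpolation. Early Z rejection and a proper fill rule are omitted, and the function names are illustrative.

```python
import math


def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p); positive if counter-clockwise (y up)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri, attrs, width, height):
    """tri: three (x, y) screen positions; attrs: one float per vertex.
    Returns a list of (x, y, interpolated_attr) for covered pixels."""
    a, b, c = tri
    area = edge(a, b, c)
    if area <= 0:
        return []  # back-facing or degenerate: culled

    # Bounding box of the triangle, clamped to the viewport.
    x0 = max(math.floor(min(a[0], b[0], c[0])), 0)
    x1 = min(math.ceil(max(a[0], b[0], c[0])), width - 1)
    y0 = max(math.floor(min(a[1], b[1], c[1])), 0)
    y1 = min(math.ceil(max(a[1], b[1], c[1])), height - 1)

    fragments = []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            p = (x + 0.5, y + 0.5)  # pixel centre
            w0, w1, w2 = edge(b, c, p), edge(c, a, p), edge(a, b, p)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                # Barycentric weights are the edge values divided by the full area.
                value = (w0 * attrs[0] + w1 * attrs[1] + w2 * attrs[2]) / area
                fragments.append((x, y, value))
    return fragments


front = rasterize(((0, 0), (4, 0), (0, 4)), (0.0, 4.0, 0.0), 4, 4)
back = rasterize(((0, 0), (0, 4), (4, 0)), (0.0, 4.0, 0.0), 4, 4)
```

In this example the attribute equals the x coordinate at each vertex, so each fragment's interpolated value is its pixel-centre x.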
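    • Pixel Processing
      [Diagram: Per-Pixel inputs + Textures + Uniform Constants → Pixel Shader (texturing, lighting) → Per-Pixel outputs]
      ● Per-Pixel inputs: Position-WS, Position-SS, Normal-WS, Tangent-WS, Binormal-WS, Texture UV
      ● Textures: Diffuse, Normal
      ● Uniform Constants: ambient light colour, directional light colour, directional light direction, point light colour, point light position
      ● Per-Pixel outputs: Depth, Colour, Alpha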
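The lighting part of the pixel shader could look roughly like the sketch below, which evaluates the ambient, directional and point light terms from the use case with a simple Lambert diffuse model. It makes several assumptions that the slides do not state: `diffuse_texel` stands in for a sample of the diffuse map, the normal is taken directly in world space rather than from the normal map via the tangent frame, and the point light has no distance attenuation.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade_pixel(position_ws, normal_ws, diffuse_texel,
                ambient_colour, dir_light_colour, dir_light_dir,
                point_light_colour, point_light_pos):
    n = normalize(normal_ws)

    # dir_light_dir is the direction the light travels, so negate it
    # to get the vector from the surface towards the light.
    to_dir_light = normalize([-c for c in dir_light_dir])
    dir_term = max(dot(n, to_dir_light), 0.0)

    # Point light: vector from the shaded position to the light position.
    to_point_light = normalize([lp - p for lp, p in zip(point_light_pos, position_ws)])
    point_term = max(dot(n, to_point_light), 0.0)

    return tuple(
        diffuse_texel[i] * (ambient_colour[i]
                            + dir_light_colour[i] * dir_term
                            + point_light_colour[i] * point_term)
        for i in range(3)
    )


# Hypothetical inputs: surface facing +z, directional light shining down -z,
# point light switched off (black).
lit = shade_pixel((0, 0, 0), (0, 0, 2), (0.5, 0.5, 0.5),
                  (0.1, 0.1, 0.1), (1, 1, 1), (0, 0, -1),
                  (0, 0, 0), (0, 0, 5))
```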
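    • Raster Operators (ROPs)
      [Diagram: Per-Pixel data → Depth Test, Alpha Test, Alpha Blend → Frame buffer / render targets]
      ● Per-Pixel inputs: Depth, Colour, Alpha
      ● Operations: Depth Test, Alpha Test, Alpha Blend
      ● Frame buffer / render targets: Depth Buffer, Colour Buffer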
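The raster operations listed on this slide can be sketched as a single write function. This is an illustration under assumed settings, not a description of any particular hardware: the depth test is "less than", the alpha test discards fragments with alpha at or below a hypothetical `alpha_ref` threshold, and blending is the common "source alpha, one minus source alpha" mode.

```python
def rop_write(depth_buffer, colour_buffer, x, y, depth, colour, alpha, alpha_ref=0.0):
    """Apply depth test, alpha test and alpha blend for one fragment.
    Returns True if the fragment was written."""
    # Depth test: keep only fragments nearer than what is already stored.
    if depth >= depth_buffer[y][x]:
        return False
    # Alpha test: discard (nearly) transparent fragments.
    if alpha <= alpha_ref:
        return False
    # Both tests passed: update depth and blend the colour over the destination.
    depth_buffer[y][x] = depth
    dst = colour_buffer[y][x]
    colour_buffer[y][x] = tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(colour, dst))
    return True


# A 1x1 frame buffer cleared to far depth and black.
depth_buffer = [[1.0]]
colour_buffer = [[(0.0, 0.0, 0.0)]]
first = rop_write(depth_buffer, colour_buffer, 0, 0, 0.5, (1.0, 0.0, 0.0), 0.5)
second = rop_write(depth_buffer, colour_buffer, 0, 0, 0.7, (0.0, 1.0, 0.0), 1.0)
```

The second fragment is farther away than the first, so it fails the depth test and leaves both buffers unchanged.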
    • Modern GPU Pipeline
      ● Programmable units
      ● Vertex shaders, pixel shaders
      ● DX10: Geometry Shader
        ● Kill/emit vertices, primitives
        ● Ex. displacement mapping, fur, 1-pass render to cube map
    • Modern GPU Pipeline
      ● Unified shader architecture
        ● Common shading cores shared between Vertex, Geometry and Pixel shading units
        ● Scheduler distributes work
        ● Load balancing
    • http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf
    • Modern GPU Pipeline
      ● Bandwidth:
        ● Hierarchical Z
        ● PS3: Compressed Z and colour to reduce bandwidth for MSAA reads
        ● X360: in-GPU EDRAM – lots of bandwidth
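To illustrate the idea behind Hierarchical Z, the sketch below keeps one conservative depth value per tile and uses it to reject work without reading the full-resolution depth buffer. It assumes a "less than" depth test, so each tile stores its farthest (maximum) depth. The tile size and function names are hypothetical, and real hardware implementations differ in detail.

```python
def build_hiz(depth_buffer, tile):
    """Build a coarse buffer holding the maximum depth of each tile x tile block."""
    height, width = len(depth_buffer), len(depth_buffer[0])
    hiz = []
    for ty in range(0, height, tile):
        row = []
        for tx in range(0, width, tile):
            row.append(max(
                depth_buffer[y][x]
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            ))
        hiz.append(row)
    return hiz


def tile_rejected(hiz, tile_x, tile_y, primitive_min_depth):
    """True if the primitive cannot pass a 'less than' depth test anywhere in the tile."""
    return primitive_min_depth >= hiz[tile_y][tile_x]


# 4x4 depth buffer: the top-left 2x2 tile is covered by near geometry (0.2),
# everything else is still at the far plane (1.0).
depth = [
    [0.2, 0.2, 1.0, 1.0],
    [0.2, 0.2, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
hiz = build_hiz(depth, 2)
```

A primitive whose nearest depth is 0.5 is rejected in the top-left tile with one coarse read, which is where the bandwidth saving comes from.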
    • Modern GPU Pipeline
      ● CUDA / DX11 Compute Shader
        ● Stream processing (GPGPU)
        ● Exposes shading functionality
        ● Arbitrary memory reads
    • Modern GPU
      ● More memory, processing units
      ● More floating point formats, fewer usage restrictions
      ● More render targets (8)
      ● Longer shaders
      ● New data structures (e.g. Texture arrays)
      ● Better MSAA and anisotropic filtering support
    • Beyond Pipelining
      ● Multi-processor
        ● Solution to “memory” and “power” walls
        ● Pipelining: multiple stages happening at once
        ● Parallelism: many things happening in the same stage
      ● Limit of pipelining
        ● Small number of pipeline steps
        ● Some steps are much more compute intensive
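The two limits of pipelining named above can be made concrete with a small idealised model. It assumes an in-order pipeline with no stalls and enough buffering between stages, and the stage times are made-up numbers chosen only to show the effect of one expensive stage.

```python
def serial_time(stage_times, n_items):
    """Time to process n items one after another with no overlap."""
    return sum(stage_times) * n_items


def pipeline_time(stage_times, n_items):
    """Idealised pipeline: the first item takes the full latency,
    then one item completes every 'slowest stage' interval."""
    return sum(stage_times) + (n_items - 1) * max(stage_times)


balanced = [2, 2, 2]    # hypothetical stage costs
unbalanced = [1, 1, 6]  # one stage is much more compute intensive

balanced_speedup = serial_time(balanced, 1000) / pipeline_time(balanced, 1000)
unbalanced_speedup = serial_time(unbalanced, 1000) / pipeline_time(unbalanced, 1000)
```

In this model the speedup can never exceed the number of stages, and an unbalanced pipeline is held back by its slowest stage, which is why adding parallelism within a stage becomes attractive.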
    • Parallelism
      ● Parallelism examples:
        ● All components of float4 at the same time
        ● Multiple vertices at the same time
        ● Multiple triangles at the same time
    • SIMD
      ● e.g. GPU ALU
      ● Shared instruction store and control
      ● Compact and less expensive
      ● Efficient with no loops or branches
      ● Problem with unused processing cycles
        ● Unfilled quads are inefficient
        ● Solution: avoid small or skinny triangles (PS3)
      ● Not good for more complicated data structures or algorithms
    • SIMT
      ● Still SIMD. Shared code between threads.
      ● Process groups of primitives (e.g. 48 quads) in each thread
      ● Latency hiding:
        ● One thread stalls on texture fetch
        ● Other threads continue execution
        ● Especially important due to “memory wall”
    • SIMT
      ● When branching:
        ● Only evaluate one branch if all primitives take that branch
        ● Must evaluate both branches and mask the results if not all primitives take the same branch
      ● Reduces unused processor cycles
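The branching rule on this slide can be emulated in Python as follows. This is a conceptual sketch only: real hardware masks writes per lane rather than computing and selecting, and the `simt_branch` helper is a hypothetical name. Both branch functions are assumed to be side-effect free and safe to call on every input.

```python
def simt_branch(values, predicate, then_fn, else_fn):
    """Evaluate a branch over a group of lanes.
    Returns (results, passes), where passes counts how many branch bodies ran."""
    mask = [predicate(v) for v in values]
    if all(mask):
        # Uniform branch: every lane takes 'then', so 'else' is skipped.
        return [then_fn(v) for v in values], 1
    if not any(mask):
        # Uniform branch: every lane takes 'else', so 'then' is skipped.
        return [else_fn(v) for v in values], 1
    # Divergent branch: run both bodies for the whole group,
    # then use the mask to keep the right result per lane.
    then_results = [then_fn(v) for v in values]
    else_results = [else_fn(v) for v in values]
    return [t if m else e for m, t, e in zip(mask, then_results, else_results)], 2


uniform = simt_branch([1, 2, 3, 4], lambda v: v > 0, lambda v: v * 2, lambda v: -v)
divergent = simt_branch([-1, 2, -3, 4], lambda v: v > 0, lambda v: v * 2, lambda v: -v)
```

The divergent group pays for both branch bodies, while the uniform group pays for only one.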
    • MIMD
      ● e.g. Multi-core CPUs, Cell SPEs, Larrabee
      ● Different code stores and controls for different processors
      ● More complex hardware
      ● More expensive
      ● Synchronization issues
      ● Can handle more complex data structures and algorithms
    • The New Wave
      ● MIMD
        ● Cell SPEs
        ● Larrabee
    • Cell SPEs
      ● SPEs
        ● Local memory store
        ● Shared memory accessed via DMA
        ● Ring bus
    • PS3
      ● RSX
        ● Traditional GPU (z-buffer, ROP)
        ● SIMD data structures and processing (arrays)
      ● Offload GPU work to SPUs
        ● Micro triangle removal
        ● Skinning
        ● Post-FX
        ● Lighting
        ● Mostly rely on SIMD-friendly data structures
    • Larrabee
      ● Many general purpose CPU cores
      ● Coherent memory access from cores
      ● Very few fixed-function units (e.g. Texture)
      ● Most graphics pipeline components are programmable
        ● Depth buffer
        ● Blending
      ● Invites more complex data structures and algorithms
    • What does this mean?
    • Programming
      ● GPU programming may become more like SPU programming
        ● More MIMD
        ● More synchronization and data buffering issues
        ● More attention to latency hiding
    • Surfaces and Volumes
      ● Curved surfaces
      ● Displacement mapping
      ● Multi-resolution meshes
      ● Volumes
    • Lighting
      ● Non-uniform representations
        ● Irregular Shadow Mapping
        ● Deep Shadow Maps
    • Rasterization
      ● Object-parallel rasterization
        ● Ray-casting
          – Implicit surfaces (e.g. Metaballs, Level sets, CSG)
          – Direct volume rendering
        ● Order independent transparency
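As one possible illustration of ray-casting an implicit surface, the sketch below uses sphere tracing, a technique the slides do not name. It assumes the surface is given by a signed distance function and that the ray direction has unit length. A single sphere stands in for the surface here, since metaball fields are generally not true distance functions and would need a different stepping scheme.

```python
import math


def sphere_sdf(p, centre, radius):
    """Signed distance from point p to the surface of a sphere."""
    return math.dist(p, centre) - radius


def sphere_trace(origin, direction, sdf, max_steps=64, max_distance=100.0, epsilon=1e-4):
    """March along the ray, stepping by the distance to the nearest surface.
    Returns the hit distance t, or None if nothing is hit."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        distance = sdf(p)
        if distance < epsilon:
            return t  # close enough to the surface: report a hit
        t += distance  # safe step: no surface is nearer than 'distance'
        if t > max_distance:
            return None
    return None


def unit_sphere(p):
    return sphere_sdf(p, (0.0, 0.0, 0.0), 1.0)


hit = sphere_trace((0.0, 0.0, -5.0), (0.0, 0.0, 1.0), unit_sphere)
miss = sphere_trace((0.0, 0.0, -5.0), (0.0, 1.0, 0.0), unit_sphere)
```

Each pixel's ray is independent of the others, so this kind of loop maps naturally onto the per-pixel parallelism discussed earlier in the deck.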
    • Questions?