Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Sanjiv Satoor, Sr. Manager, NVIDIA talks about evolution of the modern graphics architectures with a focus on GPUs.

Published in: Technology, Art & Photos

  1. Evolution of Graphics Architectures with a focus on GPUs
     Sanjiv Satoor, Senior Manager, NVIDIA
  2. First Generation - Wireframe
     Vertex: transform, clip, and project
     Rasterization: lines only
     Pixel: no pixels! calligraphic display
     Dates: prior to 1987
  3. Storage Tube Terminals
     CRTs with analog charge “persistence”
     Accumulate a detailed static image by writing points or line segments
     Erase the stored image to start a new one
  4. Early Framebuffers
     By the mid-1970s one could afford framebuffers with a few bits per pixel at modest resolution
     “A Random Access Video Frame Buffer”, Kajiya, Sutherland, Cheadle, 1975
     Vector displays were still better for fine position detail
     Framebuffers were used to emulate storage tube vector terminals on a raster display
  5. Second Generation – Shaded Solids
     Vertex: lighting
     Rasterization: filled polygons
     Pixel: depth buffer, color blending
     Dates: 1987 - 1992
  6. Third Generation – Texture Mapping
     Vertex: more, faster
     Rasterization: more, faster
     Pixel: texture filtering, antialiasing
     Dates: 1992 - 2001
  7. IRIS 3000 Graphics Cards
     Geometry Engines & Rasterizer
     4 bit / pixel Framebuffer (2 instances)
  8. 1990s
     Desktop 3D workstations under $5000
     Single-board, multi-chip graphics subsystems
     Rise of 3D on the PC
     40 company free-for-all until intense competition knocked out all but a few players
     Many were “decelerators”, and easy to beat
     Single-chip GPUs
     Interesting hardware experimentation
     PCs would take over the workstation business
     Interesting consoles: 3DO, Nintendo, Sega, Sony
  9. Moving toward programmability
     Timeline axis: 1998, 1999, 2000, 2001, 2002, 2003, 2004
     DirectX 5: Riva 128
     DirectX 6: Multitexturing, Riva TNT
     DirectX 7: T&L, TextureStageState, GeForce 256
     DirectX 8: SM 1.x, GeForce 3
     Cg
     DirectX 9: SM 2.0, GeForceFX
     DirectX 9.0c: SM 3.0, GeForce 6
     Games shown: Quake 3, Giants, Halo, Far Cry, UE3, Half-Life
     All images © their respective owners
  10. Evolution of GPUs
      RIVA 128: 3M transistors
      GeForce 256: 23M transistors
      GeForce 3: 60M transistors
      GeForce FX: 250M transistors
      GeForce 8800: 681M transistors
      “Kepler”: 7B transistors
      Timeline axis: 1995, 2000, 2001, 2003, 2006, 2012
      Eras: Fixed function, Programmable shaders, CUDA
  11. No Lighting, Per-Vertex Lighting, Per-Pixel Lighting
      Copyright © NVIDIA Corporation 2006
      Unreal © Epic
  12. Lush, Rich Worlds
      Stunning Graphics Realism
      Core of the Definitive Gaming Platform
      Incredible Physics Effects
      Hellgate: London © 2005-2006 Flagship Studios, Inc. Licensed by NAMCO BANDAI Games America, Inc.
      Crysis © 2006 Crytek / Electronic Arts
      Full Spectrum Warrior: Ten Hammers © 2006 Pandemic Studios, LLC. All rights reserved. © 2006 THQ Inc. All rights reserved.
  13. Traditional Fixed-Function Graphics Pipeline
      Processor per function
      Vertex processing: T&L evolved to vertex shading
      Triangle setup: triangle, point, line setup
      Pixel processing: flat shading, texturing, eventually pixel shading
      Raster operations: blending, Z-buffering, antialiasing
      Memory interface: wider and faster over the years
  14. Migration of functionality to GPU hardware
  15. GeForce3/DX8 Pixel Shading Pipeline
  16. Programmable Shaders: GeForceFX (2002)
      Vertex and fragment operations specified in small (macro) assembly language
      User-specified mapping of input data to operations
      Limited ability to use intermediate computed values to index input data (textures and vertex uniforms)
      Diagram labels: Input 0, Input 1, Input 2; OP; Temp 0, Temp 1, Temp 2
      Example program:
        ADDR R0.xyz, eyePosition.xyzx, -f[TEX0].xyzx;
        DP3R R0.w, R0.xyzx, R0.xyzx;
        RSQR R0.w, R0.w;
        MULR R0.xyz, R0.w, R0.xyzx;
        ADDR R1.xyz, lightPosition.xyzx, -f[TEX0].xyzx;
        DP3R R0.w, R1.xyzx, R1.xyzx;
        RSQR R0.w, R0.w;
        MADR R0.xyz, R0.w, R1.xyzx, R0.xyzx;
        MULR R1.xyz, R0.w, R1.xyzx;
        DP3R R0.w, R1.xyzx, f[TEX1].xyzx;
        MAXR R0.w, R0.w, {0}.x;
  17. GeForce 6 Architecture
  18. Unified Hardware Shader Design
  19. GeForce 8 Architecture
      Build the architecture around the processor
      Diagram labels: Host; Input Assembler; Setup / Rstr / ZCull; Vtx Thread Issue; Geom Thread Issue; Pixel Thread Issue; Thread Processor; SP, SP, L1, TF (×8); L2, FB (×6)
  20. Why are so many parallel operations needed?
      Millions of triangles, millions of pixels
      Diagram labels: Input triangle, Tessellate, Projection, Rasterize, Shade, Transform vertices, Image plane, Camera
  21. GPU = More computational horsepower and bandwidth per watt
      Few complex processors: optimized for single-threaded performance
      Many simple processors with minimal overhead: slow single-threaded performance but massive overall throughput
  22. GPU Architecture
      Efficiency
      Programmability
      Performance
  23. GPU Architecture: Two Main Components
      Streaming Multiprocessors (SMs)
        Perform the actual computations
        Each SM has its own: control units, registers, execution pipelines, caches
      Global memory
        Analogous to RAM in a CPU server
        Accessible by both GPU and CPU
        Currently up to 6 GB per GPU
        Bandwidth currently up to 250 GB/s
      Diagram labels: HOST I/F; GigaThread; L2; DRAM I/F (×6)
  24. KEPLER
      The Fastest, Most Efficient GPU Ever Built
  25. Kepler GK110 Architecture
      7.1B Transistors
      14 SMX units
      3.95 TFLOP FP32
      1.31 TFLOP FP64
      250 GB/sec
      2688 cores
      PCI Express Gen3
  26. WORLD’S #1 SUPERCOMPUTER
      With a peak performance of 27 petaflops, the Titan supercomputer at Oak Ridge National Labs is the world’s fastest. 18,688 GPUs provide 90% of the machine’s computing power.
  27. The Graphics Pipeline
      Vertex and fragment processing are programmable
      The programmer can write programs that are executed for every vertex as well as for every fragment
      This allows fully customizable geometry and shading effects that go well beyond the generic look and feel of older 3D applications
      Diagram labels: host interface, vertex processing, triangle setup, pixel processing, memory interface
  28. Thank you
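
The GeForceFX fragment program on slide 16 can be hard to read in assembly form. Below is a rough Python transliteration, one statement per instruction, as a reading aid only. It assumes that `f[TEX0]` carries the surface position and `f[TEX1]` the surface normal; the slide does not say what those interpolants hold, so the parameter names `position` and `normal` are hypothetical. The `.xyzx` swizzles are ignored because only the first three components feed the three-component operations.

```python
import math

# Small 3-vector helpers standing in for the GPU's vector instructions.
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(s, v):
    return tuple(s * x for x in v)

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def fragment_program(eye_position, light_position, position, normal):
    """Transliteration of the slide-16 program.

    Assumption: `position` plays the role of f[TEX0] and `normal` the role
    of f[TEX1]. Returns (R0.xyz, R0.w) as they stand after the last
    instruction. Divides by zero if the eye or light sits exactly at
    `position`, just as RSQ of zero would misbehave on hardware.
    """
    r0 = sub(eye_position, position)        # ADDR R0.xyz, eyePosition, -f[TEX0]
    r0w = dot3(r0, r0)                      # DP3R R0.w, R0, R0
    r0w = 1.0 / math.sqrt(r0w)              # RSQR R0.w, R0.w
    r0 = scale(r0w, r0)                     # MULR R0.xyz, R0.w, R0  (unit view vector)
    r1 = sub(light_position, position)      # ADDR R1.xyz, lightPosition, -f[TEX0]
    r0w = dot3(r1, r1)                      # DP3R R0.w, R1, R1
    r0w = 1.0 / math.sqrt(r0w)              # RSQR R0.w, R0.w
    r0 = add(scale(r0w, r1), r0)            # MADR R0.xyz, R0.w, R1, R0  (unit light + unit view)
    r1 = scale(r0w, r1)                     # MULR R1.xyz, R0.w, R1  (unit light vector)
    r0w = dot3(r1, normal)                  # DP3R R0.w, R1, f[TEX1]
    r0w = max(r0w, 0.0)                     # MAXR R0.w, R0.w, {0}.x  (clamp to zero)
    return r0, r0w
```

Read this way, the listing normalizes the view and light vectors, sums them into an unnormalized half-angle vector in `R0.xyz`, and leaves a clamped light-normal dot product in `R0.w`. That is consistent with the opening of a per-pixel lighting calculation, though the slide shows only this fragment of the program.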

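
The figures quoted on slides 25 and 26 can be cross-checked with a little arithmetic. The sketch below uses only numbers from those slides plus two labeled assumptions: that peak FP32 throughput counts a fused multiply-add as two floating-point operations per core per cycle, and that each of Titan's GPUs delivers the FP64 peak listed on slide 25. All function and constant names are illustrative.

```python
# Figures taken from slide 25 (Kepler GK110) and slide 26 (Titan).
CORES = 2688
SMX_UNITS = 14
FP32_TFLOPS = 3.95
FP64_TFLOPS = 1.31
TITAN_GPUS = 18_688
TITAN_PEAK_PFLOPS = 27.0

# Assumption (not stated on the slides): one fused multiply-add per core
# per cycle, counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CYCLE = 2

def cores_per_smx():
    """Cores in each SMX unit, if cores are spread evenly."""
    return CORES // SMX_UNITS

def implied_clock_mhz():
    """Core clock implied by the FP32 peak under the FMA assumption."""
    flops = FP32_TFLOPS * 1e12
    return flops / (CORES * FLOPS_PER_CORE_PER_CYCLE) / 1e6

def titan_gpu_share():
    """Fraction of Titan's peak supplied by its GPUs.

    Assumption: every GPU delivers the slide-25 FP64 peak.
    """
    gpu_pflops = TITAN_GPUS * FP64_TFLOPS / 1000.0
    return gpu_pflops / TITAN_PEAK_PFLOPS

if __name__ == "__main__":
    print(f"cores per SMX:        {cores_per_smx()}")
    print(f"implied clock (MHz):  {implied_clock_mhz():.0f}")
    print(f"GPU share of Titan:   {titan_gpu_share():.1%}")
```

Under these assumptions the slides are internally consistent: the core count divides evenly across the 14 SMX units, the implied clock comes out at roughly 735 MHz, and the GPUs account for about 90% of the 27 petaflops quoted for Titan.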