An Introduction to GPU: 3D Games to HPC
Krishnaraj Rao
Presented at Bangalore DV Club, 03/12/2010
  1. An Introduction to GPU: 3D Games to HPC. Krishnaraj Rao. Presented at Bangalore DV Club, 03/12/2010
  2. Agenda
       3D Graphics: The Big Picture; Quick Overview; Programming Model; Importance of 3D
       High Performance Parallel Computing: Why GPUs for HPPC?; Available APIs; GPU Computing architecture
       Q&A
  3. The Big Picture – Movies. Pipeline stages: Capture, Models Creation, Scene Creation, Rendering API, Post Processing.
  4. The Big Picture - Games. Pipeline stages: Capture, Models Creation, Scene Creation, Rendering API, Post Processing. Additional labels: GPUs, Drivers, HLSL, Cg.
  5. Models end up in World Space. World space includes everything! The position and orientation of every item are needed to accurately calculate transformations into screen space. Diagram labels: Light Source; X, Y, and Z axes; View Point or Camera; Screen; World Coordinate Space.
  6. View Transformation: the world ends up on the Screen. Diagram label: Screen Coordinate Space.
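The path from world space to screen space described on slides 5 and 6 can be sketched roughly as follows. This is a minimal illustration, not the presenter's code: the function names, the unrotated pinhole camera looking down the negative Z axis, the focal length, and the 640x480 screen are all assumptions made for the example.

```python
# Hypothetical sketch: all names and the simple pinhole-camera setup are
# illustrative assumptions, not taken from the slides.

def world_to_view(point, camera_pos):
    """Express a world-space point relative to a camera located at camera_pos.

    Assumes the camera is not rotated and looks down the negative Z axis.
    """
    return tuple(p - c for p, c in zip(point, camera_pos))


def view_to_screen(view_point, focal_length, width, height):
    """Perspective-project a view-space point and map it to pixel coordinates."""
    x, y, z = view_point
    if z >= 0:
        raise ValueError("point must be in front of the camera (negative view-space Z)")
    # Perspective divide: points farther away land closer to the screen centre.
    ndc_x = focal_length * x / -z
    ndc_y = focal_length * y / -z
    # Map the [-1, 1] range to pixels; screen Y grows downward.
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height
    return screen_x, screen_y


camera = (0.0, 0.0, 5.0)              # assumed camera position in world space
vertex = (1.0, 1.0, 0.0)              # one model vertex in world space
view = world_to_view(vertex, camera)  # vertex relative to the camera
pixel = view_to_screen(view, 1.0, 640, 480)
print(view, pixel)
```

A real pipeline would use 4x4 matrices and also handle camera rotation and clipping; the sketch keeps only the translate, divide, and viewport steps so the role of each coordinate space stays visible.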
  7. Simple Interactive 3D Graphics App. A simple example: static scene geometry, moving viewer.
       Repeat this loop:
         CPU takes user input from joystick or mouse
         CPU re-calculates viewer position, view direction, and light positions in 3-D world space
         GPU clears memory and draws the complete scene geometry with the new viewer and light positions
       Repeat forever
       Flow diagram: Read Joystick Position, Update Viewer Position and Light Direction, Draw all Scene Objects.
       Pipeline diagram labels: Vertex Engine, Setup Engine, Raster, Fragment, Raster Ops, Z Cull, Texture.
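The loop on slide 7 might look something like the sketch below. Everything here is a stand-in chosen for illustration: `update_viewer`, `draw_scene`, and `run` are hypothetical names, the joystick is replaced by a fixed list of samples, and the "GPU" step just records object positions relative to the viewer instead of rasterizing anything.

```python
# Hypothetical sketch of the read-input / update / draw loop.
# No real graphics API is used; the names are illustrative assumptions.

def update_viewer(position, joystick_input):
    """CPU step: move the viewer in the X/Z plane by the joystick deltas."""
    dx, dz = joystick_input
    x, y, z = position
    return (x + dx, y, z + dz)


def draw_scene(scene, viewer_position):
    """Stand-in for the GPU step: clear the frame, then 'draw' every object.

    Drawing is modelled as recording each object's position relative
    to the viewer.
    """
    frame = []  # starting from an empty list plays the role of clearing memory
    for name, obj_pos in scene:
        relative = tuple(o - v for o, v in zip(obj_pos, viewer_position))
        frame.append((name, relative))
    return frame


def run(scene, joystick_samples, start=(0.0, 0.0, 0.0)):
    """Run one loop iteration per joystick sample and return the final state."""
    viewer = start
    frames = []
    for sample in joystick_samples:  # a real application would loop forever
        viewer = update_viewer(viewer, sample)
        frames.append(draw_scene(scene, viewer))
    return viewer, frames


scene = [("cube", (0.0, 0.0, -10.0))]  # static scene geometry
viewer, frames = run(scene, [(1.0, 0.0), (0.0, -2.0)])
print(viewer, frames[-1])
```

The scene list never changes between iterations; only the viewer does, which matches the "static scene geometry, moving viewer" case the slide describes.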
  8. Adding Programmability to the Graphics Pipeline
       CPU side: 3D Application or Game, 3D API Commands, 3D API (OpenGL or Direct3D)
       CPU – GPU Boundary
       GPU side: GPU Command & Data Stream, GPU Front End, Vertex Index Stream, Primitive Assembly, Assembled Polygons, Lines, and Points, Rasterization & Interpolation, Pixel Location Stream, Raster Operations, Pixel Updates, Framebuffer
       Programmable Vertex Processor: Pre-transformed Vertices in, Transformed Vertices out
       Programmable Fragment Processor: Rasterized Pre-transformed Fragments in, Transformed Fragments out
  9. A History of Innovation
       1995: NV1, 1 Million Transistors
       1999: GeForce 256, 22 Million Transistors
       2002: GeForce4, 63 Million Transistors
       2003: GeForce FX, 130 Million Transistors
       2004: GeForce 6, 222 Million Transistors
       2005: GeForce 7, 302 Million Transistors
       2006-2007: GeForce 8, 754 Million Transistors
       2008: GeForce GTX 200, 1.4 Billion Transistors
       ... but what do all these extra transistors do?
  10. GPU continues to offload CPU work. Diagram: one row each for 1996, 2000, 2004, and 2008, showing the pipeline stages (Scene Mgmt, Physics and AI, Geom Gather, Geom Proc, Triangle Proc, Pixel Proc, Z / Blend) divided between CPU and GPU.
  11. Programming Model
       API: a set of functions, procedures, or classes that an OS, library, or service provides to support requests made by computer programs.
       DirectX: a collection of APIs for handling multimedia tasks, especially game programming and video, on Microsoft platforms.
       OpenGL (Open Graphics Library): a standard specification defining a cross-language, cross-platform API for writing applications that produce 2D and 3D computer graphics.
  12. Why is 3D Graphics important? More than just Fun and Games... Image captions: Tokyo, Japan; California Coastline.
  13. 3D Consumer Applications: Vista, Office, PDFs, Music, Photos, Maps
  14. GPUS IN HPC
  15. Evolution of Processors. Chart labels: Instruction Level Parallelism; Massive Data Parallelism; Data Fits in Cache; Huge Data Sets.
  16. GPU Processing Power. CPU, meet your new partner!
       CPU: Intel Core i7 965, 4 cores, 102 GFLOPS
       GPU: NVIDIA GTX 285, 240 cores, 1.04 TFLOPS
  17. Beyond Graphics
       With floating-point math and textures, graphics processors can be used for more than just graphics
       GPGPU = “General Purpose Computing on GPUs”
       Lots of ongoing research mapping algorithms and problems onto programmable GPUs: Solving Linear Equations; Black-Scholes Options Pricing; Rigid- and Soft-Body Dynamics
       Middleware layers being developed to accelerate “eye candy” game physics on GPUs (HavokFX)
  18. What is GPGPU?
       General Purpose computation using GPU in applications other than 3D graphics
         GPU accelerates critical path of application
       Data parallel algorithms leverage GPU attributes
         Large data arrays, streaming throughput
         Fine-grain SIMD parallelism
         Floating point (FP) computation
         Great for “embarrassingly parallel” algorithms
       Applications – see //GPGPU.org
         Game effects (FX) physics, image processing
         Physical modeling, computational engineering, matrix algebra, convolution, correlation, sorting
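To make the "fine-grain, data parallel" idea on slide 18 concrete, here is a small CPU-only emulation of launching one logical thread per array element. It is a sketch under stated assumptions: `saxpy_kernel` and `launch` are hypothetical names, SAXPY (a*x + y) was picked only as a familiar embarrassingly parallel example, and nothing here runs on a GPU.

```python
# Hypothetical CPU emulation of a data-parallel kernel launch.
# On a GPU the per-thread invocations would execute in parallel;
# here they run one after another to show the programming pattern only.

def saxpy_kernel(thread_id, a, x, y, out):
    """Work done by one logical thread: compute a single output element."""
    out[thread_id] = a * x[thread_id] + y[thread_id]


def launch(kernel, n_threads, *args):
    """Invoke the kernel once for every thread index in [0, n_threads)."""
    for thread_id in range(n_threads):
        kernel(thread_id, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, result)
print(result)
```

Because no thread reads another thread's output, the invocations could run in any order or all at once, which is the property that lets this kind of algorithm map well onto many GPU cores.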
  19. Why Computation on the GPU?
       A quiet buildup of potential
         Calculation Throughput and Memory Bandwidth: 10X
         Equivalent performance at a fraction of the power & cost
       GPU in every PC – pervasive presence and massive impact
       GPUs have always been parallel “multi-core”
         Natively designed to handle massive threading
         Every pixel is a thread
       Increased precision (fp32), programmability, flexibility
       GPUs are a mass-market parallel processor
         Economies of scale
       Peak floating point performance is much higher than comparable CPUs
         ATI X1900XT: $400 (video card), 250 GFLOPS (SP Float), 46 GB/s main memory bandwidth
         Intel Core 2 Duo E6600: $400 (processor only), 40 GFLOPS (SP Float), 8.5 GB/s main memory bandwidth
  20. Why Computation on the GPU?
       Supercomputing Performance
         Inherently Parallel Architecture
         1000+ cores, massively parallel processing
         250x the compute performance of a PC
       Personal
         “One Researcher, One Supercomputer”
         Supercomputer in a desktop system
         Plugs into standard power strip
       Accessible
         Program in C, C++, Fortran for Windows or Linux
         Available from OEMs and resellers worldwide and priced like a workstation
  21. Compute Applications
       Computational Fluid Dynamics
       Computer Aided Engineering
       Digital Content Creation
       Electronic Design Automation
       Finance
       Game Physics
       Graphics
       Imaging and Computer Vision
       Medical Imaging
       Numerics
       Bio-Informatics and Life Sciences
       Computational Chemistry
       Computational Electromagnetics & Electrodynamics
       Data Mining, Analytics & Databases
       MATLAB Acceleration
       Molecular Dynamics
       Weather, Atmospheric, Ocean Modeling, and Space Sciences
       Libraries
       Oil & Gas
       Programming Tools
       Ray Tracing
       Signal Processing
       Video & Audio
  22. Heterogeneous Computing: Multi-Core CPU, Parallel-Core GPU
  23. APIS FOR HETEROGENEOUS COMPUTING
  24. APIs for Heterogeneous Computing
       CUDA (Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. Programmers use C for CUDA (C with NVIDIA extensions), compiled through a PathScale Open64 C compiler, to code algorithms for execution on the GPU. Both low-level and high-level APIs are provided.
       OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors.
       Microsoft DirectCompute is an API that supports general-purpose computing on GPUs on Microsoft Windows Vista or Windows 7. DirectCompute is part of the Microsoft DirectX collection of APIs.
  25. OpenCL
  26. OpenCL: Platform Model & Program Structure
       One Host + one or more Compute Devices
       Each Compute Device is composed of one or more Compute Units
       Each Compute Unit is further divided into one or more Processing Elements
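The host / device / compute unit / processing element hierarchy on slide 26 can be modelled as nested data, as in the sketch below. This is not the OpenCL API: the class names are hypothetical, and the device shapes (30 compute units of 8 processing elements, and 4 compute units of 1) are illustrative assumptions chosen so the totals line up with the 240-core GPU and 4-core CPU figures quoted on slide 16.

```python
# Hypothetical model of the OpenCL platform hierarchy. It does not call
# OpenCL; class names and device shapes are assumptions for illustration.
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComputeUnit:
    processing_elements: int  # number of processing elements in this unit


@dataclass
class ComputeDevice:
    name: str
    compute_units: List[ComputeUnit] = field(default_factory=list)

    def total_processing_elements(self) -> int:
        """Sum the processing elements across all compute units."""
        return sum(cu.processing_elements for cu in self.compute_units)


@dataclass
class Host:
    devices: List[ComputeDevice] = field(default_factory=list)


# Assumed shapes: 30 x 8 = 240 and 4 x 1 = 4 processing elements.
gpu = ComputeDevice("example GPU", [ComputeUnit(8) for _ in range(30)])
cpu = ComputeDevice("example CPU", [ComputeUnit(1) for _ in range(4)])
host = Host([gpu, cpu])

for device in host.devices:
    print(device.name, len(device.compute_units), device.total_processing_elements())
```

In a real OpenCL program the host would query these counts from the runtime rather than hard-code them; the sketch only shows how the three levels nest.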
  27. CUDA Parallel Computing Architecture
       ISA and hardware compute engine
       Includes a C compiler plus support for OpenCL and DX11 Compute
       Architected to natively support all computational interfaces (standard languages and APIs)
  28. Option 1: OpenCL and C for CUDA
       C for CUDA: entry point for developers who prefer high-level C
       OpenCL: entry point for developers who want a low-level API
       Shared back-end compiler and optimization technology
       PTX
       GPU
  29. CUDA Success: Science & Computation. Not 2x or 3x, but speedups of 20x to 150x
       146X: Medical Imaging, U of Utah
       36X: Molecular Dynamics, U of Illinois, Urbana
       18X: Video Transcoding, Elemental Tech
       50X: Matlab Computing, AccelerEyes
       100X: Astrophysics, RIKEN
       149X: Financial simulation, Oxford
       47X: Linear Algebra, Universidad Jaime
       20X: 3D Ultrasound, Techniscan
       130X: Quantum Chemistry, U of Illinois, Urbana
       30X: Gene Sequencing, U of Maryland
  30. Tesla Personal Supercomputer. Chart labels: Performance and Accessibility axes; Tesla Personal Supercomputer; Supercomputing Cluster; Today’s Workstations; 250x Faster; 250x; 1x; 100x more affordable; 20x less power consumption; $100K - $1M; < $10 K.
  31. Solving the World’s Most Complex Challenges: Film, Science, Auto Design, Oil & Gas, Medicine, Broadcast, Space Exploration
  32. Grand Computing Challenges: Personalized Medicine; Mathematics for Scientific Discovery; Information Data Mining; Renewable Energy; Machines That Think; Natural Human Machine Interaction; Predict Environmental Changes; Economic Analysis
  33. Final Thoughts
       GPUs and heterogeneous parallel architectures will revolutionize computing
       Parallel computing is needed to solve some of the most interesting and important human challenges ahead
       Learning parallel programming is imperative for students in computing and sciences
  34. From Virtua Fighter to Tsubame
       1995 – NV1: 0.8M transistors, 50MHz, 1M Bytes, 0 GFLOPS
       2008 – GT200: 1,200M transistors, 1.3GHz, 4G Bytes, 1 TFLOPS
       Another 1000x in 15 years?
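The growth figures on slide 34 can be checked with a few lines of arithmetic. The sketch below takes the slide's numbers at face value; the variable names are illustrative, and the "yearly factor" is simply the constant annual growth rate that would compound to 1000x over 15 years, not a figure from the talk.

```python
# Arithmetic on the slide-34 figures. Input values come from the slide;
# variable names and the compound-growth framing are illustrative assumptions.

nv1_transistors = 800_000          # 0.8M transistors (1995, NV1)
gt200_transistors = 1_200_000_000  # 1,200M transistors (2008, GT200)
nv1_clock_mhz = 50
gt200_clock_mhz = 1300             # 1.3 GHz

transistor_ratio = gt200_transistors / nv1_transistors
clock_ratio = gt200_clock_mhz / nv1_clock_mhz

# Constant yearly growth factor g such that g ** 15 == 1000.
yearly_factor = 1000 ** (1 / 15)

print(f"transistors: {transistor_ratio:.0f}x, clock: {clock_ratio:.0f}x")
print(f"1000x in 15 years needs about {yearly_factor:.3f}x per year")
```

On these numbers the transistor count grew far more than the clock rate, and a further 1000x in 15 years would require sustained growth of a little under 60% per year.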
  35. BACKUP
  36. Graphics API History
  37. OpenGL
       1992: OpenGL 1.0
       1996: OpenGL 1.1 (Vertex Arrays, Improved Texturing)
       1998: OpenGL 1.2 (3D Textures, BGRA pixel format)
       1998: OpenGL 1.2.1 (Multi-Texture)
       2001: OpenGL 1.3 (Multi-sample AA, Cube/Compressed Textures)
       2002: OpenGL 1.4 (Depth/Shadow mapping, Auto mipmap generation)
       2003: OpenGL 1.5 (Vertex Attr from Vid Mem)
       2005: OpenGL 2.0 (GLSL, Vertex/Pixel Shaders, MRT, Non P-of-2 Tex)
       2006: OpenGL 2.1 (GLSL1.2, sRGB Textures)
       2008: OpenGL 3.0 (GLSL1.3, 32b FP Textures)
       2009: OpenGL 3.1 (March 2009, GLSL1.4, Perf, CopyBufferAPI)
       2009: OpenGL 3.2 (Aug 2009, GLSL1.5, Geom Shaders)
  38. OpenGL ES
       Designed for hand-held and embedded devices
         Goal is smaller footprint to support OpenGL
         PlayStation 3 and cell phone industry adopting ES
       OpenGL ES 1.1
         Strips out anything deemed extra in OpenGL
         Keeps conventional fixed-function vertex and fragment processing
       OpenGL ES 2.0
         Adds programmable vertex and fragment shaders
         Shaders specified in binary format
         Drops support for fixed-function vertex and fragment processing
  39. OpenGL ES – Cont
       OpenGL ES 1.0: Symbian OS, Android Platform
       OpenGL ES 1.0+: PlayStation 3
       OpenGL ES 1.1: iPhone SDK, BlackBerry (Some Models)
       OpenGL ES 2.0: iPhone 3GS, iPod touch
  40. DirectX
       GDI: legacy Windows graphics API ~1985
       DirectX 1.0 – 1995/6 (No 3D support, DirectDraw, DirectSound, DirectInput)
       DirectX 3.0 – 1996 (Rasterization-only 3D support, awkward prog. model, not successful)
       DirectX 5.0 – 1997 (Draw Primitives, DirectX vs OpenGL War)
       DirectX 6.0 – 1998 (Multitexture, OGL/Glide features, Texture Compression)
       DirectX 7.0 – 1999 (Geometry HW acceleration and Blending, Cube mapping)
       DirectX 8.0 – 2000/1 (Programmable VS/PS Shaders, XBOX)
       DirectX 9.0 – 2002-2003 (More programmability, Branching, FP pixel prog.)
       DirectX 9.0c – 2004 (ShaderModel 3.0)
       DirectX 10.0 – 2006 (SM4.0, WinVista, Geometry Shaders, Streaming Output)
       DirectX 10.1 – 2008 (SM4.1, Better Image Quality)
       DirectX 11.0 – 2009 (SM5.0, DirectCompute, Tessellation, WinVista SP2, Win7)