An Introduction to GPU
          3D Games to HPC
           Krishnaraj Rao
Presented at Bangalore DV Club, 03/12/2010
Transcript of "3 d to_hpc"

  1. 1. An Introduction to GPU: 3D Games to HPC. Krishnaraj Rao. Presented at Bangalore DV Club, 03/12/2010
  2. 2. Agenda. 3D Graphics: The Big Picture; Quick Overview; Programming Model; Importance of 3D. High Performance Parallel Computing: Why GPUs for HPPC?; Available APIs; GPU Computing Architecture. Q&A
  3. 3. The Big Picture - Movies. Pipeline stages: Capture; Models Creation; Scene Creation; Rendering API; Post Processing
  4. 4. The Big Picture - Games. Pipeline stages: Capture; Models Creation; Scene Creation; Rendering API; Post Processing. [Additional diagram labels: GPUs, Drivers; HLSL, Cg]
  5. 5. Models End Up in World Space. World space includes everything! Position and orientation for all items are needed to accurately calculate transformations into screen space. [Figure labels: Light Source; View Point or Camera; Screen; X, Y, Z axes; World Coordinate Space]
  6. 6. View Transformation: World Ends Up on Screen. [Figure label: Screen Coordinate Space]
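
The path from world space to screen space on slides 5 and 6 can be sketched in a few lines. This is a minimal illustration, not the method used by any particular API: it assumes a camera that looks down the negative Z axis with no rotation, and the function name `world_to_screen` and its default parameters are hypothetical.

```python
import math

def world_to_screen(point, camera, width=640, height=480, fov_deg=90.0):
    """Project a world-space point (x, y, z) to pixel coordinates.

    Assumption: the camera sits at `camera` and looks down -Z, unrotated.
    Returns None for points at or behind the camera plane.
    """
    # View transform: express the point relative to the camera.
    x = point[0] - camera[0]
    y = point[1] - camera[1]
    z = point[2] - camera[2]
    depth = -z
    if depth <= 0:
        return None

    # Perspective projection into normalized device coordinates in [-1, 1].
    f = 1.0 / math.tan(math.radians(fov_deg) / 2.0)
    aspect = width / height
    ndc_x = (f / aspect) * x / depth
    ndc_y = f * y / depth

    # Viewport transform: map to pixels, with Y growing downward on screen.
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height
    return (screen_x, screen_y)
```

A point straight ahead of the camera lands at the centre of the screen. Points to the right of or above the camera axis land to the right of or above the centre.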
  7. 7. Simple Interactive 3D Graphics App. A simple example: static scene geometry, moving viewer. Repeat this loop: (1) CPU takes user input from joystick or mouse; (2) CPU re-calculates viewer position, view direction, and light positions in 3-D world space; (3) GPU clears memory and draws the complete scene geometry with the new viewer and light positions. Repeat forever. [Diagram labels: Read Joystick Position; Update Viewer Position and Light Direction; Draw All Scene Objects; GPU blocks: Vertex Engine, Setup Raster, Z Cull, Fragment Engine, Texture, Raster Ops]
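
The read, update, draw loop on slide 7 can be sketched as below. Every name here (`read_input`, `update_viewer`, `draw_scene`, `run`) is a hypothetical stand-in: the input is faked, and "drawing" only computes viewer-relative positions, so this shows the division of work rather than a real renderer.

```python
def read_input(frame):
    """Stand-in for joystick or mouse polling: returns a (dx, dz) move."""
    return (1.0, 0.0) if frame % 2 == 0 else (0.0, -1.0)

def update_viewer(viewer, move):
    """CPU step: re-calculate the viewer position in world space."""
    return (viewer[0] + move[0], viewer[1], viewer[2] + move[1])

def draw_scene(scene, viewer):
    """GPU step (simulated): clear the frame, then draw every object."""
    frame_buffer = []  # "clear memory"
    for obj in scene:
        # Static geometry, moving viewer: only the relative position changes.
        frame_buffer.append(tuple(o - v for o, v in zip(obj, viewer)))
    return frame_buffer

def run(scene, frames):
    """Run the loop for a fixed number of frames instead of forever."""
    viewer = (0.0, 0.0, 0.0)
    last_frame = None
    for frame in range(frames):
        move = read_input(frame)
        viewer = update_viewer(viewer, move)
        last_frame = draw_scene(scene, viewer)
    return viewer, last_frame
```

The scene list is never modified. Only the viewer moves, and the whole scene is redrawn each frame.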
  8. 8. Adding Programmability to the Graphics Pipeline. Flow: 3D Application or Game -> (3D API Commands) -> 3D API: OpenGL or Direct3D -> CPU-GPU Boundary -> (GPU Command & Data Stream) -> GPU Front End -> (Vertex Index Stream) -> Primitive Assembly -> (Assembled Polygons, Lines, and Points) -> Rasterization & Interpolation -> (Pixel Location Stream) -> Raster Operations -> (Pixel Updates) -> Framebuffer. Programmable Vertex Processor: Pre-transformed Vertices in, Transformed Vertices out. Programmable Fragment Processor: Rasterized Pre-transformed Fragments in, Transformed Fragments out.
  9. 9. A History of Innovation. 1995: NV1, 1 million transistors; 1999: GeForce 256, 22 million transistors; 2002: GeForce4, 63 million transistors; 2003: GeForce FX, 130 million transistors; 2004: GeForce 6, 222 million transistors; 2005: GeForce 7, 302 million transistors; 2006-2007: GeForce 8, 754 million transistors; 2008: GeForce GTX 200, 1.4 billion transistors. ... but what do all these extra transistors do?
  10. 10. GPU Continues to Offload CPU Work. [Diagram: one pipeline row per year (1996, 2000, 2004, 2008), each marked with a CPU/GPU split. Stages in the 1996 and 2000 rows: Geom Gather, Geom Proc, Triangle Proc, Pixel Proc, Z / Blend. The 2004 and 2008 rows add Scene Mgmt and Physics and AI in front of these stages.]
  11. 11. Programming Model. API: a set of functions, procedures or classes that an OS, library or service provides to support requests made by computer programs. DirectX: a collection of APIs for handling multimedia tasks, especially game programming and video, on Microsoft platforms. OpenGL (Open Graphics Library): a standard specification defining a cross-language, cross-platform API for writing applications that produce 2D and 3D computer graphics.
  12. 12. Why is 3D Graphics Important? More than just fun and games... [Image captions: Tokyo, Japan; California Coastline]
  13. 13. 3D Consumer Applications: Vista, Office, PDFs, Music, Photos, Maps
  14. 14. GPUS IN HPC
  15. 15. Evolution of Processors. [Chart: one axis runs from Instruction Level Parallelism to Massive Data Parallelism; the other from Data Fits in Cache to Huge Data Sets]
  16. 16. GPU Processing Power. CPU, meet your new partner! CPU: Intel Core i7 965, 4 cores, 102 GFLOPS. GPU: NVIDIA GTX 285, 240 cores, 1.04 TFLOPS.
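
The figures on slide 16 invite a quick ratio check. The sketch below uses only the numbers quoted on the slide; the dictionary layout and the helper names `peak_ratio` and `gflops_per_core` are illustrative assumptions.

```python
# Peak figures as quoted on the slide (1.04 TFLOPS = 1040 GFLOPS).
cpu = {"name": "Intel Core i7 965", "cores": 4, "gflops": 102.0}
gpu = {"name": "NVIDIA GTX 285", "cores": 240, "gflops": 1040.0}

def peak_ratio(a, b):
    """How many times higher the peak throughput of `a` is than `b`."""
    return a["gflops"] / b["gflops"]

def gflops_per_core(chip):
    """Peak throughput divided evenly across the quoted core count."""
    return chip["gflops"] / chip["cores"]
```

On these numbers the GPU has roughly ten times the peak throughput, while each individual GPU core is several times slower than a CPU core. The advantage comes from the number of cores, not from their individual speed.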
  17. 17. Beyond Graphics. With floating-point math and textures, graphics processors can be used for more than just graphics. GPGPU = "General Purpose Computing on GPUs". Lots of ongoing research mapping algorithms and problems onto programmable GPUs: solving linear equations; Black-Scholes options pricing; rigid- and soft-body dynamics. Middleware layers are being developed to accelerate "eye candy" game physics on GPUs (HavokFX).
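
Black-Scholes options pricing, one of the workloads named on slide 17, maps well to GPUs because every option is priced independently. The sketch below is plain CPU Python using the standard closed-form price of a European call; the function names are illustrative, and the list comprehension in `price_batch` stands in for what a GPU would run as one thread per option.

```python
import math

def norm_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(spot, strike, rate, vol, years):
    """Closed-form Black-Scholes price of a European call option."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)

def price_batch(options):
    """Price many options; no element depends on any other element."""
    return [black_scholes_call(*opt) for opt in options]
```

Because `price_batch` has no dependency between elements, the work can be split across as many processing elements as are available.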
  18. 18. What is GPGPU? General-purpose computation using the GPU in applications other than 3D graphics. The GPU accelerates the critical path of the application. Data-parallel algorithms leverage GPU attributes: large data arrays, streaming throughput; fine-grain SIMD parallelism; floating point (FP) computation. Great for "embarrassingly parallel" algorithms. Applications (see //GPGPU.org): game effects (FX) physics, image processing; physical modeling, computational engineering, matrix algebra, convolution, correlation, sorting.
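
The fine-grain data parallelism described on slide 18 can be illustrated with a per-element "kernel". This is a sequential Python simulation under stated assumptions: `launch` is a hypothetical helper that calls the kernel once per index, where a GPU would run those calls concurrently as threads, and SAXPY (a*x + y) is chosen here only as a familiar example.

```python
def saxpy_kernel(i, a, x, y, out):
    """Work done by one 'thread': a single independent element."""
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args, order=None):
    """Simulated launch: call the kernel for each index in any order."""
    indices = range(n) if order is None else order
    for i in indices:
        kernel(i, *args)

def saxpy(a, x, y, order=None):
    """Compute a*x + y element by element via the simulated launch."""
    out = [0.0] * len(x)
    launch(saxpy_kernel, len(x), a, x, y, out, order=order)
    return out
```

The result does not depend on the order in which indices are processed. That independence is what makes an algorithm "embarrassingly parallel".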
  19. 19. Why Computation on the GPU? A quiet buildup of potential: calculation throughput and memory bandwidth 10X; equivalent performance at a fraction of the power and cost. GPU in every PC: pervasive presence and massive impact. GPUs have always been parallel "multi-core": natively designed to handle massive threading; every pixel is a thread. Increased precision (fp32), programmability, flexibility. GPUs are a mass-market parallel processor: economies of scale. Peak floating point performance is much higher than comparable CPUs. ATI X1900XT: $400 (video card), 250 GFLOPS (SP float), 46 GB/s main memory bandwidth. Intel Core 2 Duo E6600: $400 (processor only), 40 GFLOPS (SP float), 8.5 GB/s main memory bandwidth.
  20. 20. Why Computation on the GPU? Supercomputing performance: inherently parallel architecture; 1000+ cores, massively parallel processing; 250x the compute performance of a PC. Personal: "One Researcher, One Supercomputer"; supercomputer in a desktop system; plugs into a standard power strip. Accessible: program in C, C++, or Fortran for Windows or Linux; available from OEMs and resellers worldwide and priced like a workstation.
  21. 21. Compute Applications: Computational Fluid Dynamics; Computer Aided Engineering; Digital Content Creation; Electronic Design Automation; Finance; Game Physics; Graphics Libraries; Imaging and Computer Vision; Medical Imaging; Numerics; Bio-Informatics and Life Sciences; Computational Chemistry; Computational Electromagnetics & Electrodynamics; Data Mining, Analytics & Databases; MATLAB Acceleration; Molecular Dynamics; Weather, Atmospheric, Ocean Modeling, and Space Sciences; Oil & Gas; Programming Tools; Ray Tracing; Signal Processing; Video & Audio
  22. 22. Heterogeneous Computing: Multi-Core CPU plus Parallel-Core GPU
  23. 23. APIS FOR HETEROGENEOUS COMPUTING
  24. 24. APIs for Heterogeneous Computing. CUDA (Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. Programmers use "C for CUDA" (C with NVIDIA extensions), compiled through a PathScale Open64 C compiler, to code algorithms for execution on the GPU. Both low-level and high-level APIs are provided. OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. Microsoft DirectCompute is an API that supports general-purpose computing on GPUs on Microsoft Windows Vista or Windows 7. DirectCompute is part of the Microsoft DirectX collection of APIs.
  25. 25. OpenCL
  26. 26. OpenCL: Platform Model & Program Structure. One Host plus one or more Compute Devices. Each Compute Device is composed of one or more Compute Units. Each Compute Unit is further divided into one or more Processing Elements.
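
The hierarchy on slide 26 (Host, Compute Devices, Compute Units, Processing Elements) can be modelled as nested records. This is a conceptual sketch only: the class names mirror the slide's terms but are not the OpenCL API, `make_device` is a hypothetical helper, and the unit counts in the usage note are illustrative.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ComputeUnit:
    processing_elements: int  # one or more per compute unit

@dataclass
class ComputeDevice:
    name: str
    compute_units: List[ComputeUnit] = field(default_factory=list)

    def total_processing_elements(self) -> int:
        return sum(cu.processing_elements for cu in self.compute_units)

@dataclass
class Host:
    devices: List[ComputeDevice] = field(default_factory=list)

    def total_processing_elements(self) -> int:
        return sum(d.total_processing_elements() for d in self.devices)

def make_device(name: str, units: int, elements_per_unit: int) -> ComputeDevice:
    """Build a device whose compute units all have the same width."""
    return ComputeDevice(name, [ComputeUnit(elements_per_unit) for _ in range(units)])
```

For example, a device modelled as 30 compute units of 8 processing elements each has 240 processing elements in total. That matches the 240 cores quoted earlier for the GTX 285, though the 30 x 8 split is an assumption made here for illustration.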
  27. 27. CUDA Parallel Computing Architecture. ISA and hardware compute engine. Includes a C compiler plus support for OpenCL and DX11 Compute. Architected to natively support all computational interfaces (standard languages and APIs).
  28. 28. Option 1. OpenCL and C for CUDA. C for CUDA: entry point for developers who prefer high-level C. OpenCL: entry point for developers who want a low-level API. Both share back-end compiler and optimization technology, which targets PTX and then the GPU.
  29. 29. CUDA Success: Science & Computation. Not 2x or 3x, but speedups of 20x to 150x. 146X: Medical Imaging (U of Utah); 36X: Molecular Dynamics (U of Illinois, Urbana); 18X: Video Transcoding (Elemental Tech); 50X: Matlab Computing (AccelerEyes); 100X: Astrophysics (RIKEN); 149X: Financial Simulation (Oxford); 47X: Linear Algebra (Universidad Jaime); 20X: 3D Ultrasound (Techniscan); 130X: Quantum Chemistry (U of Illinois, Urbana); 30X: Gene Sequencing (U of Maryland)
  30. 30. Tesla Personal Supercomputer: 100x more affordable, 20x less power consumption. [Chart: Performance axis from 1x to 250x faster; Accessibility axis from $100K - $1M to < $10K; entries: Supercomputing Cluster, Today's Workstations, Tesla Personal Supercomputer]
  31. 31. Solving the World's Most Complex Challenges: Film, Science, Auto Design, Oil & Gas, Medicine, Broadcast, Space Exploration
  32. 32. Grand Computing Challenges: Renewable Energy; Personalized Medicine; Mathematics for Scientific Discovery; Information Data Mining; Machines That Think; Natural Human Machine Interaction; Predict Environmental Changes; Economic Analysis
  33. 33. Final Thoughts. GPUs and heterogeneous parallel architecture will revolutionize computing. Parallel computing is needed to solve some of the most interesting and important human challenges ahead. Learning parallel programming is imperative for students in computing and sciences.
  34. 34. From Virtua Fighter to Tsubame. 1995, NV1: 0.8M transistors, 50 MHz, 1 MB, 0 GFLOPS. 2008, GT200: 1,200M transistors, 1.3 GHz, 4 GB, 1 TFLOPS. Another 1000x in 15 years?
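
The closing question on slide 34 can be put in numbers. This sketch uses only the figures on the slide (0.8M transistors in 1995, 1,200M in 2008); the helper names are illustrative, and steady compound growth is an assumption made for the arithmetic.

```python
import math

def annual_growth_factor(start, end, years):
    """Constant yearly multiplier that turns `start` into `end`."""
    return (end / start) ** (1.0 / years)

def doubling_time_years(start, end, years):
    """Years needed to double at that constant growth rate."""
    return years * math.log(2.0) / math.log(end / start)

# Transistor counts from the slide, in millions, over 2008 - 1995 = 13 years.
past_factor = annual_growth_factor(0.8, 1200.0, 13)
past_doubling = doubling_time_years(0.8, 1200.0, 13)

# Growth rate that "another 1000x in 15 years" would require.
needed_factor = annual_growth_factor(1.0, 1000.0, 15)
```

On these figures, transistor count grew about 1500x in 13 years, which is roughly 1.76x per year or a doubling about every 1.2 years. Another 1000x in 15 years would need about 1.58x per year, a slower pace than the one the slide reports for 1995 to 2008.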
  35. 35. BACKUP
  36. 36. Graphics API History
  37. 37. OpenGL. 1992: OpenGL 1.0; 1996: OpenGL 1.1 (Vertex Arrays, Improved Texturing); 1998: OpenGL 1.2 (3D Textures, BGRA pixel format); 1998: OpenGL 1.2.1 (Multi-Texture); 2001: OpenGL 1.3 (Multi-sample AA, Cube/Compressed Textures); 2002: OpenGL 1.4 (Depth/Shadow mapping, Auto mipmap generation); 2003: OpenGL 1.5 (Vertex Attr from Vid Mem); 2004: OpenGL 2.0 (GLSL, Vertex/Pixel Shaders, MRT, Non P-of-2 Tex); 2006: OpenGL 2.1 (GLSL 1.2, sRGB Textures); 2008: OpenGL 3.0 (GLSL 1.3, 32b FP Textures); 2009: OpenGL 3.1 (March 2009, GLSL 1.4, Perf, CopyBuffer API); 2009: OpenGL 3.2 (Aug 2009, GLSL 1.5, Geom Shaders)
  38. 38. OpenGL ES. Designed for hand-held and embedded devices. Goal is a smaller footprint to support OpenGL. PlayStation 3 and the cell phone industry are adopting ES. OpenGL ES 1.1: strips out anything deemed extra in OpenGL; keeps conventional fixed-function vertex and fragment processing. OpenGL ES 2.0: adds programmable vertex and fragment shaders; shaders specified in binary format; drops support for fixed-function vertex and fragment processing.
  39. 39. OpenGL ES (cont.). OpenGL ES 1.0: Symbian OS, Android platform. OpenGL ES 1.0+: PlayStation 3. OpenGL ES 1.1: iPhone SDK, BlackBerry (some models). OpenGL ES 2.0: iPhone 3GS, iPod touch.
  40. 40. DirectX. GDI: legacy Windows graphics API, ~1985. DirectX 1.0: 1995/6 (no 3D support; DirectDraw, DirectSound, DirectInput). DirectX 3.0: 1996 (rasterization-only 3D support, awkward programming model, not successful). DirectX 5.0: 1997 (Draw Primitives, DirectX vs OpenGL war). DirectX 6.0: 1998 (multitexture, OGL/Glide features, texture compression). DirectX 7.0: 1999 (geometry HW acceleration and blending, cube mapping). DirectX 8.0: 2000/1 (programmable VS/PS shaders, Xbox). DirectX 9.0: 2002-2003 (more programmability, branching, FP pixel programs). DirectX 9.0c: 2004 (Shader Model 3.0). DirectX 10.0: 2006 (SM4.0, Windows Vista, geometry shaders, streaming output). DirectX 10.1: 2008 (SM4.1, better image quality). DirectX 11.0: 2009 (SM5.0, DirectCompute, tessellation, Windows Vista SP2, Windows 7).