Advanced Graphics Workshop - GFX2011



This was presented at GFX2011 - conducted by IEEE Consumer Electronics society, Bangalore chapter. This is the public version of the presentation.



1. Advanced Graphics Workshop – Prabindh Sundareson, Texas Instruments. GFX2011, Dec 3rd 2011, Bangalore. Note: This slide set is a public version of the actual slides presented at the workshop.
2. GFX2011 Schedule
  - 8.30 AM: Registration and introduction, equipment setup
  - 9.00 AM: Why Graphics? Present and Future – Prof. Vijay Natarajan, Assistant Professor, Department of Computer Science and Automation, IISc, Bangalore
  - 9.45 AM: Introduction to the OpenGL/ES rendering pipeline and algorithms; detailed walkthrough of the OpenGL ES 2.0 spec and APIs – Part 1
  - 1.00 PM: Lunch
  - Detailed walkthrough of the OpenGL ES 2.0 spec and APIs – Part 2
  - Break
  - Framework and platform integration – EGL, Android (SurfaceFlinger)
  - Tools for performance benchmarking and graphics development
  - Q&A, certificate presentation to participants, networking
3. Detailed Agenda
  - Inaugural talk – Dr. Vijay Natarajan – "Graphics – Present and Future"
  - Break
  - GPU HW architectures, and the GL API
  - The CFF APIs
    - Lab 1
  - Texturing objects
    - Lab 94 – Rectangle wall
  - Vertices and transformations
    - Lab 913 – Eyeing the eyes
  - Lunch break
  - Real-life modelling and loading 3D models
    - Lab 9 – Review .obj file loading method
  - Shaders – vertex and fragment
    - Lab 96 – Squishing the slices
  - Break
  - Rendering targets
    - Lab 910 – 3D in a 2D world
  - Creating special effects
  - EGL and platform integration
    - Overview of GLES 2.0 usage in Android/iOS
  - This PPT is to be used in conjunction with the labs at
4. GPU HW Architectures
  - CPUs are programmed with sequential code
    - Typical C program – linear code
    - Well-defined pre-fetch architectures, cache mechanisms
    - Problem? Limited by how fast "a" (single) processor can execute, read, write
  - GPUs are parallel
    - Small, identical code over multiple data
    - Control dependencies are discouraged – if used, they drastically reduce throughput
    - "Output" is the result of a matrix operation (n x n)
      - Graphics output – colour pixels
      - Computational output – matrix values
5. GPU-integrated SoCs
  - The A5 chipset: CPU size ~= GPU size
6. Integrated GPU architectures (Samsung SRP) – from a Khronos presentation
7. Vivante GPU Architecture
  - GPUs vary in:
    - Unified vs separate shader HW architecture
    - Internal cache size
    - Bus size
    - Rendering blocks
    - Separated 2D and 3D blocks
8. OpenGL specification evolution
9. A note on reading the GLES specifications
  - The GLES specification is a derivative of the GL specification
    - It is recommended to read the OpenGL 2.0 specification first, then the OpenGL ES 2.0 specification
  - Similarly, for the shading language
    - It is recommended to read the OpenGL SL specification first, then the OpenGL ES SL specification
10. What are Extensions?
  - Extension types
    - OES – conformance tested by Khronos
    - EXT – extension supported by more than one IP vendor
    - Proprietary (vendor_ prefix) – extension from a single IP vendor
  - How to check for extensions?
    - getSupportedExtensions(), getExtension() (WebGL)
    - glGetString (OpenGL ES)
  - Number of extensions
    - OpenGL: 400+
    - OpenGL ES: 100+
11. OpenGL Dependencies
  - OpenGL depends on a number of external systems to run
    - A windowing system (abstracted by EGL/WGL/GLX …)
    - External inputs – texture files, 3D modelling tools, shaders, sounds, …
  - OpenGL is directly used by
    - OS/driver developers (of course!)
    - HW IP designers
    - Game studios (optimisation)
    - Researchers (modelling, realism, …)
    - Tools developers
  - Application developers do not generally program against OpenGL directly, but rather through an Android API binding or a Java binding
12. OpenGL 2.0 vs OpenGL ES 2.0
  - All fixed-function functionality is removed
    - Specific drawing calls like fog, etc.
    - Matrix setup
  - Replaced with programmable entities; GLES SL is ~= GL SL
  - Compatibility issues GL ↔ GLES
    - Shaders: GLES SL does not expose fixed-function state (e.g. gl_TexCoord)
  - To enable compatibility: the Architecture Review Board (ARB) extension "GL_ARB_ES2_compatibility"
13. The OpenGL ES API
  - From the Khronos OpenGL ES Reference Card:
    - "OpenGL ES is a software interface to graphics hardware. The interface consists of a set of procedures and functions that allow a programmer to specify the objects and operations involved in producing high-quality graphical images, specifically color images of three-dimensional objects."
  - Keep this document handy for API reference
14. The GL Client–Server Model
  - Client (application on host), server (OpenGL on GPU)
  - The server can have multiple rendering contexts, but one global state
    - The client connects to one of the contexts at any point in time
  - The client sets server state by sending commands
    - Subsequent API calls are thus affected by the states previously set by the client
  - The server expects not to be interrupted by the client during operation
    - Inherent nature of the parallel processor
15. OpenGL Specifications
  - Core GL spec: OpenGL (full version, currently 4.0+); OpenGL ES (currently 2.0; earlier profiles: Common, Common-Lite)
  - Shader spec: GLSL companion (currently 1.20); GLSL-ES companion (currently 1.0.16)
  - Platform integration: EGL (currently 1.3)
  - What we miss in ES compared to the desktop version: polygons, display lists, accumulation buffers, …
16. Programming in OpenGL/ES
  - Step 1: Initialise EGL for rendering – context, surface, window
  - Step 2: Describe the scene (VBOs, texture coordinates) – objects built from triangles, lighting
  - Step 3: Load the textures needed for the objects in the scene
  - Step 4: Compile the vertex and fragment shaders
  - Step 5: Select the output target (FBO, Fb, pixmap …)
  - Step 6: Draw the scene
  - Step 7: Run this in a loop
17. Preliminaries
  - Pixel throughput
    - Memory bandwidth for a 1080p display @ 60 fps
    - Did we forget overdraw?
  - Vertex throughput
    - Vertex throughput for a 100k-triangle scene
  - Tearing
    - Frame switch is uniform, but the driver's frame draw is non-uniform; the real frame switch happens at the display refresh
18. Why Triangles?
  - Connectivity, simplicity, cost
19. ES2.0 Pipeline – what do the APIs look like?
20. The GLES API – Overall View
  - Global platform management (EGL: egl, wgl, glx …)
    - Configuration, antialiasing
    - Context management: surface/window, threading models, context sharing
  - Vertex operations: VBOs, attaching attributes
  - Texture operations: attaching textures, loading texture data, mipmaps
  - Shader operations: loading, compiling, linking to a program; binary shaders
  - Rendering operations: rendering to the framebuffer, rendering to an FBO, RTT
  - State management: CFF, front/back facing, enable/disable (culling, …), get/set uniforms
21. Starting on the GPU
22. Flush and Finish
  - Several types of rendering methods are adopted by GPUs
    - Immediate rendering
    - Deferred rendering
      - Tiled rendering (e.g. QCOM_tiled_rendering)
  - Immediate rendering – everyone is happy, except the memory bus!
  - Deferred – the GPU applies its "intelligence" to skip or reorder certain draw calls or portions of draw calls
    - Used most commonly in embedded GPUs
  - glFlush() – ensures pending operations are kicked off, then returns
  - glFinish() – ensures pending operations are kicked off, waits for completion, then returns
23. A note on WebGL
  - In native code, the application must handle the surface and context itself. A typical EGL sequence:
    - eglGetDisplay
    - eglInitialize – for this display
    - eglBindAPI – bind to a GLES/VG context
    - eglChooseConfig – configure the surface type, GLES1/2/VG, depth/stencil buffer size
    - eglCreateWindowSurface / eglCreatePixmapSurface
    - eglCreateContext
    - eglMakeCurrent
  - Platform initialisation in WebGL is handled by the browser
    - The only configurations are stencil, depth, AA, alpha, preserve
    - No EGL calls in JS application code
    - No multi-threading issues (not yet, but "workers" are coming)
  - The hands-on labs will focus on GL, not EGL
  - Note: the GL context gets lost when the user account is locked, the screen saver starts, etc. – restart the browser as required.
24. Programming – Lab #1
  - ClearColor
  - Clear – clears with the colour specified earlier
  - Flush/Finish
  - The "setup", "vertex", "fragment", and "animate" functions in the class are called by the framework
  - Lab steps:
    - Clear the web cache the first time
    - Log in
    - Open "Default Lab #1" – copy
    - Open "My labs" #1 – paste the code
    - Change the code in setupFunc()
    - Save, run again
  - The link below contains an introductory video for starting the labs:
25. Texturing
26. A note on Binding and Buffer Objects
  - What is "binding"?
    - Associating a server-side object with the client – e.g. a VBO or a texture
    - All objects are associated with a context state
    - Binding an object is ~ copying the object state into the context
    - Removes client → server data movement every time
    - "Transfer once to the server, keep the token, use multiple times later"
    - Good practice: "unbind" after operations – set the binding to 0/null to avoid rogue code changing the state of the bound object
  - Buffer objects
    - Allow data to be stored on the "server", i.e. GPU memory, rather than in client memory (via pointer)
      - The GPU can decide where to place the data for the fastest performance
27. Correct approach to data transfers
  - Generate the object (e.g. glGenBuffers, glGenTextures)
  - Bind the object to an ID "xyz" (glBindBuffer(xyz), …)
  - Transfer data to the object (glBufferData, glTexImage2D)
  - Unbind (glBindBuffer(0))
  - After this point the data remains bound to "xyz" and is managed by the GPU; it can be accessed later by referencing "xyz"
  - Applies to VBOs, textures, …
  - Note the implicit lack of atomicity – needs locking
28. Texturing basics
  - Texture formats available
    - RGB* formats, luminance-only formats
    - Relevance of YUV
  - Texture filtering
    - Maps texture coordinates to object coordinates – think of wrapping cloth over an object
  - Mipmaps
    - Local optimisation – use pre-computed "reduced" size images when the object is far from the viewer, rather than filtering the full image
    - The objective is to reduce bandwidth, not necessarily higher quality
    - The application can generate them and pass them through TexImage2D()
    - The GPU can generate them using GenerateMipmap()
    - Occupies more memory
29. Texturing 3D objects
  - Mapping a bitmap onto a 3D object involves matching the texture coordinates to the object surface
  - Texture coordinates are calculated along with the vertex coordinates
  - 3D tools output texture coordinates along with vertex information for the scene
  - Lab 12 (Sphere), Lab 13
30. Texture Compression types
  - The GLES spec supports RGBA, luminance, … textures
    - Compression is used to reduce memory bandwidth
  - Major texture-compression types
    - PVRTC
    - ETC1
    - Others
  - Android primarily supports ETC1
  - iOS supports PVRTC (and no other)
  - Extension support is queryable using GL API queries
  - How to store this information in a uniform manner? Texture file formats:
    - PVRTC (using the TexTool converter from IMG) is commonly used
    - KTX file format
31. Khronos KTX file format
  - To render a texture, the steps used today:
    - The application needs a-priori knowledge of the texture type, format, storage type, mipmap levels, and the filename or the buffer data itself
    - Then load it into the server using TexImage2D()
  - Proprietary formats exist to break this application+texture dependency
    - e.g. PVRT from IMG
  - The KTX file format from Khronos is a standard way to store the texture information and the texture itself
    - See the next slide for the structure of the file
32. KTX format …
33. Texture coordinates to the GPU
  - Texture coordinates are passed to the GPU as "attributes" along with the vertices
  - Gen–bind–bufferData, then bind the attribute
34. Note on WebGL and Textures
  - Because file loads on browsers are asynchronous, texture loading may not complete before the actual draw
    - Expect a black screen for a very short while until the texture image loads from the website
  - Native applications load textures synchronously, so this issue is not present there
35. Programming with Textures
  - bindTexture
  - pixelStorei (WebGL only)
    - UNPACK_FLIP_Y_WEBGL
  - texImage2D
  - texParameteri
    - TEXTURE_MAG_FILTER
    - TEXTURE_MIN_FILTER
  - Note: WebGL binds "null" instead of "0"
  - Lab #2 (why not square?)
  - Lab #94 – The wall
36. Note on the lab hints
  - Each session ends with a lab. The online lab sessions intentionally contain errors that the reader has to debug before the rendered object appears on screen.
  - Keys are provided for each such lab at the end of the section in this PPT.
37. Lab 94 – Texturing (keys to remove errors)
  - var indexArray = new Uint16Array([0, 1, 2, 2, 1, 3]);
  - var texCoordArray = new Float32Array([0,0, 10,0, 0,10, 10,10]);
  - context.enableVertexAttribArray(1);
  - context.vertexAttribPointer(1, 2, context.FLOAT, context.FALSE, 0, 0);
38. Vertices, Transformations
39. What are vertices?
  - Vertices are points defined in a specific coordinate axes, used to represent 3D geometry
  - At least 3 vertices are needed to define a triangle – one of the primitives supported by GLES
40. Vertex operations
  - Where do vertices come from?
    - Output of modelling tools
    - Mesh rendering / transforms – optimisations
  - For 2D operations (e.g. window systems), just 2 triangles
41. Vertex Attributes
  - A vertex is characterised by its position {x,y,z}
    - {x,y,z} are floating-point values
  - Additionally, normals are required for directional-lighting calculations in the shader
    - 3D tools output the normal map along with the vertex information
  - Additionally, texture coordinates are required
    - Again, 3D tools output the texture coordinates
  - Each HW implementation must support a minimum number of vertex attributes
    - The maximum number can be queried using MAX_VERTEX_ATTRIBS
42. Vertices – CPU to GPU
  - Optimising vertex operations
    - A 3D object has many "common" vertices
      - e.g. a cube has 6*2 triangles, (6*2)*3 = 36 vertices, but only 8 distinct "points"
    - So rather than passing 36 full vertices, pass 8 vertices plus 36 indices into them, to reduce bandwidth
      - Indices can be 16-bit, reducing bandwidth by ~50%
    - GL_ELEMENT_ARRAY_BUFFER and GL_ARRAY_BUFFER
    - STATIC_DRAW, DYNAMIC_DRAW
    - Upload once and re-use, instead of uploading again and again
  - What are Vertex Buffer Objects?
    - genBuffers (createBuffer in WebGL), binding, bufferData/offset and usage
    - Usage of index buffers (ELEMENT_ARRAY_BUFFER)
43. Translation
  - matrix1.translate(X, 0.0, 0);
  - Compare X = 0 with X = 0.4
  - Translation is applied to all objects (the effect does not depend on the depth of the object)
44. Rotation
  - Rotation about the x, y, z axes
  - Observe the effect of an x offset!
  - Refresh M, V, P after every rotation
45. Getting the eye to see the object
  - The "Model" matrix made the object "look" right
  - Now make the object visible to the "eye" – the "View"
    - The eye is always at the origin {0,0,0}
    - So, using matrices, move the current object relative to the eye
  - "LookAt" is implemented in many standard toolkits
    - The LookAt transformation is defined by:
      - A viewpoint – where the view ray starts (eye)
      - A reference point – where the view ray ends, in the middle of the scene (center)
      - A look-"up" direction (up)
  - e.g. the gluLookAt utility function
  - A significant contributor of grey hair
46. Viewport Transformation
  - Converts from the rendering coordinates to the final screen size, i.e. the physical screen
  - Define the viewport using glViewport()
    - The viewport can be an area anywhere within the physical screen
  - This takes care of aspect ratio
    - e.g. a square becomes a rectangle on a laptop screen if the viewport is not matched
  - After the transformation, surviving triangles reach the rasterisation HW and then the fragment shader
47. Summary – The Transformation Sequence
  - (Diagram: worked translation example showing the mathematical steps, including the w component)
48. HW Optimisations
  - Not all triangles are visible; HW can reject based on:
    - Depth
    - Coverage
    - Front-facing or back-facing (culling)
  - Culling is disabled by default per the specification
    - However, most HW does this optimisation by default to save bandwidth and later pixel processing
49. Programming!
  - Recall the bandwidth needed for the vertex transfers per frame
  - Passing vertices
    - Create a buffer object
    - bindBuffer
    - bufferData
    - Indices are passed as type ELEMENT_ARRAY
  - Passing attributes
    - bindAttribLocation
    - enableVertexAttribArray
    - vertexAttribPointer
  - matrix.getAsArray()
  - Lab #913 – Eyeing the eyes
50. Lab 913 – Keys for "Eyeing the eyes"
  - var totalArcs = 36; //shadervertexsetup_tunnel
  - texCoords.push(10 * numArcs / totalArcs);
  - texCoords.push(10 * zslice / numZSlices);
  - matrix1.scale(1.0, 1.0, 1.0); //not 15.0
51. Real-life 3D models
52. Real-life modelling of objects
  - 3D models are stored as a combination of:
    - Vertices
    - Indices / faces
    - Normals
    - Texture coordinates
  - e.g. .OBJ, 3DS, STL, FBX …
    - OBJ records: f, v, v//norm, v/t, o
    - Export of vertices => scaling to the -1.0 to 1.0 range
    - Vertex normals vs face normals
    - Materials (mtl), animations
    - Problem: multiple indices per vertex are not allowed in OpenGL
  - Tools and models
    - Blender, Maya, …
    - (tool for importing multiple types)
    - (Blender models)
  - Tessellation of meshes can be aided by HW in GPUs
53. Programming
  - Loading 3D models is application functionality – no new OpenGL ES APIs are needed
  - A parser is required to parse the model files and extract the vertex, attribute, normal, and texture-coordinate information
  - Look through objdata.js in Lab #9
54. Shaders
55. Vertices, Fragments – Revisited
  - Vertices
    - Points defined in a specific coordinate axes, used to represent 3D geometry
    - At least 3 vertices are needed to define a triangle – one of the primitives supported by GLES
  - Fragments
    - Primitives are "rasterised": the "area" under the primitive is converted to a set of colour pixels that are then placed in the output buffer
56. Shader characteristics
  - Uniforms – uniform across all shader passes
    - Can be updated at run time from the application
  - Attributes – change per shader pass
  - Varyings – passed between the vertex and fragment shaders
    - e.g. written by the vertex shader and used by the fragment shader
    - gl_Position
  - Programs
    - Why do we need multiple programs in an application? For offscreen animation, different effects
  - MAX_VARYING_VECTORS – enum
57. Inputs to the Shaders
  - Vertex shader
    - Vertices, attributes
    - Uniforms
  - Fragment shader
    - Rasterised fragments (i.e. after the rasteriser fixed-function HW)
    - Varyings from the vertex shader
    - Uniforms
58. Fragment Shaders
  - A fragment is a pixel belonging to an area of the target render screen (on-screen or off-screen)
    - Primitives are rasterised after clipping
  - The fragment shader is responsible for the output colour, just before the post-processing operations
  - A fragment shader operates on "1" fragment at a time
  - The minimum number of "texture units" is 8
  - Calculation of colours
    - Colours are interpolated across vertices automatically (ref. Lab 6 in the hands-on session) – i.e. "varyings" are interpolated for the fragment shader during rendering
    - Colours can be generated from a texture "sampler"
    - Each HW has a specific number of "texture units" that need to be activated and have textures assigned for use in the shader
    - Additional information arrives from the vertex shader through "varyings"
  - Outputs
    - gl_FragColor
59. Program
  - Each program consists of 1 fragment shader and 1 vertex shader
  - Within a program, all uniforms share a single global space
60. Advanced Shaders
  - Animation
  - Environment mapping
  - Per-pixel lighting (as opposed to textured lighting)
  - Bump mapping
  - Ray tracers
  - Procedural textures
  - CSS shaders (HTML5 – coming up)
61. Programming with Shaders
  - Pass in the shader strings
  - Compile, link, use
  - Set uniforms
  - Do calculations
  - Lab #96
62. Lab 96 – Keys for "Squishing the slice"
  - uniform mediump float skyline;
  - vec4 tempPos;
  - tempPos = MVPMatrix * inVertex;
  - tempPos.y = min(skyline, tempPos.y); // or, try the line below – one of the 2
  - tempPos.y = min(sin(inVertex.x*5.0) + cos(inVertex.y*2.0), tempPos.y);
  - gl_Position = tempPos;
  - var skylineLoc = context.getUniformLocation(sprogram, "skyline");
  - context.uniform1f(skylineLoc, -0.1);
  - context.drawArrays(context.TRIANGLES, 0, vertexparams[1] / 3);
63. Rendering Targets
64. Rendering Targets
  - A rendering context is required before drawing a scene, along with a corresponding framebuffer
    - Recall bindFramebuffer()
  - It can be:
    - The window-system framebuffer (Fb)
    - An offscreen buffer (implemented via a Frame Buffer Object)
      - An FBO is not a memory area – it is information about the actual colour buffer in memory and the depth/stencil buffers
  - By default, rendering happens to the window-system framebuffer (ID '0')
65. Need for offscreen rendering
  - Special effects
    - Refer to the fire effect specified earlier (multiple passes)
  - Interfacing to "non-display" use-cases
    - e.g. passing video through the GPU, performing 3D effects, then re-encoding back to a compressed format
    - Edge detection / computation – the output is sent to a memory buffer for use by other (non-GL) engines
66. FrameBuffer Object
  - A Frame Buffer Object:
    - Can be just a colour buffer (e.g. a buffer of size 1920x1080x4)
    - Typically also has a depth/stencil buffer
    - FBO ID "0" is never assigned to a new FBO – it refers to the window-system provided framebuffer (onscreen)
    - Renderbuffers and textures can be "attached" to an FBO
      - For a renderbuffer, the application has to allocate the storage
      - For a texture attachment, the GL server allocates the storage
67. Render-To-Texture
  - By binding a texture to an FBO, the FBO can be used as:
    - Stage 1 – the target of a rendering operation
    - Stage 2 – a texture for another draw
    - This is "Render-To-Texture" (RTT)
  - This allows the flexibility of "discretely" using the server for 3D operations (not visible onscreen), then using that output as texture input to a visible object
    - Without RTT, we would have to render to the regular framebuffer and then do CopyTexImage2D() or readPixels(), which are inefficient
  - Offscreen rendering is also needed for dynamic reflections
68. Post-processing operations
  - Blending with the framebuffer enables nice effects (ref. Lab #6)
  - Standard alpha-blending:
    - glEnable(GL_BLEND);
    - glBlendFunc(GL_SRC_ALPHA, GL_ONE);
  - It is a "bad" way of creating effects
    - Reads back the previous framebuffer contents, then blends
    - Makes the application memory-bound, especially at larger resolutions
    - Stalls parallel operations within the GPU
    - The recommended way is to render-to-texture, and blend where necessary in the shader
  - But it is needed for medical-image viewing – e.g. ultrasound images, blending > 128 slices
  69. 69. Programming FBO and back to FB <ul><li>glGenFramebuffers </li></ul><ul><li>glBindFramebuffer </li></ul><ul><ul><li>Makes this FBO the current render target </li></ul></ul><ul><li>glFramebufferTexture2D(…, textureId, …) </li></ul><ul><ul><li>Attaches texture ‘textureId’ as the color buffer, so rendering goes into the texture’s storage </li></ul></ul><ul><li>glDeleteFramebuffers </li></ul><ul><li>Then, draw the offscreen pass into the FBO </li></ul><ul><li>Then, use ‘textureId’ as the texture input for the next draw </li></ul><ul><li>Switching back to the window framebuffer </li></ul><ul><ul><li>Change the binding back to the screen framebuffer (ID 0) </li></ul></ul><ul><ul><li>Load a different set of vertices and a different program as needed </li></ul></ul><ul><ul><li>Set the texture binding to the FBO texture drawn previously </li></ul></ul><ul><ul><li>Issue the glDrawElements call </li></ul></ul><ul><li>FBOs are used to do post-processing effects </li></ul>
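The sequence above can be sketched in C (GLES2.0). This is a minimal sketch, not runnable on its own: it assumes a current EGL context, elides error handling, and the names FBO_W, FBO_H, drawScene() and drawTexturedQuad() are illustrative placeholders:

```c
/* Setup: create a texture and attach it to a new FBO */
GLuint fbo, fboTex;
glGenTextures(1, &fboTex);
glBindTexture(GL_TEXTURE_2D, fboTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, FBO_W, FBO_H, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);   /* storage only, no data */
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, fboTex, 0);
if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
    /* handle incomplete-framebuffer error */
}

/* Pass 1: render the offscreen scene into the FBO */
drawScene();                        /* app-defined draw calls */

/* Pass 2: switch to the window framebuffer (ID 0), sample fboTex */
glBindFramebuffer(GL_FRAMEBUFFER, 0);
glBindTexture(GL_TEXTURE_2D, fboTex);
drawTexturedQuad();                 /* app-defined on-screen draw */
```

Note that the texture is given storage via glTexImage2D with a NULL data pointer; the FBO rendering in pass 1 fills it in.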
  70. 70. Programming <ul><li>Draw a textured rectangle to an FBO </li></ul><ul><li>Using this FBO as a texture, render another rectangle on-screen </li></ul><ul><li>glCheckFramebufferStatus is very important </li></ul><ul><li>Lab #910 </li></ul>
  71. 71. Lab 910 – Keys for “Render to Texture” lab <ul><li>Location, location, location! Also note that readPixels() doesn’t show anything! </li></ul><ul><li>// context.clearColor(1.0, 0.0, 0.0, 1.0); </li></ul><ul><li>// context.clear(context.COLOR_BUFFER_BIT | context.DEPTH_BUFFER_BIT); </li></ul><ul><li>// context.flush(); </li></ul>
  72. 72. Platform Integration
  73. 73. Setting up the platform - EGL <ul><li>Context, Window, Surface </li></ul><ul><li>OpenGL ES – </li></ul><ul><ul><li>EGL_SWAP_BEHAVIOR == EGL_BUFFER_PRESERVED </li></ul></ul><ul><ul><ul><li>Reduces performance </li></ul></ul></ul><ul><ul><li>Anti-aliasing configurations </li></ul></ul><ul><ul><ul><li>EGL_SAMPLES (4 to 16 typically; 4 on embedded platforms) </li></ul></ul></ul><ul><li>WebGL – preserveDrawingBuffer attribute </li></ul><ul><ul><li>If the app is known to clear the buffer every frame, the implementation can skip dirty-region checks and redraw the whole scene efficiently </li></ul></ul><ul><ul><li>A dirty-region check is made in some systems </li></ul></ul>
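A typical EGL bring-up for a GLES2.0 context can be sketched as follows; this is a minimal sketch with error handling elided, `nativeWindow` is a platform-specific handle assumed to exist, and the EGL_SAMPLES value follows the 4x-on-embedded guidance above:

```c
EGLDisplay dpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
eglInitialize(dpy, NULL, NULL);

const EGLint cfgAttribs[] = {
    EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
    EGL_RED_SIZE, 8, EGL_GREEN_SIZE, 8, EGL_BLUE_SIZE, 8,
    EGL_SAMPLE_BUFFERS, 1,   /* enable multisampling */
    EGL_SAMPLES, 4,          /* 4x MSAA – typical on embedded GPUs */
    EGL_NONE
};
EGLConfig cfg; EGLint numCfg;
eglChooseConfig(dpy, cfgAttribs, &cfg, 1, &numCfg);

const EGLint ctxAttribs[] = { EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE };
EGLContext ctx  = eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, ctxAttribs);
EGLSurface surf = eglCreateWindowSurface(dpy, cfg, nativeWindow, NULL);
eglMakeCurrent(dpy, surf, surf, ctx);

/* Optional: preserved swap – costs performance, as noted above */
/* eglSurfaceAttrib(dpy, surf, EGL_SWAP_BEHAVIOR, EGL_BUFFER_PRESERVED); */
```

After eglMakeCurrent succeeds, the GL ES 2.0 entry points used elsewhere in this deck operate on this context and surface.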
  74. 74. Android Integration Details <ul><li>Android composition uses GLES2.0 mostly as a pixel processor, not a vertex processor </li></ul><ul><ul><li>Geometrically uninteresting rectangular windows, each treated as a texture </li></ul></ul><ul><ul><ul><li>6 vertices (two triangles per window) </li></ul></ul></ul><ul><ul><li>Blending of translucent screens/ buttons/ text </li></ul></ul><ul><li>3D (GLES2.0) is natively integrated </li></ul><ul><ul><li>3D Live wallpaper backgrounds </li></ul></ul><ul><ul><li>Video morphing during conferencing (?) </li></ul></ul><ul><ul><li>Use the NDK </li></ul></ul>
  75. 75. Android SurfaceFlinger architecture <ul><li>Introduction to OpenGL interface on Android </li></ul><ul><ul><li> </li></ul></ul><ul><li>HW acceleration on Android 3.0 / 4.0 </li></ul><ul><ul><li> </li></ul></ul>
  76. 76. Optimising OpenGL / ES applications <ul><li>Graphics performance is closely tied to the specific HW </li></ul><ul><ul><li>Size of interface to memory, cache lines </li></ul></ul><ul><ul><li>HW shared with the CPU – e.g., dedicated memory banks </li></ul></ul><ul><ul><li>Power vs raw performance </li></ul></ul><ul><ul><li>Intelligent discarding of vertices/ objects (!) </li></ul></ul><ul><li>Performance is typically limited by </li></ul><ul><ul><li>Memory throughput </li></ul></ul><ul><ul><li>GPU pixel operations per GPU clock </li></ul></ul><ul><ul><li>CPU throughput for operations involving vertices </li></ul></ul><ul><ul><li>Load balancing of units within the GPU </li></ul></ul><ul><li>GPUs integrated into SoCs are more closely tied to the CPU than discrete GPUs </li></ul><ul><ul><li>E.g., GPU drivers offload some operations to the CPU </li></ul></ul>
  77. 77. Debugging OpenGL <ul><li>Vanishing vertices, holes </li></ul><ul><li>Improper lighting </li></ul><ul><li>Missing objects in complex scenes </li></ul><ul><li>Windows Tools </li></ul><ul><ul><li>PerfKit/ GLExpert/ gDEBugger </li></ul></ul><ul><ul><li>Intel GPA </li></ul></ul><ul><li>Linux Tools </li></ul><ul><ul><li>PVRTune (IMG) </li></ul></ul><ul><ul><li>gDEBugger </li></ul></ul><ul><ul><li>Standard kernel tools </li></ul></ul><ul><ul><li>Intel GPA </li></ul></ul><ul><li>Tuning knobs: pixel vs vertex throughput, CPU loading, FPS, memory-limited checks </li></ul>
  78. 78. References <ul><li>Specs - </li></ul><ul><li>CanvasMatrix.js </li></ul><ul><ul><li> </li></ul></ul><ul><li>Tools - </li></ul><ul><ul><li> (from Maya) </li></ul></ul><ul><ul><li> - Asset importer </li></ul></ul><ul><li>ARM – Mali – Architecture Recommendations </li></ul><ul><ul><li> </li></ul></ul><ul><li>Optimising games – simple tips </li></ul><ul><ul><li> </li></ul></ul>
  79. 79. Appendix: Video and Graphics <ul><li>Graphics is computed creation </li></ul><ul><ul><li>Video is recorded as-is </li></ul></ul><ul><li>Graphics is object-based </li></ul><ul><ul><li>Video (today) is not </li></ul></ul><ul><li>Graphics is fully computed every frame </li></ul><ul><ul><li>Video is mostly delta sequences </li></ul></ul><ul><ul><ul><li>Motion detection, construction, compensation </li></ul></ul></ul><ul><ul><ul><li>But extensions like swap_region (Nokia) exist </li></ul></ul></ul>
  80. 80. Q & A, Feedback <ul><li>Feedback </li></ul><ul><ul><li> </li></ul></ul>