Mobile Graphics, The Need for Open Source Drivers


Published on

The presentation explains the need for open source graphics drivers in the mobile space. Unlike the PC, the mobile system is more dynamic and has a broader innovation space. Will Android push this among the IP vendors? More information on

Published in: Technology, Art & Photos
  • open gl

  1. 1. MOBILE GRAPHICS: The need for Open Source Drivers & Stack
      Harsha Padmanabha
  2. 2. TOPICS
      • Mobile Graphics: An Introduction
      • Application Processors: Missing datasheets
      • Mobile graphics pipeline & cores
      • A developer's perspective on Graphics & Games
      • Case Study: Developing an Android based product
  3. 3. TYPICAL MOBILE SOC
      • OS: Android, Linux, MeeGo, Symbian, Windows Mobile
      • ARM
      • Bus Interconnect
      • Graphics: OpenGL ES 1.1, OpenGL ES 2.0
      • Communication: LTE, 3G, Wifi, Bluetooth
      • Audio: MP3, AAC, 3GPP
      • Display: Post Processing, DeInterlacing
      • Video: MPEG-1/2/4, H.264, VC-1
  4. 4. IPHONE: A4 PROCESSOR
      • A4 Processor
          • Main processor for the Apple iPad
          • 1GHz ARM Cortex A8 45nm core
          • NEON SIMD
      • Cache size:
          • L1I$ = 32KB
          • L1D$ = 32KB
          • L2 = 512KB
      • Graphics engine:
          • PowerVR SGX 3D engine from Imagination Technologies
      • Video Engine
          • PowerVR VXD / VXE: MultiProfile video encode/decode
      • 53mm² on 45nm LP
      (Die photo labels: ARM A8, VXD, SGX 535)
  5. 5. NVIDIA TEGRA 250
      • 8 separate cores
      • Dual ARM Cortex A9
      • ARM9 300MHz
      • 1 HD Encode, 1 Decode
      • OGLES 2.0 Graphics core
      • Image Processor, for HD Camera
      • Lots of internal memories
      (Diagram labels: Dual ARM A9, ARM 9, HD Decode, HD Encode, 2D/3D GPU)
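  6. 6. MOBILE OS MARKET SHARE
      (Pie chart; the pairing of values to labels is not recoverable from the transcript)
      • Values: 2%, 2%, 18%, 41%, 17%, 5%, 14%
      • Legend: Symbian, iOS, Windows Mobile, Android, RIM, Linux, Others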
  7. 7. TOP APPLICATIONS
      Rank and category:
      1 Games
      2 Social Networking
      3 Mail & Messaging
      4 Music
      5 Entertainment
      6 Weather
      7 Sports
      8 Education & Employment
      9 News
      10 Health & Fitness
  8. 8. TODAY'S GRAPHICS PERFORMANCE
      Columns: Processor, GPU, MHz, (M) Tri/Sec, (M) Pixels/Sec
      • Texas Instruments OMAP, PowerVR SGX 535: 130 MHz, 28 M tri/sec, 500* M pixels/sec
      • ST Micro, ARM MALI 400: 275 MHz, 30 M tri/sec, 400* M pixels/sec
      • Marvell ARMADA, Vivante GC 800: 250 MHz, 25 M tri/sec, 375* M pixels/sec
      * Estimates as per marketing data
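The raw rates on this slide are easier to compare once normalized by clock. The following is a minimal sketch, assuming the column reading MHz, M tri/sec, M pixels/sec; `per_clock` and `fill_rate_fps_bound` are hypothetical helper names, and the fill-rate bound is a deliberately naive model that ignores everything except pixel throughput.

```python
# Figures copied from the slide; the pixel rates are marketing estimates.
GPUS = {
    "PowerVR SGX 535": {"mhz": 130, "mtri_per_s": 28, "mpix_per_s": 500},
    "ARM MALI 400": {"mhz": 275, "mtri_per_s": 30, "mpix_per_s": 400},
    "Vivante GC 800": {"mhz": 250, "mtri_per_s": 25, "mpix_per_s": 375},
}


def per_clock(gpu):
    """Return (triangles per clock, pixels per clock) for one table row."""
    return gpu["mtri_per_s"] / gpu["mhz"], gpu["mpix_per_s"] / gpu["mhz"]


def fill_rate_fps_bound(gpu, width, height, overdraw=1.0):
    """Upper bound on frames/sec if pixel fill rate were the only limit.

    'overdraw' (average number of times each pixel is written) is an
    assumed parameter, not something the slide provides.
    """
    pixels_per_frame = width * height * overdraw
    return gpu["mpix_per_s"] * 1e6 / pixels_per_frame


if __name__ == "__main__":
    for name, gpu in GPUS.items():
        tri, pix = per_clock(gpu)
        print(f"{name}: {tri:.2f} tri/clock, {pix:.2f} pixels/clock")
```

Real frame rates sit far below such a bound, which is part of the talk's argument: the marketing numbers say little about what the driver stack delivers.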
  9. 9. VISUAL COMPUTING
      (Diagram of API roles; logos not recoverable. Labels:)
      • Authoring & Accessibility
      • 3D Digital Asset Exchange format
      • Plugin-free 3D Web Content
      • Mobile OS resource abstraction
      • High-level Multimedia Frameworks
      • High-level Streaming Media Recording and Playback
      • Enhanced Audio
      • Inter-API Interoperability Hub
      • Embedded 3D
      • Acceleration
      • Heterogeneous Parallel Programming
      • Streaming Media and Image Processing Codec Creation
      • Vector 2D
      • Window System Acceleration
      • System Integration
      All logos & trademarks are copyright
  10. 10. MOBILE GRAPHICS API
      • OpenGL ES 1.1
          • Fixed-function API derived from OpenGL 1.5
      • OpenGL ES 2.0
          • Completely programmable API, GLSL support, derived from OpenGL 2.0
      • OpenGL ES 3.0*
          • Future, planned for 2011/12, cross compatibility with OpenGL 4.0
      • EGL 1.4
          • Visual API interoperability hub
      • OpenCL 1.1
          • Open compute, supports GPGPU on multicore architecture
  11. 11. OPENGL ES VERSIONS
      • OpenGL ES 1.1: fixed-function pipeline
          • Based on OpenGL 1.5
          • Vertex Arrays / Buffer Objects
          • Transform & Lighting
          • Multi-texturing (min 2 units)
          • Fixed-point & Floating-point profiles
      • OpenGL ES 2.0: programmable pipeline
          • Based on OpenGL 2.0
          • Adds vertex and fragment shader programming
          • Removes fixed function pipeline
          • Super-compact, efficient API
          • High level language (GLSL ES)
          • On-line or off-line compilation
      Images Copyright Rightware
  12. 12. OPENGL ES 2.0 PIPELINE
      (Pipeline diagram; labels as they appear on the slide)
      • API
      • Vertex Buffer Objects
      • Triangles/Lines/Points
      • Primitive Processing
      • Vertex Shader
      • Primitive Processing
      • Primitive Processing
      • Fragment Shader
      • Alpha Test
      • Depth Test
      • Colour Buffer Blend
      • Dither
      • Framebuffer
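The tail of this pipeline, where a shaded fragment meets the depth and colour buffers, can be sketched in a few lines. This is an illustrative software model and not driver code: it assumes a less-than depth comparison followed by source-over blending, and all function names are made up for the example.

```python
def depth_test(frag_depth, stored_depth):
    """GL_LESS-style comparison: keep the fragment only if it is nearer."""
    return frag_depth < stored_depth


def blend(src, dst):
    """Source-over blend, as with glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA).

    The same factors are applied to all four RGBA components.
    """
    alpha = src[3]
    return tuple(s * alpha + d * (1.0 - alpha) for s, d in zip(src, dst))


def write_fragment(framebuffer, depthbuffer, x, y, color, depth):
    """Run one shaded fragment through depth test and blend.

    Returns True if the fragment reached the framebuffer.
    """
    if not depth_test(depth, depthbuffer[y][x]):
        return False
    framebuffer[y][x] = blend(color, framebuffer[y][x])
    depthbuffer[y][x] = depth
    return True


if __name__ == "__main__":
    fb = [[(0.0, 0.0, 0.0, 1.0)]]   # 1x1 colour buffer, opaque black
    db = [[1.0]]                    # 1x1 depth buffer cleared to the far plane
    write_fragment(fb, db, 0, 0, (1.0, 0.0, 0.0, 0.5), 0.5)
    print(fb[0][0], db[0][0])
```

On real hardware these stages are fixed-function units configured through API state, which is why the driver's state handling matters as much as the shader compiler.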
  13. 13. PROGRAMMABLE HARDWARE
      (Block diagram; labels:)
      • Input Assembly
      • Rasterizer
      • Alpha / Depth Test
      • Output Blend
      • Shader Core (×4)
      • Texture Cache Fetch (×2)
      • Scheduler
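  14. 14. DIFFUSE SHADER
      precision mediump float;
      uniform sampler2D my_color_texture;   // declared on the slide, not used below
      // The slide omits these declarations; they are assumed to be varyings.
      varying vec3 normal;
      varying vec3 vertex_to_light_vector;
      void main()
      {
          // Defining the material colors
          // (AmbientColor is not defined on the slide; this value is an assumption)
          const vec4 AmbientColor = vec4(0.1, 0.0, 0.0, 1.0);
          const vec4 DiffuseColor = vec4(1.0, 0.0, 0.0, 1.0);
          // Scaling the input vectors to length 1
          vec3 normalized_normal = normalize(normal);
          vec3 normalized_vertex_to_light_vector = normalize(vertex_to_light_vector);
          // Calculating the diffuse term and clamping it to [0;1]
          float DiffuseTerm = clamp(dot(normalized_normal, normalized_vertex_to_light_vector), 0.0, 1.0);
          // Calculating the final color
          gl_FragColor = AmbientColor + DiffuseColor * DiffuseTerm;
      }
      Shader compiler (diagram: Input Fragment, <diffuseShader>, Output Shaded Fragment):
      <diffuseShader>:
          sample r0, v4, t0, s0
          mul r3, v0, cb0[0]
          madd r3, v1, cb0[1], r3
          madd r3, v2, cb0[2], r3
          clmp r3, r3, l(0.0), l(1.0)
          mul o0, r0, r3
          mul o1, r1, r3
          mul o2, r2, r3
          mov o3, l(1.0)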
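The arithmetic in this fragment shader can be checked on the CPU. Below is a minimal Python sketch of the same diffuse (Lambert) term; the ambient colour is an assumed value because the slide never defines `AmbientColor`, and the helper names simply mirror the GLSL built-ins.

```python
import math


def normalize(v):
    """Scale a vector to length 1, like GLSL normalize()."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    """Dot product, like GLSL dot()."""
    return sum(x * y for x, y in zip(a, b))


def clamp(x, lo, hi):
    """Clamp x into [lo, hi], like GLSL clamp()."""
    return max(lo, min(hi, x))


def diffuse_fragment(normal, vertex_to_light,
                     ambient=(0.1, 0.0, 0.0, 1.0),    # assumed value
                     diffuse=(1.0, 0.0, 0.0, 1.0)):   # DiffuseColor from the slide
    """Return the RGBA colour the shader would write to gl_FragColor."""
    n = normalize(normal)
    l = normalize(vertex_to_light)
    term = clamp(dot(n, l), 0.0, 1.0)
    return tuple(a + d * term for a, d in zip(ambient, diffuse))


if __name__ == "__main__":
    print(diffuse_fragment((0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))   # light at 45 degrees
    print(diffuse_fragment((0.0, 0.0, 1.0), (0.0, 0.0, -1.0)))  # light behind the surface
```

A reference like this is useful when comparing emulator output against device output, since the two are not guaranteed to be pixel accurate.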
  15. 15. THE DRIVER ARCHITECTURE
      • User Space: Shader Compiler, OpenGL ES, EGL, L2-Stack
      • Kernel Space: GPU Driver, FB Driver
      • Hardware [ GPU + Display Controller ]
  16. 16. KERNEL DRIVERS
      • Device State & Information
      • Memory Management
      • Register Allocation
      • IRQ handling
      • Performance Counters
      • Framebuffer Access
  17. 17. USER SPACE DRIVER
      • Based on OpenGL ES state trackers
          • Consider OpenGL ES as a big state machine
          • Sometimes contains essential algorithms for GPU manipulation
      • Use of host CPU for setup
          • Tessellation, for example
          • Emulate calls not supported by the GPU
      • Just-In-Time shader compiler
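The "big state machine" view can be made concrete with a toy tracker. The sketch below is illustrative only: `StateTracker`, its method names, and the command tuples are hypothetical and do not correspond to any real driver's interface. It shows one job a user-space driver typically does on the host CPU, namely filtering redundant state changes before they reach the GPU command stream.

```python
class StateTracker:
    """Toy model of a user-space driver tracking GL-style state."""

    def __init__(self):
        self.state = {}      # last value submitted for each piece of state
        self.commands = []   # stand-in for the GPU command stream

    def set_state(self, name, value):
        """Record a state change; skip it if the GPU already has this value."""
        if name in self.state and self.state[name] == value:
            return False     # redundant change, nothing emitted
        self.state[name] = value
        self.commands.append(("SET", name, value))
        return True

    def draw(self, primitive, vertex_count):
        """Emit a draw command that uses whatever state is current."""
        self.commands.append(("DRAW", primitive, vertex_count))


if __name__ == "__main__":
    tracker = StateTracker()
    tracker.set_state("blend", True)
    tracker.set_state("blend", True)   # filtered out
    tracker.draw("triangles", 36)
    print(tracker.commands)
```

How aggressively a closed driver filters, batches, or emulates calls is invisible to the application, which is one reason the same GPU can perform differently across software stacks.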
  18. 18. GRAPHICS PC EMULATOR
      • User Space: Shader Compiler, OpenGL ES, EGL, MESA GL / Proprietary OGL, GLSL
      • Kernel Space: GPU Driver, FB Driver
      • Hardware [ GPU + Display Controller ]
  19. 19. MEEGO QEMU EMULATOR
      (Block diagram; labels:)
      • Client OS: Kernel, Application, X Server, Virtual I/O Driver, LibGL Client
      • QEMU HOST: Virtual I/O Device, Process, Frame Buffer Management, LibGL Server Stub, Offscreen Buffer
  20. 20. THE EMULATOR DRAWBACKS
      • Does not emulate all OpenGL ES functions
          • OpenGL ES is usually mapped to OS OpenGL calls
          • Shader compilers are not optimized
          • Too many SW layers, consider QEMU as well
      • Rendering is not pixel accurate
      • Most emulators don't support texture compression formats
      • Texture bandwidth & memory are not accurately depicted
      • On Intel integrated GPUs rendering is mostly SW
      • Only recently have Nvidia & AMD started supporting OGLES 2
      • Performance counters?, PBuffers, depth size variations...
  21. 21. ANDROID GRAPHICS ARCHITECTURE
      (Stack diagram with a User Space / Kernel Space boundary; labels:)
      • Android JSR
      • Surface Flinger
      • JNI Wrapper (×2)
      • OpenGL ES
      • SKIA
      • EGL
      • copyBLIT
      • libagl SW renderer
      • libSkiaHW
      • HAL
      • ARM + NEON
      • Gralloc
      • GPU KDrv
      • FBDev
  22. 22. MEEGO GRAPHICS ARCHITECTURE
      • User Space: QML + API, QT Paint Engine, QT OpenGL Wrapper, QScreen, X Server (optional), OpenGL ES 2.0, OpenGL ES 1.1, EGL
      • Kernel Space: GPU KDrv, FBDev
  23. 23. A PLATFORM SCENARIO
      • Consider a SoC with ARM Cortex A9 + SGX535 GPU
      • Intel Atom Z6xx with x86 + SGX535
      • As a system integrator you want to "sell products"
      • Simpler to maintain a common set of drivers
      • But GPU drivers are an issue
          • Android userspace drivers are compiled against bionic libc
          • MeeGo, for example, uses glibc and a generic Linux stack
          • DirectX support only on Intel SGX, not on TI OMAP
          • Symbol & linker errors
          • PC development uses an emulation layer with PC graphics
      • Performance on the same GPU varies
  24. 24. MOBILE GRAPHICS BIG PICTURE
      Columns: platform, PC Development, Device
      • Android: QEMU ARM Emulation, OGLES based on AMD; devices: PowerVR SGX, Qualcomm AMD, Nvidia Tegra
      • iPhone: Simulator, x86 code, OpenGL -> OGLES; devices: PowerVR MBX, PowerVR SGX
      • MeeGo: QEMU ARM Emulation, X11, OGLES Passthrough; devices: PowerVR SGX, ARM MALI
  25. 25. GPU IP Emulators
      Columns: vendor, Emulator, Features, API Support
      • Imagination Tech: Win32 - OpenGL, Linux - Mesa/OpenGL; No PVRTC support; OpenGL ES 2.0
      • ARM MALI: Win32 - OpenGL, Linux - Mesa/OpenGL; OpenGL ES 2.0
      • Nvidia Tegra: Win32 - OpenGL ES; No Antialiasing, No ETC1; OpenGL ES 2.0
      • Vivante: Win32 - OpenGL; Basic; OpenGL ES 2.0
      • Qualcomm*: Win32 - OpenGL, Android - OpenGL; Supports performance counters; OpenGL ES 2.0
  26. 26. THE DEVELOPER
      • OpenGL ES is a standard; conformance guarantees the implementation & functionality
      • Fragmentation, a lot of players
      • Khronos does not specify
          • Texture compression formats: ETC, ETC2, S3
          • Non power of 2 textures, wrapping & mip-mapping
          • Shader Application Binary Interface
          • Depth size is not constant; it varies: 24 bit on PVR, 16 bit on Tegra
          • Font rendering, tessellation of geometry
          • Although FSAA has become a standard
          • There is no API to turn off the AA feature
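In practice, applications cope with this fragmentation by inspecting the extension string at run time and shipping several variants of each asset. The sketch below assumes the space-separated string returned by `glGetString(GL_EXTENSIONS)` has already been obtained; the preference order, the format tags, and the function names are illustrative choices, not a recommendation from the talk.

```python
# Extension names are the registered Khronos names; the preference order
# and the short format tags are assumptions made for this example.
FORMAT_BY_EXTENSION = [
    ("GL_IMG_texture_compression_pvrtc", "pvrtc"),
    ("GL_OES_compressed_ETC1_RGB8_texture", "etc1"),
    ("GL_EXT_texture_compression_s3tc", "s3tc"),
]


def pick_texture_format(extension_string):
    """Choose a compressed texture format the device advertises.

    Falls back to uncompressed textures when none of the known
    extensions is present.
    """
    available = set(extension_string.split())
    for extension, fmt in FORMAT_BY_EXTENSION:
        if extension in available:
            return fmt
    return "uncompressed"


def depth_step(bits):
    """Smallest step of a normalized fixed-point depth buffer with 'bits' bits."""
    return 1.0 / (2 ** bits - 1)


if __name__ == "__main__":
    print(pick_texture_format("GL_OES_depth24 GL_IMG_texture_compression_pvrtc"))
    print(depth_step(16) / depth_step(24))   # how much coarser 16 bit depth is
```

The depth helper shows why the 16 bit versus 24 bit difference matters: scenes tuned on one device can show depth fighting on another without any change to application code.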
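  27. 27. WHAT'S HIDDEN IN SW
      Test platform:
      • SoC: OMAP 3730
      • SoC Process: 45nm
      • OS: Android 2.2
      • Mobile DDR2: 256MBytes
      • Screen Resolution: 800x480
      • CPU: ARM Cortex A8
      • DSP: C64x TI DSP
      • GPU: SGX 530
      Results (GPU at 200MHz):
      • Draw Image: 16.0 with ARM at 600MHz, 21.0 with ARM at 800MHz
      • Utah TeaPot: 30.33 with ARM at 600MHz, 39.19 with ARM at 800MHz
      • OGLES Fog: 110.71 with ARM at 600MHz, 129.78 with ARM at 800MHz
      • OGLES Blending: 105.43 with ARM at 600MHz, 118.92 with ARM at 800MHz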
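One way to read these numbers is to ask how closely each score follows the CPU clock while the GPU clock stays fixed. The sketch below assumes the two result columns correspond to the ARM core at 600MHz and 800MHz with the GPU at 200MHz in both; the "CPU-bound fraction" is an ad hoc measure defined here for illustration, not a metric from the talk.

```python
CPU_MHZ = (600, 800)

# Scores copied from the slide (units are not stated there).
RESULTS = {
    "Draw Image": (16.0, 21.0),
    "Utah TeaPot": (30.33, 39.19),
    "OGLES Fog": (110.71, 129.78),
    "OGLES Blending": (105.43, 118.92),
}


def scaling(results, cpu_mhz=CPU_MHZ):
    """Return {test: (speedup, cpu_bound_fraction)}.

    cpu_bound_fraction is 1.0 when the score rose exactly as much as the
    CPU clock and 0.0 when it did not move at all.
    """
    clock_ratio = cpu_mhz[1] / cpu_mhz[0]
    out = {}
    for name, (slow, fast) in results.items():
        speedup = fast / slow
        out[name] = (speedup, (speedup - 1.0) / (clock_ratio - 1.0))
    return out


if __name__ == "__main__":
    for name, (speedup, fraction) in scaling(RESULTS).items():
        print(f"{name}: x{speedup:.3f}, CPU-bound fraction {fraction:.2f}")
```

Under this reading, Draw Image tracks the CPU clock almost completely (a fraction of about 0.94) while blending stays below 0.4, which suggests how much of the "GPU" work for some paths is actually done in software on the ARM core.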
  28. 28. SUMMARY OF THE ISSUES
      • Many APIs ( OpenGL ES, OpenVG, OpenCL, OpenMAX )
      • Many OSes ( Linux, Android, MeeGo, Symbian, WinPhone7 )
      • Many GPU IPs ( SGX, MALI, TEGRA, Vivante, DMP )
      • Binary portability of shader programs
      • Extensions differentiate GPUs, but also fragment the platform
      • No easy transition between vendors' IPs
      • SW/HW partition is not transparent
      • Performance of emulators ( Linux, Windows, OS X )
  29. 29. DRIVER ARCHITECTURE
      • User Space: Shader Compiler, OpenGL ES, EGL, L2-Stack
      • Kernel Space: GPU Driver, FB Driver
      • Hardware [ GPU + Display Controller ]
      (Diagram annotation: Open Source)
  30. 30. USER SPACE DRIVER
      (Block diagram; labels:)
      • APIs: OpenGL ES 2.0, OpenGL ES 1.1, OpenCL 1.1
      • Gallium
      • LLVM
      • SGX LLVM Backend
      • MALI LLVM Backend
      • PC GPU Emulation LLVM Backend
      • OpenGL ES 2.0 PC Emulation
      Source: , 10-lattner-OpenGL.pdf
  31. 31. HOW DOES IT WORK
      • OpenGL ES, OpenCL, OpenMAX are all standard APIs
      • Gallium outputs LLVM intermediate representation (IR)
      • LLVM optimizes this IR
      • Each GPU driver has an LLVM backend description
      • The LLVM backend describes
          • Instruction set
          • Registers
          • Constraints
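The split between a shared optimizer and per-GPU backend descriptions can be illustrated with a toy model. Nothing below is the LLVM or Gallium API: the tuple-based IR, the `Backend` class, and both example backends are invented for this sketch. The point is only that the optimization pass is written once, while each target contributes a small description of its instruction set, registers, and constraints.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for a GPU backend description."""
    name: str
    instructions: dict   # IR opcode -> target mnemonic (instruction set)
    registers: int       # number of temporaries available (constraint)


def fold_constants(ir):
    """Target-independent pass: evaluate ops whose operands are both literals."""
    out = []
    for op, dst, a, b in ir:
        if op in ("add", "mul") and isinstance(a, float) and isinstance(b, float):
            value = a + b if op == "add" else a * b
            out.append(("mov", dst, value, None))
        else:
            out.append((op, dst, a, b))
    return out


def lower(ir, backend):
    """Target-specific step: map IR opcodes to mnemonics and check constraints."""
    used = {dst for _, dst, _, _ in ir}
    if len(used) > backend.registers:
        raise ValueError(f"{backend.name}: needs {len(used)} registers, has {backend.registers}")
    lines = []
    for op, dst, a, b in ir:
        if op not in backend.instructions:
            raise ValueError(f"{backend.name}: no instruction for {op}")
        operands = ", ".join(str(x) for x in (dst, a, b) if x is not None)
        lines.append(f"{backend.instructions[op]} {operands}")
    return lines


# Two made-up targets sharing the same optimizer.
GPU_A = Backend("gpu_a", {"mov": "MOV", "mul": "MUL", "add": "ADD"}, registers=4)
GPU_B = Backend("gpu_b", {"mov": "mov", "mul": "fmul", "add": "fadd"}, registers=2)

PROGRAM = [("mul", "r0", 2.0, 0.5), ("add", "r1", "r0", "v0")]

if __name__ == "__main__":
    optimized = fold_constants(PROGRAM)
    for backend in (GPU_A, GPU_B):
        print(backend.name, lower(optimized, backend))
```

Adding a third GPU in this model means writing one more `Backend` value, which is the maintenance argument the slides make for a common open infrastructure.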
  32. 32. WHAT DOES THIS OFFER
      • Portable infrastructure supports all GPUs & CPUs
      • Reuse of optimization paths
          • PowerPC ( Altivec )
          • ARM ( Neon )
          • x86 ( SSE )
      • Smaller driver, simpler to maintain
      • Interface to inject proprietary code as LLVM backend
      • Support a common set of graphics functions across platforms
      • Support a number of different APIs and bring cohesion
  33. 33. REALITY CHECK
      • OpenGL support on x86 without an explicit graphics card, Leopard (OS X 10.5)
          • Step 1: gcc front end parses OpenGL C code
          • Step 2: GLSL or shader is compiled using clang/LLVM
          • Step 3: LLVM IR is produced to be further optimized by LLVM JIT
      (Diagram labels: C Code, Opcode functions, llvm-gcc, Bytecode, Disk, OpenGL GLSL, OpenGL Parser, OpenGL AST, OpenGL to LLVM, LLVM IR, LLVM Optimizer, LLVM JIT)
  34. 34. REALITY CHECK
      • OpenCL compilation process on Snow Leopard (OS X 10.6)
          • Step 1: Compile OCL to LLVM IR (Intermediate Representation)
          • Step 2: Compile to target device
      • NVIDIA GPU device compiles the LLVM IR in two steps:
          • LLVM IR to PTX (CUDA IR)
          • PTX to target GPU
      • CPU device uses LLVM x86 BE to compile directly to x86 binary code.
      (Diagram labels: OpenCL Compute Program, OpenCL front-end [ clang ], LLVM IR, X86 LLVM Back-End, NVIDIA Back-End, X86 Binary, PTX IR, G80 Binary, G92 Binary, G200 Binary)
  35. 35. CAN IT BECOME A REALITY
      • Fabless companies should force this on IP vendors
      • Projects like Android & MeeGo can help push SoC providers
      • Ultimately it's the product companies: Nokia, Motorola, etc.
      • Developers
      • Maybe a commercial FOSS company
  36. 36. MISC REFERENCES IN SLIDES
      • [Slide-1]
      • ATI RUBY: &menu=browser&image_id=999114&article_id=680582&page=1&show=original
  37. 37. Components of OpenGL ES 2.0 simulator
      (Diagram labels:)
      • Application
      • Included headers: egl.h, eglext.h, eglplatform.h, khrplatform.h, gl2.h, gl2ext.h, gl2platform.h
      • Linked libraries: libEGL.lib, libGLESv2.lib
      • Runtime libraries: libEGL.dll, libGLESv2.dll, OpenGL32.dll
      • Platform: WinXP GDI, Linux X11
      • GPU