© Copyright Khronos Group, 2005 - Page 1


  1. 1. OpenGL ES Development for OMAP 2™ Processors under Linux. Tom Olson, Corporate Research and Development, Texas Instruments, Inc.
  2. 2. Outline <ul><li>TI, Linux, and the OMAP 2 Architecture </li></ul><ul><li>Ways to work with the OMAP 2 Architecture </li></ul><ul><li>Platform Overview </li></ul><ul><li>Linux for the OMAP 2 Architecture </li></ul><ul><li>Optimizing Graphics for the platform </li></ul><ul><li>Recommended Development Setup </li></ul><ul><li>Development Environment Demo </li></ul><ul><li>Q&A </li></ul>
  3. 3. Texas Instruments <ul><li>The Leader in Wireless </li></ul><ul><ul><li>#1 in wireless semiconductor revenue (Dataquest) </li></ul></ul><ul><ul><li>#1 in wireless applications processors (IDC) </li></ul></ul><ul><ul><li>Six of the top seven 3G handset makers use TI silicon </li></ul></ul><ul><ul><li>All 3G NTT DoCoMo FOMA manufacturers use TI OMAP </li></ul></ul><ul><li>A Strong Supporter of Standards </li></ul><ul><ul><li>Khronos Promoting member </li></ul></ul><ul><ul><li>Active participant in OpenGL ES and OpenMAX </li></ul></ul>
  4. 4. The OMAP 2 Architecture and Linux <ul><li>OMAP 2 Architecture </li></ul><ul><ul><li>TI’s next-gen wireless applications processor family </li></ul></ul><ul><ul><li>A universal mobile entertainment platform </li></ul></ul><ul><ul><li>Targets media, imaging, graphics, gaming </li></ul></ul><ul><ul><li>First family member (OMAP2420) is sampling now </li></ul></ul><ul><li>Linux </li></ul><ul><ul><li>A full-featured, flexible, robust, OS - and it’s free! </li></ul></ul><ul><ul><li>Growing rapidly in embedded applications </li></ul></ul><ul><ul><li>Fully supported on TI OMAP and OMAP 2 </li></ul></ul>
  5. 5. Working with OMAP 2 Architecture <ul><li>You can work with </li></ul><ul><ul><li>TI </li></ul></ul><ul><ul><li>Handset OEMs </li></ul></ul><ul><ul><li>Operators </li></ul></ul><ul><li>You can work on </li></ul><ul><ul><li>TI development boards </li></ul></ul><ul><ul><li>OEM prototype handsets </li></ul></ul><ul><ul><li>Real handsets </li></ul></ul><ul><li>You can create </li></ul><ul><ul><li>Demos, benchmarks </li></ul></ul><ul><ul><li>Middleware </li></ul></ul><ul><ul><li>Pre-installed games </li></ul></ul><ul><ul><li>Downloadable games </li></ul></ul><ul><li>You can work in </li></ul><ul><ul><li>Low-level C/C++ </li></ul></ul><ul><ul><li>C/C++ middleware </li></ul></ul><ul><ul><li>JSR 184/239, DoJa </li></ul></ul>
  6. 6. OMAP 2 Architecture Overview <ul><li>OMAP2420 Features </li></ul><ul><li>ARM 1136 @ 330 MHz, VFP (Vector Floating Point), 32K/32K I/D cache </li></ul><ul><li>DSP @ 220 MHz </li></ul><ul><li>2D/3D graphics accelerator </li></ul><ul><li>IVA supports still images to >4 Mpixels, 30 fps VGA video decode </li></ul><ul><li>Output to TV for gaming and video playback </li></ul><ul><li>Encryption hardware for DRM and security </li></ul>[Block diagram of the OMAP2420: ARM11 + VFP; TMS320C55x DSP; 2D/3D Graphics Accelerator; Imaging & Video Accelerator (IVA); Camera I/F; LCD I/F; Video Out; Memory Controller; Internal SRAM; Security; Peripherals; L3 Interconnect; L4 Interconnect]
  7. 7. OMAP 2 Graphics Acceleration <ul><li>Provided by Imagination Technologies MBX + VGP </li></ul><ul><li>Feature set optimized for OpenGL ES 1.1 </li></ul><ul><ul><li>Full geometry pipe in hardware </li></ul></ul><ul><ul><li>Vertex programmable (~DX8 1.0, off-line assembler) </li></ul></ul><ul><ul><li>Two texture units w/ Combine, Dot3 environments </li></ul></ul><ul><ul><li>Texture compression </li></ul></ul><ul><ul><li>4x FSAA </li></ul></ul><ul><li>Leaves the ARM+VFP free for application use… </li></ul>
  8. 8. Linux SDK for OMAP 2 Architecture <ul><li>Standard offerings </li></ul><ul><ul><li>Linux 2.6 kernel </li></ul></ul><ul><ul><li>Memory protection for applications </li></ul></ul><ul><ul><li>Kdrive (lightweight X11) windowing and display </li></ul></ul><ul><ul><li>OpenGL ES 1.1 driver </li></ul></ul><ul><ul><li>OpenMAX IL for audio / video </li></ul></ul><ul><li>For more information: </li></ul><ul><ul><li>http://linux.omap.com </li></ul></ul><ul><ul><li>http://freedesktop.org/Software/Xserver </li></ul></ul><ul><li>For commercial support: </li></ul><ul><ul><li>MontaVista (http://www.mvista.com) </li></ul></ul>
  9. 9. OpenGL ES Driver <ul><li>Currently supports ES 1.0 - will ship with ES 1.1 </li></ul><ul><li>Standard extensions </li></ul><ul><ul><li>OES_matrix_palette </li></ul></ul><ul><ul><li>OES_draw_texture </li></ul></ul><ul><ul><li>OES_query_matrix </li></ul></ul><ul><li>Platform-specific extensions at vendor / carrier option </li></ul><ul><ul><li>IMG_vertex_program </li></ul></ul><ul><ul><li>IMG_texture_compress_PVRTC </li></ul></ul><ul><li>EGL support </li></ul><ul><ul><li>Window and PBuffer surfaces </li></ul></ul>
  10. 10. VGP / MBX Key Features <ul><li>Memory shared with CPU </li></ul><ul><ul><li>Memory bandwidth is a scarce resource </li></ul></ul><ul><li>Deferred rendering architecture </li></ul><ul><ul><li>Deep pipeline allows higher throughput </li></ul></ul><ul><ul><li>Pipeline stalls are very expensive </li></ul></ul><ul><li>Tiled architecture </li></ul><ul><ul><li>Color and Z-buffer are normally on chip </li></ul></ul><ul><ul><li>Reduces external memory bandwidth and power </li></ul></ul>
  11. 11. Optimizing Graphics for the Platform <ul><li>General principles </li></ul><ul><ul><li>Conserve memory bandwidth - memory is power </li></ul></ul><ul><ul><li>Consider the implications of the deep pipeline </li></ul></ul><ul><ul><li>Use standard ‘best practices’ - don’t be lazy! </li></ul></ul><ul><ul><li>Don’t use what you don’t need… </li></ul></ul><ul><ul><li>…But do take advantage of what the hardware gives you </li></ul></ul>
  12. 12. Optimizing Graphics for the Platform <ul><li>Memory bandwidth conservation </li></ul><ul><ul><li>Use small data types: short or byte </li></ul></ul><ul><ul><li>Use Vertex Buffer Objects </li></ul></ul><ul><ul><li>Use efficient primitives: indexed primitives, strips </li></ul></ul><ul><ul><li>Use MIPmaps to optimize texture cache performance </li></ul></ul><ul><ul><li>Don’t send a big texture to do a little texture’s job </li></ul></ul><ul><ul><li>Use dot3 bump mapping to substitute for model detail </li></ul></ul><ul><ul><li>Use compressed textures if supported & where they work </li></ul></ul><ul><ul><li>… But paletted textures are not recommended </li></ul></ul>
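As a rough illustration of the "use small data types" advice, the sketch below quantizes float positions to 16-bit shorts, which halves the bytes fetched per position. OpenGL ES 1.x accepts `GL_SHORT` as a `glVertexPointer` type. The helper names (`quantize_coord`, `quantize_positions`, `position_bytes`) and the choice of a symmetric `[-extent, +extent]` range are assumptions made for this example and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: map a float in [-extent, +extent] to a signed
 * 16-bit value. The scale (32767 / extent) has to be undone at draw time,
 * for example by folding the inverse scale into the modelview matrix. */
int16_t quantize_coord(float value, float extent)
{
    assert(extent > 0.0f);
    float scaled = value / extent * 32767.0f;
    if (scaled > 32767.0f)  scaled = 32767.0f;   /* clamp out-of-range input */
    if (scaled < -32767.0f) scaled = -32767.0f;
    /* Round to nearest; the cast itself truncates toward zero. */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Quantize an array of `count` float components into `dst`. */
void quantize_positions(const float *src, int16_t *dst, size_t count, float extent)
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = quantize_coord(src[i], extent);
}

/* Bytes needed for x, y, z positions at a given component size. */
size_t position_bytes(size_t vertex_count, size_t bytes_per_component)
{
    return vertex_count * 3u * bytes_per_component;
}
```

With this layout, 1000 vertices need `position_bytes(1000, sizeof(int16_t))` bytes for positions instead of `position_bytes(1000, sizeof(float))`. Whether 16 bits of precision is enough depends on the model, so treat the range and scale as tuning parameters.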
  13. 13. Optimizing Graphics for OMAP 2 <ul><li>Don’t use what you don’t need </li></ul><ul><ul><li>Know your defaults; turn off features you don’t need </li></ul></ul><ul><ul><li>Don’t draw what you know you can’t see </li></ul></ul><ul><ul><ul><li>But don’t kill yourself; MBX only draws visible fragments </li></ul></ul></ul><ul><ul><li>Don’t exceed a useful frame rate - use eglSwapInterval </li></ul></ul><ul><li>… But do use what the hardware gives you </li></ul><ul><ul><li>Z-buffering is free </li></ul></ul><ul><ul><li>Blending is free </li></ul></ul><ul><ul><li>Bilinear texturing is free </li></ul></ul><ul><ul><li>FSAA is relatively low cost </li></ul></ul><ul><ul><li>Do vertex computations in the VGP </li></ul></ul>
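The frame-rate cap mentioned above comes down to picking an interval for `eglSwapInterval`. The sketch below shows one way to derive that interval from a target frame rate. The helper name `swap_interval_for` is hypothetical, and the display refresh rate is assumed to be known to the application; the slides do not say how to obtain it.

```c
#include <assert.h>

/* Hypothetical helper: number of display refreshes to wait per buffer swap
 * so that the frame rate is capped near `target_fps`. Integer division
 * rounds the interval down, so the resulting cap is never below the target. */
int swap_interval_for(int refresh_hz, int target_fps)
{
    assert(refresh_hz > 0 && target_fps > 0);
    int interval = refresh_hz / target_fps;
    return interval < 1 ? 1 : interval;
}

/* Typical use, given an initialized EGLDisplay `display`:
 *     eglSwapInterval(display, swap_interval_for(60, 30));
 */
```

On an assumed 60 Hz panel, a 30 fps target yields an interval of 2, which leaves the GPU and memory bus idle for half of the refresh periods.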
  14. 14. Optimizing Graphics for OMAP 2 <ul><li>Use standard ‘best practices’ </li></ul><ul><ul><li>Send in large arrays to minimize number of calls </li></ul></ul><ul><ul><li>Use indexed primitives or (rarely) strips </li></ul></ul><ul><ul><li>Minimize state changes via state sorting </li></ul></ul><ul><ul><ul><li>But depth sorting (e.g. front-to-back) is not needed </li></ul></ul></ul><ul><ul><li>etc; see SIGGRAPH ‘Advanced OpenGL’ course </li></ul></ul>
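State sorting can be sketched as ordering the frame's draw records by the state they require, so that records sharing a texture and blend mode end up adjacent. The `DrawItem` record and its fields are hypothetical, and the sort key (texture first, then blend) is only one plausible ordering; a real application would rank whichever state changes are most expensive on its driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical draw record; field names are illustrative only. */
typedef struct {
    unsigned texture_id;  /* texture object the draw needs bound */
    unsigned blend_on;    /* 0 or 1 */
    unsigned mesh_id;     /* which mesh to submit */
} DrawItem;

/* Order by texture, then by blend flag; mesh_id does not affect state. */
static int compare_state(const void *a, const void *b)
{
    const DrawItem *x = a;
    const DrawItem *y = b;
    if (x->texture_id != y->texture_id)
        return x->texture_id < y->texture_id ? -1 : 1;
    if (x->blend_on != y->blend_on)
        return x->blend_on < y->blend_on ? -1 : 1;
    return 0;
}

/* Sort the frame's draw list so equal-state draws are contiguous. */
void sort_by_state(DrawItem *items, size_t count)
{
    assert(items != NULL || count == 0);
    if (count < 2)
        return;
    qsort(items, count, sizeof items[0], compare_state);
}

/* Count state switches incurred by walking the list in order. */
size_t count_state_changes(const DrawItem *items, size_t count)
{
    size_t changes = 0;
    for (size_t i = 1; i < count; ++i)
        if (compare_state(&items[i - 1], &items[i]) != 0)
            ++changes;
    return changes;
}
```

Because the slide notes that depth sorting is not needed on this hardware, the key can ignore distance from the camera entirely and focus on state alone.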
  15. 15. Implications of Deep Pipelining [Pipeline diagram. Blocks shown: Game, VGP Cmds, VGP, TA, Display List, Rasterizer, Frame Buffer. Stages: ARM writes 3D data to VGP cmd buffer; VGP performs T&L, Tile Accelerator (TA) writes display list; MBX Rasterizer reads display list and renders triangles.] <ul><li>MBX trades latency for throughput </li></ul><ul><ul><li>Rasterization is deferred until VGP finishes frame T&L </li></ul></ul><ul><ul><li>Enables hardware hidden surface removal </li></ul></ul>
  16. 16. Implications of Deep Pipelining [Pipeline diagram with a stall. Stages: ARM writes 3D data to VGP cmd buffer; VGP performs T&L, Tile Accelerator writes display list; Rasterizer reads display list and renders triangles; ARM issues a command that stalls the pipe.] <ul><li>Stalling the pipeline is expensive </li></ul><ul><ul><li>up to 50% performance hit in a balanced application </li></ul></ul>
  17. 17. Implications of Deep Pipelining <ul><li>Commands that stall the pipe (avoid them!) </li></ul><ul><ul><li>glReadPixels </li></ul></ul><ul><ul><li>glCopyTex{Sub}Image2D </li></ul></ul><ul><ul><li>glFinish / eglWaitGL </li></ul></ul><ul><li>Other Implications </li></ul><ul><ul><li>Texture update (glTexSubImage2D) is tricky </li></ul></ul><ul><ul><li>Texture cannot be updated while it is still in use </li></ul></ul><ul><ul><li>Note: texture may be ‘in use’ for longer than you think </li></ul></ul>
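One possible way to live with the "texture still in use" restriction is to rotate through a small ring of texture objects, always uploading into the one that was used longest ago. This is a sketch of that idea and not something the slides prescribe. The `TextureRing` type, its functions, and the ring depth are assumptions; the slides do not say how many frames a texture stays in use, so the depth would have to be found by experiment.

```c
#include <assert.h>
#include <stddef.h>

#define TEXTURE_RING_MAX 8

/* Hypothetical ring of texture object names (e.g. from glGenTextures). */
typedef struct {
    unsigned ids[TEXTURE_RING_MAX];
    size_t   count;  /* number of slots in use */
    size_t   next;   /* slot to hand out on the next acquire */
} TextureRing;

void texture_ring_init(TextureRing *ring, const unsigned *ids, size_t count)
{
    assert(ring != NULL && ids != NULL);
    assert(count >= 2 && count <= TEXTURE_RING_MAX);
    for (size_t i = 0; i < count; ++i)
        ring->ids[i] = ids[i];
    ring->count = count;
    ring->next = 0;
}

/* Return the texture to upload into and draw with for this update.
 * The same name is not handed out again until count - 1 other
 * acquisitions have happened in between. */
unsigned texture_ring_acquire(TextureRing *ring)
{
    unsigned id = ring->ids[ring->next];
    ring->next = (ring->next + 1) % ring->count;
    return id;
}
```

The trade-off is memory: each extra slot costs a full copy of the texture, which works against the bandwidth and footprint advice elsewhere in the deck, so the ring should be kept as shallow as testing allows.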
  18. 18. Combining 3D With Native Rendering <ul><li>So how do we mix 2D and 3D rendering? </li></ul><ul><ul><li>Standard solution uses EGL Pixmaps with eglWaitGL </li></ul></ul><ul><ul><li>This produces correct results but impacts performance </li></ul></ul><ul><li>Recommended Approach </li></ul><ul><ul><li>Wherever possible, use 3D calls to implement 2D </li></ul></ul><ul><ul><li>Lines, rectangles, image blits are fairly trivial </li></ul></ul><ul><ul><li>Complex / concave polygons a bit messy, but doable </li></ul></ul>
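To illustrate how a 2D primitive maps onto 3D calls, the sketch below builds the four corners of an axis-aligned rectangle in triangle-strip order. The helper name `rect_to_strip` is hypothetical, and the example assumes the application has set up an orthographic projection (for instance with `glOrthof`) so that vertex units correspond to pixels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write the corners of the rectangle (x, y, w, h)
 * into `out` as four 2D vertices in triangle-strip order. Vertices
 * 0-1-2 and 1-2-3 form the two triangles that cover the rectangle. */
void rect_to_strip(int16_t x, int16_t y, int16_t w, int16_t h, int16_t out[8])
{
    assert(w >= 0 && h >= 0);
    out[0] = x;                 out[1] = y;
    out[2] = (int16_t)(x + w);  out[3] = y;
    out[4] = x;                 out[5] = (int16_t)(y + h);
    out[6] = (int16_t)(x + w);  out[7] = (int16_t)(y + h);
}

/* Typical use with OpenGL ES 1.x, given a current context:
 *     int16_t verts[8];
 *     rect_to_strip(10, 20, 100, 50, verts);
 *     glEnableClientState(GL_VERTEX_ARRAY);
 *     glVertexPointer(2, GL_SHORT, 0, verts);
 *     glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
 */
```

An image blit follows the same pattern with a texture bound and a matching texture-coordinate array, which keeps all drawing on the 3D path and avoids the `eglWaitGL` synchronization that the pixmap approach needs.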
  19. 19. OMAP 2 - H4 Development Platform <ul><li>All-purpose prototyping and evaluation board </li></ul><ul><li>OMAP2420 </li></ul><ul><ul><li>ARM11 + VFP (300 MHz) </li></ul></ul><ul><ul><li>DSP / IVA (200 MHz) </li></ul></ul><ul><ul><li>SDR / DDR (100 MHz) </li></ul></ul><ul><ul><li>Imagination Technologies VGP / MBX (50 MHz) </li></ul></ul><ul><li>Keypad / QVGA display / touchscreen </li></ul><ul><li>Camera / audio / MMC </li></ul><ul><li>Bluetooth / Ethernet / serial / IrDA / USB host, client </li></ul><ul><li>JTAG HW debug port </li></ul><ul><li>Support for Linux, Symbian, WinCE </li></ul>
  20. 20. Recommended Development Setup <ul><ul><li>Boot Linux console via terminal emulator </li></ul></ul><ul><ul><li>NFS-mount your dev directory to see executables, data </li></ul></ul><ul><ul><li>Run GDB server for remote debug (or good-ol’ printf…) </li></ul></ul>[Diagram: OMAP2420 target connected to a Linux host. Console shell to terminal emulator over serial; NFS client to NFS server over Ethernet; GDB server to GDB client over Ethernet.]
  21. 21. Linux / H4 Development Tips <ul><li>Take full advantage of the Linux environment: </li></ul><ul><ul><li>Shared, fast file system </li></ul></ul><ul><ul><ul><li>Edit / compile / debug loop benefits are obvious </li></ul></ul></ul><ul><ul><ul><li>Quick turn-around on game asset tuning </li></ul></ul></ul><ul><ul><ul><li>Big log files, debug images, traces </li></ul></ul></ul><ul><ul><li>Fast networking </li></ul></ul><ul><ul><ul><li>Remote debug </li></ul></ul></ul><ul><ul><ul><li>Auxiliary input streams </li></ul></ul></ul><ul><ul><li>Be creative! (or at least intelligent) </li></ul></ul><ul><ul><ul><li>There are a lot of tools out there - use them </li></ul></ul></ul><ul><li>…But don’t fool yourself </li></ul><ul><ul><ul><li>Fielded systems will be more limited </li></ul></ul></ul>
  22. 22. Development Platform Demo
  23. 23. Any Questions?
