Droidcon2013 triangles gangolells_imagination

Presentation Transcript

  • © Imagination Technologies, www.imgtec.com, April 2013
    It’s all about triangles! Understanding the GPU in your pocket to write better code
  • Introductions
    Who? Guillem Vinals Gangolells (guillem.vinalsgangolells@imgtec.com), Developer Technology Engineer, PowerVR Graphics.
    What? It’s all about triangles! Understanding the GPU in your pocket to write better code.
  • Company overview
    Leading silicon, software & cloud IP supplier.
    Multimedia: graphics; GPU compute; video; vision.
    Communications: demodulation; connectivity; sensors.
    Processors: applications CPUs; embedded MCUs.
    Cloud: device and user management; services.
    Targeting high volume, high growth markets: top semis and OEMs for mobile, connected home consumer, automotive and more.
    Pure, our strategic product division: digital radio, internet connected audio, home automation.
    Established technology powerhouse: founded 1985; London FTSE 250 (IMG.L); ~1,500 employees; UK HQ; global operations.
    Comprehensive IP portfolio for SoCs & cloud connectivity. IP business pathfinder. Market maker/driver.
  • A Crash Course in Graphics Architectures
  • Immediate Mode Renderer (IMR)
    Buffers are kept in system memory: high bandwidth use, power consumption & latency.
    Each triangle is processed to completion in submission order: this wastes processing time, and thus power, due to “overdraw”.
    ‘Early-Z’ techniques help but are only as good as your geometry sorting.
  • Concept: Tiling
    The frame buffer is sub-divided into tiles: 32x32 pixels per tile, for example (varies by device).
    Geometry is sorted into affected tiles: this allows each tile to be processed independently.
    Small number of fragments per tile: this allows on-chip memory to be used.
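
The sorting step on this slide can be sketched in a few lines. This is only an illustration, not how PowerVR hardware does it: the function name `bin_triangles`, the bounding-box test, and the assumption of non-negative screen coordinates are all hypothetical choices made for the example.

```python
from collections import defaultdict

TILE_SIZE = 32  # pixels per tile edge; the slide notes that this varies by device


def bin_triangles(triangles, tile_size=TILE_SIZE):
    """Map each tile (tx, ty) to the indices of triangles whose bounding box touches it."""
    bins = defaultdict(list)
    for index, triangle in enumerate(triangles):
        xs = [x for x, _ in triangle]
        ys = [y for _, y in triangle]
        # Conservative test: use the screen-space bounding box of the triangle.
        first_tx, last_tx = int(min(xs)) // tile_size, int(max(xs)) // tile_size
        first_ty, last_ty = int(min(ys)) // tile_size, int(max(ys)) // tile_size
        for ty in range(first_ty, last_ty + 1):
            for tx in range(first_tx, last_tx + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)


triangles = [
    [(2, 2), (20, 4), (10, 25)],     # fits inside tile (0, 0)
    [(30, 30), (40, 30), (35, 40)],  # straddles four tiles
]
tile_lists = bin_triangles(triangles)
```

Each entry of `tile_lists` is a self-contained work item, which is what lets a tile be rasterized using only on-chip memory.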
  • Tile Based Renderer (TBR)
    Rasterizing is performed per-tile: this allows the use of fast, on-chip buffers.
    Each triangle is processed to completion in submission order: this wastes processing time, and thus power, due to “overdraw”.
    ‘Early-Z’ techniques help but are only as good as your geometry sorting.
  • Concept: Deferred Rendering
    Fragments go through a two stage process: Hidden Surface Removal (HSR), then shading.
    HSR is pixel perfect: only visible fragments pass, no ‘overdraw’.
    HSR only requires position data: less bandwidth & processing, saves power.
    HSR is submission order independent: no need for applications to submit geometry front to back.
  • Tile Based Deferred Renderer (TBDR) = PowerVR
    Rasterizing is performed per-tile: this allows the use of fast, on-chip buffers.
    Hidden Surface Removal (HSR) reduces overdraw: pixel perfect and submission order independent, no geometry sorting needed.
    Optimised to only retrieve information required (*), saving even more bandwidth.
    Saves power and bandwidth.
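
To make the overdraw argument concrete, here is a toy model that counts how many fragments get shaded for one tile. It assumes fully opaque surfaces that each cover the whole tile, and it models Early-Z as a simple depth comparison in submission order; both functions are hypothetical simplifications, not a description of real hardware.

```python
def shaded_fragments_immediate(surfaces):
    """Early-Z in submission order: shade a fragment if it is nearer than what is stored."""
    depth_buffer = {}
    shaded = 0
    for depth, pixels in surfaces:
        for pixel in pixels:
            if depth < depth_buffer.get(pixel, float("inf")):
                depth_buffer[pixel] = depth
                shaded += 1
    return shaded


def shaded_fragments_deferred(surfaces):
    """HSR first, shading second: only the nearest opaque fragment per pixel is shaded."""
    nearest = {}
    for depth, pixels in surfaces:
        for pixel in pixels:
            if depth < nearest.get(pixel, float("inf")):
                nearest[pixel] = depth
    return len(nearest)


tile = [(x, y) for y in range(32) for x in range(32)]    # one 32x32 tile
back_to_front = [(0.9, tile), (0.5, tile), (0.1, tile)]  # worst case for Early-Z
front_to_back = list(reversed(back_to_front))            # best case for Early-Z
```

With three full-tile layers, the immediate model shades 3072 fragments when the layers arrive back to front and 1024 when they arrive front to back, while the deferred model shades 1024 in either order.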
  • PowerVR Hardware Overview
  • Pipeline Summary: Geometry Processing
  • Pipeline Summary: Fragment Processing
  • Bandwidth Saving
    Bandwidth usage is the biggest contributor to GPU power consumption.
    Saving bandwidth means staying ‘on chip’ as much as possible.
    It also means throwing away work you don’t need to do.
    PowerVR is designed from the ground up to do all of these.
  • Unified Architecture
  • Pixel Back End (PBE)
    Combines sub-samples for on-chip MSAA.
    MSAA is performed per-tile, using sub-sampling: negligible impact on bandwidth, and each sub-sample benefits from HSR.
    Series5/5XT: 4x MSAA. Series6: 8x MSAA.
    Performs final format conversions: up scaling, down scaling etc. (internal TrueColour).
  • Further Considerations
  • Micro Kernel
    Specialised software running on the USSE (Series5) or its own core (Series6).
    Allows the GPU and CPU to operate with minimal synchronisation.
    Improves performance by handling interrupts on the GPU: competing solutions handle interrupts on the CPU (in the driver).
  • Multicore
    Near linear performance scaling: small fixed overhead known at design time.
    Geometry processing is load-balanced: cores share the processing effort.
    Tiling enables parallel fragment processing: any core can work on any tile when available, and each tile is self-contained.
    Multi-core logic is handled by the hardware: completely transparent to the developer.
  • Alpha Blending
    Tiling GPUs don’t need to reach into system memory to perform an alpha blend: the colour buffer is on-chip.
    This means that alpha blending doesn’t cost you any additional bandwidth.
    It also means that alpha blending is fast… very fast.
    HSR will also save you some work by throwing away occluded blending work.
    Remember: Opaque, Alpha Test, Alpha Blend.
  • Golden Rules
  • Common Bottlenecks
    Based on past observation, from most likely to least likely:
    CPU Usage; Bandwidth Usage; CPU/GPU Synchronisation; Fragment Shader Instructions; Geometry Upload; Texture Upload; Vertex Shader Instructions; Geometry Complexity.
  • Warning!
    Some of these rules may seem obvious to you… we still see them broken every day… if you know them, please bear with us.
  • Golden Rule 1: Understand Your Target Device
    No two devices are identical, even when they look the same.
    Different SoCs will have different bottlenecks: make sure you test against different chips.
    Make sure you understand the hardware: you don’t want your optimisation to make things worse.
    Clearly, you’re already doing this… you’re here.
  • Golden Rule 2: Don’t Waste GPU Time
    The Principle of “Good Enough”.
    Don’t waste polygons on un-needed detail.
    Textures should never be much larger than their size on screen: why waste time loading a 1Kx1K texture if it’s never going to appear bigger than 128x128?
    If the user won’t notice it, don’t waste time processing it.
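
The 1Kx1K versus 128x128 example works out as follows. The helper `suggested_texture_size` is hypothetical, and rounding up to a power of two is an assumption made for the sketch, not something the slide prescribes.

```python
def suggested_texture_size(max_on_screen_pixels):
    """Smallest power-of-two edge length that still covers the largest on-screen size."""
    size = 1
    while size < max_on_screen_pixels:
        size *= 2
    return size


authored = 1024                       # the 1Kx1K texture from the slide
needed = suggested_texture_size(128)  # largest size it ever appears on screen
wasted_factor = (authored * authored) // (needed * needed)
```

Here `needed` is 128 and `wasted_factor` is 64: the 1Kx1K texture carries 64 times as many texels as the screen can ever show.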
  • Golden Rule 3: Promote Calculations up The Chain
    Don’t do a calculation you don’t need to do.
    If you can do it once per scene, do it once per scene.
    If you can’t, try and do it per vertex: there are generally fewer vertices in a scene than fragments.
    If you can, pre-bake, e.g. lighting.
    Remember, ‘Good Enough’.
  • Golden Rule 4: Don’t Access an Active Render Target
    Accessing a render target from the CPU is very bad for performance.
    If it’s not done properly it will synchronise the GPU and CPU… This is Bad™.
  • Golden Rule 4 Cont.: Accessing Render Targets Safely
    Use EGL_KHR_fence_sync.
    Use CPU side handles to GPU mapped memory to avoid blocking calls, e.g. GraphicBuffer (or gralloc) on Android.
  • Golden Rule 5: Avoid Updating Active Assets
    Assets may need to stay the same for multiple frames: we refer to this as an asset’s ‘Lifespan’.
    Changing a texture during its lifespan may cause ‘Ghosting’.
    Changing a buffer during its lifespan is blocking.
    This can be managed using circular buffers, similarly to render targets.
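
A circular buffer for a dynamic asset can be sketched as below. The class name `BufferRing` is hypothetical, the buffers are plain Python values standing in for GPU resources, and the choice of three copies is an assumption: the number actually needed depends on how many frames the driver keeps in flight.

```python
class BufferRing:
    """Rotate between several copies of a dynamic asset so an in-flight copy is never overwritten."""

    def __init__(self, copies=3):
        self.buffers = [None] * copies  # placeholders for GPU buffer contents
        self.frame = 0

    def write(self, data):
        """Store this frame's data in the next slot and return the slot to draw from."""
        slot = self.frame % len(self.buffers)
        self.buffers[slot] = data
        self.frame += 1
        return slot


ring = BufferRing(copies=3)
slots = [ring.write(f"frame {i}") for i in range(5)]
```

Each slot is only reused after the other copies have been written, so a copy has the rest of the ring's length as its lifespan before it is touched again.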
  • Golden Rule 6: Use VBOs and Indexed Geometry
    VBOs benefit from driver level optimisations; Vertex Array Objects (VAOs) may be even better.
    Index your geometry: it makes your data smaller, and it also benefits from driver level optimisations.
    Use static VBOs ideally, and consider the asset’s lifespan.
    Don’t use a VBO for dynamic data.
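
As a minimal illustration of why indexing makes the data smaller, the sketch below deduplicates a vertex list into a unique vertex array plus an index array. The function name `index_geometry` and the 2D quad are made up for the example; real vertices would also carry normals, UVs and so on.

```python
def index_geometry(vertices):
    """Return (unique_vertices, indices) such that unique_vertices[i] rebuilds the input."""
    unique = []
    indices = []
    lookup = {}
    for vertex in vertices:
        if vertex not in lookup:
            lookup[vertex] = len(unique)
            unique.append(vertex)
        indices.append(lookup[vertex])
    return unique, indices


# A quad drawn as two triangles: six vertices, two of them repeated.
quad = [(0, 0), (1, 0), (1, 1), (0, 0), (1, 1), (0, 1)]
unique, indices = index_geometry(quad)
```

The six submitted vertices collapse to four unique ones; the saving grows with vertex size and with how many triangles share each vertex.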
  • Golden Rule 7: Batch Your Draw Calls
    Group static objects, and draw once: static objects are objects that are static relative to each other.
    Sort objects by render state, with emphasis on texture and program state changes.
    Try using texture atlases: remember Golden Rule 5 if you’re going to update the contents.
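
The effect of sorting by render state can be shown by counting program and texture switches before and after sorting. The `DrawCall` tuple and the asset names are hypothetical, and the sketch assumes opaque objects whose draw order is free to change.

```python
from collections import namedtuple

DrawCall = namedtuple("DrawCall", ["program", "texture", "mesh"])


def count_state_changes(draw_calls):
    """Count how many times the bound program or texture has to change."""
    changes = 0
    current_program = current_texture = None
    for call in draw_calls:
        if call.program != current_program:
            changes += 1
            current_program = call.program
        if call.texture != current_texture:
            changes += 1
            current_texture = call.texture
    return changes


def sort_by_state(draw_calls):
    """Group calls that share a program, then a texture."""
    return sorted(draw_calls, key=lambda call: (call.program, call.texture))


frame = [
    DrawCall("lit", "brick", "wall_a"),
    DrawCall("unlit", "sky", "dome"),
    DrawCall("lit", "wood", "crate"),
    DrawCall("lit", "brick", "wall_b"),
]
```

In submission order this frame needs 7 state changes; after `sort_by_state` it needs 5, and the two brick walls end up adjacent, which also makes them candidates for a single batched draw.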
  • Golden Rule 8: Compress Your Textures
    The lower the bitrate, the less bandwidth consumed.
    Use PVRTC & PVRTC2, at 2 & 4bpp RGB/RGBA.
    Don’t confuse this with PNG or JPG, which are decompressed in memory, usually to 24bpp or 32bpp.
    PVRTC is read directly from the compressed form: it stays in memory at 2bpp or 4bpp.
    Use MIP-Mapping and remember ‘Good Enough’.
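
The bit rates on this slide translate into memory footprints as follows. This is plain arithmetic for the top mip level only; the 1024x1024 size and the format labels are assumptions chosen for the example.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of one mip level at the given bit rate."""
    return width * height * bits_per_pixel // 8


BITS_PER_PIXEL = {
    "uncompressed 32bpp": 32,
    "uncompressed 24bpp": 24,
    "PVRTC 4bpp": 4,
    "PVRTC 2bpp": 2,
}

sizes = {name: texture_bytes(1024, 1024, bpp) for name, bpp in BITS_PER_PIXEL.items()}
for name, size in sizes.items():
    print(f"{name:20s} {size // 1024:5d} KiB")
```

For this texture the 32bpp form takes 4096 KiB while PVRTC at 4bpp takes 512 KiB and at 2bpp 256 KiB, so every texture fetch moves 8 to 16 times less data.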
  • Golden Rule 9: Alpha Test/Discard & Alpha Blend
    Alpha Test removes the advantages of ‘Early-Z’ techniques and HSR: fragment visibility isn’t known until the fragment shader is run.
    Prefer Alpha Blending, and render in the order Opaque, Alpha Test, Alpha Blend: this makes best use of HSR.
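
The recommended submission order can be enforced with a single stable sort. The category names and the scene contents are hypothetical; the sketch keeps the original relative order within each category.

```python
PASS_ORDER = {"opaque": 0, "alpha_test": 1, "alpha_blend": 2}


def order_for_hsr(objects):
    """Sort (name, category) pairs into Opaque, Alpha Test, Alpha Blend order."""
    return sorted(objects, key=lambda obj: PASS_ORDER[obj[1]])


scene = [
    ("window", "alpha_blend"),
    ("terrain", "opaque"),
    ("fence", "alpha_test"),
    ("rock", "opaque"),
]
ordered = order_for_hsr(scene)
```

After sorting, `terrain` and `rock` are submitted first, then `fence`, then `window`, so as much opaque geometry as possible is already known to HSR before the harder cases arrive.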
  • Golden Rule 10: Use ‘Clear’ and ‘DiscardFrameBuffer’
    Calling ‘Clear’ ensures the previous render isn’t uploaded to the GPU.
    By default, the depth/stencil buffers are written to memory at the end of a render.
    Calling DiscardFrameBufferExt(…) ensures these buffers aren’t written to system memory: look for the ‘GL_EXT_discard_framebuffer’ extension.
    Do both if you can!
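
A rough estimate shows how much write traffic the depth/stencil write-back can cost. The numbers are assumptions picked for illustration: a 1280x720 render target, a packed 32-bit depth/stencil format, and 60 frames per second.

```python
def depth_stencil_writeback_bytes(width, height, bytes_per_pixel=4, frames_per_second=60):
    """Return (bytes per frame, bytes per second) written if the buffer is not discarded."""
    per_frame = width * height * bytes_per_pixel
    return per_frame, per_frame * frames_per_second


per_frame, per_second = depth_stencil_writeback_bytes(1280, 720)
```

Under these assumptions the write-back is 3,686,400 bytes per frame, or about 221 MB every second, for data the application usually never reads again.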
  • Questions?
    Or drop us an email: devtech@imgtec.com
    Download our PowerVR SDK: bit.ly/PVR_SDK
    Also, you can download examples, tools and shell as an Android SDK add-on: http://install.powervrinsider.com/androidsdk.xml