LCE13: Android Hwcomposer on KMS


Resource: LCE13
Name: Android Hwcomposer on KMS
Date: 11-07-2013
Speaker: Ross Oldfield

Published in: Technology

  1. Android hwcomposer on KMS
     LCE-13, Graphics WG
     [Slide background: group photograph at Linaro Connect in Copenhagen, Monday 29 Oct 2012]
  2. What?
     • Attempt to have a common display backend for both Linux and Android. This would have several advantages:
       • Common Android display HAL implementation for multiple SoCs
       • Reduce the amount of work to get an SoC’s display going?
     • KMS ought to be a decent fit, surely?
       • KMS has “planes” and Android has “layers”
       • Things get synchronised to the vertical blanking
       • Android can always fall back to GLES composition if KMS can’t support a particular layer
  3. Android SurfaceFlinger
     • Android displays things using SurfaceFlinger
     • It can composite multiple layers together
       • using OpenGL ES
       • or using hwcomposer
     • GLES composition will always be possible
       • Android requires that GLES be present
     • Using SoC composition hardware will be
       • Lower power
       • Higher performance
  4. Why?
     • Why would we want this?
     • To get development platforms up and running
       • just want to be able to see the UI
       • no special changes needed to go from a Linux system to an Android one (to get the display going...)
     • High-performance HAL suitable for a production platform
       • Use any hardware overlay support
       • Reduce reliance on the GPU (let it render the UI!)
  5. hwcomposer HAL
     • Most of the behaviour is described in the header
     • Broadly three distinct functions
       • Modesetting, handling display hotplug, etc.
       • Displaying the output of any GLES composition
       • Offloading layer composition to dedicated hardware
  6. [Diagram labels: hwcomposer, Display, Layers, Buffers; Blending mode, Plane Alpha, Transform (Rotation), Acquire & Release Fences, Source & Dest Windows, Hints; Pixel Format, Dimensions]
  7. Prepare and Set
     • prepare() chooses which layers are to be handled by
       • Hardware (layer type is set to HWC_OVERLAY)
       • GLES (layer type is set to HWC_FRAMEBUFFER)
     • Potentially many calls to prepare() per display/composition cycle
     • set() consumes layers of type
       • HWC_BACKGROUND
       • HWC_FRAMEBUFFER_TARGET
       • HWC_OVERLAY
     [Diagram: prepare(), then GLES composition, then set()]
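The prepare() decision on this slide can be sketched roughly as follows. This is a simplified illustration, not the HAL's actual code: the enum, the struct and the `hw_can_scan_out` flag are hypothetical stand-ins for the real hwcomposer types, and the policy (first come, first served, up to a fixed number of planes) is only one possible choice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the HAL's composition types; the real
 * definitions live in Android's hwcomposer header. */
enum comp_type { COMP_FRAMEBUFFER, COMP_OVERLAY, COMP_FRAMEBUFFER_TARGET };

struct layer {
    enum comp_type type;
    bool hw_can_scan_out;   /* assumption: result of a driver capability check */
};

/* Claim up to max_planes layers for hardware and leave the rest to GLES.
 * Returns the number of layers that still need GLES composition. */
size_t sketch_prepare(struct layer *layers, size_t n, size_t max_planes)
{
    size_t used = 0, gles = 0;

    for (size_t i = 0; i < n; i++) {
        if (layers[i].type == COMP_FRAMEBUFFER_TARGET)
            continue;       /* carries the GLES output; never reclassified */
        if (layers[i].hw_can_scan_out && used < max_planes) {
            layers[i].type = COMP_OVERLAY;
            used++;
        } else {
            layers[i].type = COMP_FRAMEBUFFER;
            gles++;
        }
    }
    return gles;
}
```

A real implementation would also need to reserve a plane for the framebuffer target whenever the return value is non-zero, and to respect layer z-order; both are left out here.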
  8. Layer Attributes
     • Blending mode
       • Are RGBA pixels pre-multiplied by alpha or not?
     • Plane alpha
       • applied to the whole layer
     • Transform
       • Flip or rotate (clockwise) the source layer
     • Source and destination windows
       • “scaling”
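To make the difference between the two blending modes concrete, here is a minimal per-channel sketch using the usual "source over" formulas. The function name and the float representation (values in the range 0 to 1) are assumptions made for illustration; display hardware typically works on fixed-point values.

```c
/* Blend one colour channel of a source layer over a destination.
 * All values are in [0, 1]. plane_alpha scales the whole layer. */
float blend_channel(float src, float src_alpha, float dst,
                    float plane_alpha, int premultiplied)
{
    /* Effective coverage of the source pixel. */
    float a = src_alpha * plane_alpha;

    /* Pre-multiplied colour has already been scaled by src_alpha,
     * so only the plane alpha is still to be applied. */
    float s = premultiplied ? src * plane_alpha : src * a;

    return s + dst * (1.0f - a);
}
```

For example, a pre-multiplied channel value of 0.5 with alpha 0.5 over a destination of 1.0 gives 0.5 + 1.0 × 0.5 = 1.0, whereas treating the same pixel as non-pre-multiplied would darken it to 0.75.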
  9. Brief Digression: Explicit Sync
     • Every layer has an acquire and a release fence
     • Each array of layers has a retire fence
     • Acquire fence
       • “don't show this buffer until the fence is signalled”
     • Release fence
       • “the buffer for this layer is no longer being read”
     • Retire fence
       • “this scene / array of layers is no longer being shown”
  10. Fences
      • Used to signal read/write ownership of the buffer
      [Diagram: a buffer passes between GPU and Display. Buffer owned by GPU; signal acquire fence; buffer owned by Display; signal release fence; buffer owned by GPU.]
  11. Acquire Fences
      • Normally a pageflip would cause the hardware to be set up
      • The next buffer is flipped to when the VSYNC occurs
      [Timeline diagram: A is showing. Pageflip B: program hardware, show B next; B appears at the following VSYNC. Pageflip C: program hardware, show C next; C appears at the VSYNC after that.]
  12. Acquire Fences
      • Acquire fence must be signalled before the h/w is set up
        • Otherwise the buffer might be read before its pixels are valid!
      • If registers are set asynchronously
        • How can you signal any error status back to the pageflip?
      [Timeline diagram: A is showing. Pageflip B with a fence: remember B. Fence cb(): set h/w to B; B appears at the following VSYNC. Pageflip C with a fence: remember C. Fence cb(): set h/w to C; C appears at the VSYNC after that.]
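The "remember, then program on fence callback, then show at VSYNC" sequence from this slide can be modelled as a tiny state machine. This is only an illustrative model with hypothetical names; it does not touch any hardware and is not taken from a KMS driver.

```c
/* Hypothetical model of one display pipe; buffer ids are non-negative ints. */
#define NO_BUFFER (-1)

struct pipe_state {
    int remembered;  /* flipped to, still waiting for its acquire fence */
    int programmed;  /* written to the hardware, shown at the next VSYNC */
    int showing;     /* currently being scanned out */
};

/* Pageflip request: only remember the buffer, do not touch the hardware. */
void sketch_pageflip(struct pipe_state *p, int buf)
{
    p->remembered = buf;
}

/* Acquire fence callback: the pixels are valid, so program the hardware. */
void sketch_fence_cb(struct pipe_state *p)
{
    if (p->remembered != NO_BUFFER) {
        p->programmed = p->remembered;
        p->remembered = NO_BUFFER;
    }
}

/* VSYNC: the hardware latches whatever was programmed, if anything. */
void sketch_vsync(struct pipe_state *p)
{
    if (p->programmed != NO_BUFFER) {
        p->showing = p->programmed;
        p->programmed = NO_BUFFER;
    }
}
```

Note that sketch_pageflip() returns before the hardware is touched, which is exactly why reporting an error back to the caller becomes awkward: any failure would only be discovered later, inside the fence callback.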
  13. Release Fences
      • Signal when a buffer is no longer being read
      • For a display controller, a buffer is no longer being read…
        • after the next scene gets set()
        • … and then after the following VSYNC
      • All the buffers get their release fences signalled at the same time
        • the sync timeline increments at the VSYNC that follows a pageflip
        • every fence gets the same sync_pt, set to the current timeline position + 1
      • Each buffer has its own fence
        • potentially lots of file handles to open and return
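The timeline arithmetic described on this slide can be sketched with a plain counter. The types and function names below are hypothetical and only model the idea; real Android sync fences are kernel objects exposed as file descriptors, and a real comparison would also need to cope with counter wrap-around.

```c
#include <stdbool.h>

/* Hypothetical model of a sync timeline: a monotonically increasing counter. */
struct timeline { unsigned value; };

/* A fence here is just the timeline value it is waiting for. */
struct fence { unsigned target; };

/* At set(): each buffer of the scene gets its own fence, but all of them
 * wait for the same point, the current timeline position + 1. */
struct fence release_fence_create(const struct timeline *tl)
{
    struct fence f = { tl->value + 1 };
    return f;
}

/* Called at the VSYNC that follows a pageflip. */
void timeline_advance(struct timeline *tl)
{
    tl->value++;
}

bool fence_signalled(const struct timeline *tl, struct fence f)
{
    return tl->value >= f.target;
}
```

Because every fence created between two increments shares one target value, a single timeline_advance() signals all of them together, which matches the "all at the same time" behaviour on the slide.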
  14. Release Fences
      • The release fence for buffer A is signalled
        • for every scene it was in that has been replaced
        • even if the next scene is also going to read from A
      [Diagram, reconstructed as a table; a VSYNC separates each scene from the next]
      Scene  Buffers  Fence to signal at the VSYNC after the scene is replaced
      X      A, B     Release X A, Release X B
      Y      A, C     Release Y A, Release Y C
      Z      A, C     (none shown)
  15. Requirements
      • We must be able to
        • modeset a display
        • handle the VSYNC event from a display
        • map the memory backing a GLES-composited framebuffer
        • “wait” until the acquire fence is signalled before actually presenting anything to the screen
        • signal release and retire fences after the next pageflip completes
      • We would like to
        • show YUV and RGBA layers as planes
        • not have to cover the screen in buffers (show the background colour)
        • handle all the layer attributes
          • per-plane alpha
          • non-pre-multiplied and pre-multiplied alpha
  16. How?
      • Using KMS gets us all the modesetting goodness
      • This leaves
        • Mapping buffers
        • Handling sync (especially the acquire fences)
        • Handling layers & planes
  17. Buffers
      • Given a layer with a native_buffer_t
        • Usage hint of GRALLOC_USAGE_HW_COMPOSER
        • Gralloc is responsible for ensuring the buffer’s memory is suitable for the display controller
      • Want
        • A dma_buf file descriptor
        • Dimensions and pitch of the image
        • Pixel format 4CC
        • Any other metadata
          • maybe YUV luma range?
  18. Buffers & Pixel Formats
      • Funny pixel formats
        • Maybe your video decoder generates some proprietary tiled format?
        • HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED
      • Need a platform-specific routine to generate a 4CC
        • This should live in gralloc
        • It is already platform specific
      • Want to add gralloc extensions in gralloc_perform()
        • return a dma_buf
        • Any other metadata we need
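A routine that turns an Android pixel format into a 4CC might look something like the sketch below. The `FOURCC` macro uses the same byte packing as the kernel's `fourcc_code()`; the `FMT_*` constants are local stand-ins whose values are meant to mirror Android's `HAL_PIXEL_FORMAT_*` constants and should be checked against the platform headers. The function name is hypothetical, and the talk does not specify what the real gralloc extension would look like.

```c
#include <stdint.h>

/* Four ASCII characters packed little-endian, as in the DRM 4CC scheme. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Local stand-ins for Android HAL pixel formats (assumed values). */
enum {
    FMT_RGBA_8888 = 1,
    FMT_RGBX_8888 = 2,
    FMT_RGB_565   = 4,
    FMT_BGRA_8888 = 5
};

/* Returns a DRM-style 4CC, or 0 if the format needs platform-specific help. */
uint32_t hal_format_to_fourcc(int hal_format)
{
    switch (hal_format) {
    case FMT_RGBA_8888: return FOURCC('A', 'B', '2', '4'); /* bytes R,G,B,A in memory */
    case FMT_RGBX_8888: return FOURCC('X', 'B', '2', '4'); /* bytes R,G,B,X in memory */
    case FMT_RGB_565:   return FOURCC('R', 'G', '1', '6');
    case FMT_BGRA_8888: return FOURCC('A', 'R', '2', '4'); /* bytes B,G,R,A in memory */
    default:            return 0; /* e.g. implementation-defined or tiled formats */
    }
}
```

The `default` branch is where the platform-specific part would go: an implementation-defined format can only be resolved by code that knows what the allocator actually chose.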
  19. Sync: wait in user space?
      • Certainly possible
        • Would block set() until all buffers have been acquired
      • Good news
        • Would work without having to add anything to KMS drivers
      • Bad news
        • A bit of extra scheduling overhead to set up your screen
        • Pageflip is back to being serialised behind content generation
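A user-space wait could be sketched as below, under the assumption that each acquire fence is a file descriptor that becomes readable once the fence has signalled, and that a negative descriptor means "no fence". The function name is hypothetical; on Android the platform's own sync library would normally be used rather than calling poll() directly.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>

/* Block until every fence fd in fds[] has signalled.
 * timeout_ms applies to each fence separately (-1 waits forever).
 * Returns 0 on success, -1 on timeout or error. */
int wait_for_acquire_fences(const int *fds, size_t n, int timeout_ms)
{
    for (size_t i = 0; i < n; i++) {
        if (fds[i] < 0)
            continue;               /* no fence: buffer is already usable */

        struct pollfd p = { .fd = fds[i], .events = POLLIN };
        int r;

        do {
            r = poll(&p, 1, timeout_ms);
        } while (r < 0 && errno == EINTR);  /* retry if interrupted by a signal */

        if (r <= 0 || (p.revents & (POLLERR | POLLNVAL)))
            return -1;              /* timed out, poll failed, or bad fd */
    }
    return 0;
}
```

set() would call this before issuing the pageflip, which is what serialises the flip behind content generation: nothing is queued to the display until the slowest producer has finished.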
  20. Sync: handle in the kernel
      • Would require some sort of ‘delayed pageflip’
        • Defer the actual pageflip work until after the acquire fence has signalled
      • Issues with accurate error reporting
        • we care that our pageflip will actually succeed!
      • Some sort of new API into KMS?
        • Use Android fences or dmabuf fences?
        • Ought to be common but would impact a lot of drivers?
        • Testing?
      [Timeline diagram: A is showing. Pageflip B with a fence: remember B. Fence cb(): set h/w to B; B appears at the following VSYNC. Pageflip C with a fence: remember C. Fence cb(): set h/w to C; C appears at the VSYNC after that.]
  21. Layers
      • For each new composition scene we get
        • Lots of layers
        • Lots of attributes per layer
          • Rotation, alpha, etc.
        • Lots of release fences to create
        • Acquire fences to deal with
      • Want to deal with them all at once
        • SurfaceFlinger always specifies the entire scene for each display every time it changes any part of the composition scene
      • Should we be looking at adopting one of the ‘atomic modeset’ patches to KMS for this?
        • Must support scenes with no ‘fullscreen CRTC framebuffer’
        • Opportunity to add some of this ‘delayed pageflip’ behaviour?
  22. Summary
      • What to do is strongly influenced by how we want to handle sync
      • Broadly three options

      #  Fences                     Pro                                                 Con
      1  Userland synchronous wait  Correct behaviour for SoC bring-up; no KMS changes  GPU composition only; more likely to ‘jank’
      2  Kernel async wait          Less likely to ‘jank’ than (1)                      GPU composition only; KMS API additions
      3  Kernel async wait          Uses SoC composition hardware; lower power          Large KMS changes (atomic pageflip?)
  23. Thank you