This talk is about accelerating web graphics performance with ozone-gbm on Intel-based Linux desktop systems.
You can find Google Docs slides at
https://docs.google.com/presentation/d/1o-a-DV43SnPPeyQodeMdbIqA05bRTNpZ3uidP2CBYeo/edit#slide=id.g38a9ffee37_0_0
Accelerate graphics performance with ozone-gbm on Intel based Linux desktop systems
1. Accelerate graphics performance with ozone-gbm
on Intel based Linux systems
BlinkOn 9 (bit.ly/blinkon9-info), Apr 18-19, 2018
Joone Hur (joone.hur@intel.com, joone@chromium.org)
Intel Open Source Technology Center
2. Intel Confidential — Do Not Forward
Agenda
▪ Overview
▪ New graphics architecture in CrOS
▪ Ozone & Ozone-gbm
▪ History of running ozone-gbm on Linux systems
▪ Why ozone-gbm for Linux systems?
▪ Why ozone-gbm on Intel GPU?
▪ Intel Graphics features: zero-copy texture upload, video decoding, and
hardware overlay
▪ Upstream status
▪ Enabling ozone-gbm on Yocto Linux and Arch Linux
▪ Demo
3. Overview
• Intel-based Chromebooks have enabled graphics acceleration features through ozone-gbm:
• zero-copy texture upload, hardware overlay/atomic page flip, and video encoding/decoding.
• These features enable Intel Chromebooks to achieve better performance, save system memory, and extend battery life.
• The goal is to make Linux-based systems run on ozone-gbm:
• DTVs, digital signage devices, IVI systems, etc.
4. New graphics architecture in CrOS (2015)
• Intel-based Chromebooks were updated to the new freon graphics stack (ozone-gbm) in 2015.
Old stack (top to bottom): ChromeOS UI, View, Aura, Xlib, X-Server, DRM/KMS, i915 device driver
New stack (top to bottom): ChromeOS UI, View, Aura, Ozone-gbm, DRM/KMS, i915 device driver
5. Ozone
• Platform abstraction layer for:
• Accelerated surfaces
• Low-level input and event handling
• Allows Chromium to run on embedded SoC targets and on new windowing systems such as Wayland or Mir.
• Ozone backends:
• X11, Wayland, and GBM.
[Diagram, layers top to bottom: Chrome/Content; Aura/View; ozone; ozone implementations (X11, Wayland, GBM); X11 / Wayland / DRM/KMS]
6. Ozone-gbm
● It allows Chromium to talk directly to DRM/KMS and evdev:
○ Allocate a buffer in GPU memory that can be accessed by a render process.
○ Change the display mode (mode-setting).
○ Align events from evdev and painting with the vsync signal from the GPU.
● It enables Intel graphics features:
○ Zero-copy texture upload, hardware overlay/atomic page flip, and video encoding/decoding.
[Diagram: Chromium (GpuMemoryBuffer, ozone-gbm) sits on Mesa, mini-gbm, and libdrm; the Linux kernel provides evdev (events), DRM/KMS, GEM, and the i915 driver; the Intel GPU is underneath.]
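Because ozone-gbm opens DRM/KMS and evdev devices directly, one quick sanity check before launching is to confirm that the kernel exposes the device nodes. The following is a minimal sketch and not part of the original slides: it assumes the usual Linux locations (/dev/dri/card*, /dev/dri/renderD*, /dev/input/event*), and the function names count_nodes and report_gbm_nodes are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the entries in DIR whose name starts with PREFIX.
# Usage: count_nodes DIR PREFIX
count_nodes() {
    dir=$1
    prefix=$2
    n=0
    for node in "$dir"/"$prefix"*; do
        # An unmatched glob stays literal, so check that the path exists.
        if [ -e "$node" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical helper: report the nodes an ozone-gbm build would need to open.
# The default directories are assumptions about a typical Linux system.
report_gbm_nodes() {
    dri=${1:-/dev/dri}
    input=${2:-/dev/input}
    echo "DRM/KMS card nodes: $(count_nodes "$dri" card)"
    echo "DRM render nodes:   $(count_nodes "$dri" renderD)"
    echo "evdev event nodes:  $(count_nodes "$input" event)"
}
```

Calling report_gbm_nodes on the target machine prints the three counts; a count of zero for card nodes would suggest that no DRM driver (for example i915) is loaded.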
7. History
• Started working on ozone-gbm support for Linux desktop in 2014
• Published a technical article about ozone-gbm:
https://software.intel.com/en-us/blogs/2014/10/23/chromium-ozone-gbm-explained
• Patches are available:
https://github.com/kalyankondapally/Chromium-OzoneGBM
8. History
• Announced ozone-gbm for Linux systems on ozone-dev in October 2017
9. Why ozone-gbm for Linux systems?
• Chromium is a de facto standard for Linux-based devices.
• Those devices are limited in their use of CSS animations, video playback, and WebGL.
• Ozone-gbm is highly optimized for Intel-based Chromebooks.
• It could also work on regular Linux. o/
10. Why ozone-gbm on Intel GPU?
• Ozone-gbm makes it possible to use modern graphics features:
• Hardware compositor, atomic page flip, hardware overlay, and video encode/decode
• Chromium has enabled the following graphics features:
• Zero-copy texture uploads for 2D rendering
• Video encoding/decoding for HTML5 video and WebRTC
• Hardware overlay/atomic page flip for video, ARC++, and WebGL.
• These features have been commercially verified on Intel-based Chromebooks.
ARC++ is the Android app runtime for ChromeOS
Google Pixel (7th i5m, $999)
Samsung Chromebook Pro (Core M, $509)
ASUS C202SA (Celeron, $218)
HP Chromebook 14 (Celeron, $219)
11. Intel graphics features for Chromium
• Zero-copy texture uploads for 2D rendering
• Video encoding/decoding for HTML5 video and WebRTC
• Hardware overlay for video, ARC++, and WebGL.
ARC++ is the Android app runtime for ChromeOS
12. Intel graphics features: zero-copy texture upload
• Normally, one copy is needed to upload a bitmap to GPU memory as a texture.
• On Intel SoCs, the CPU and GPU share the same physical memory.
• This lets the render process paint content directly into an imported GPU buffer, so the copy is avoided.
https://software.intel.com/en-us/articles/zero-copy-texture-uploads-in-chrome-os
13. Workload test (zero-copy texture upload)
• Zero-copy texture upload is about 4 times faster than the software fallback in certain cases. Usually, it is 30~40% faster (link).
Test page | software fallback | zero-copy texture upload | speedup
http://browsertests.herokuapp.com/perf/background_color_animation.html | 5.3 fps | 22.3 fps | 4.2X faster
https://codepen.io/cubix4u/pen/KXpKRe | 12.2 fps | 48.2 fps | 3.95X faster
Intel Compute Stick X5-Z8300 (Cherry Trail, Gen8 graphics, GPU RAM 512MB)
Disclaimer: Estimate only. Not official benchmark results.
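The speedup column is simply the accelerated frame rate divided by the fallback frame rate. As a small worked check of the figures above (a sketch only; the helper name speedup is hypothetical and not from the slides):

```shell
#!/bin/sh
# Hypothetical helper: ratio of accelerated fps to fallback fps, two decimals.
# Usage: speedup FALLBACK_FPS ACCELERATED_FPS
speedup() {
    LC_ALL=C awk -v base="$1" -v fast="$2" \
        'BEGIN { printf "%.2f\n", fast / base }'
}

speedup 5.3 22.3    # background_color_animation.html: prints 4.21
speedup 12.2 48.2   # codepen.io/cubix4u/pen/KXpKRe:   prints 3.95
```

Both results agree with the rounded 4.2X and 3.95X values quoted in the table.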
14. Memory consumption (zero-copy texture upload)
• Zero-copy texture upload also reduces memory consumption and saves power.
• On ChromeOS, GPU process memory consumption is about 65% lower with zero-copy than with the software fallback. In the renderer process, memory consumption is about 20% lower with zero-copy.
Test page | software fallback | zero-copy texture upload | difference
http://browsertests.herokuapp.com/perf/background_color_animation.html | 80 MB | 48 MB | 40% lower
Intel Compute Stick X5-Z8300 (Cherry Trail, Gen8 graphics, GPU RAM 512MB)
Disclaimer: Estimate only. Not official benchmark results.
15. Intel graphics features: video/image accelerations
Codec | Braswell / Cherry Trail (Atom, Gen8) | Skylake (6th, Gen9) | Apollo Lake (Atom, Gen9) | Kaby (7th) / Gemini / Coffee Lake (8th) (Gen9.5)
H.264 | Yes | Yes | Yes | Yes
VC-1 | Decode only | Decode only | Decode only | Decode only
JPEG | Yes | Yes | Yes | Yes
VP8 | Yes | Yes | Yes | Yes
HEVC | Decode only | Yes | Yes | Yes
HEVC 10-bit | No | No | Decode only (8K) | Yes
VP9 | No | No | Decode only (4K) | Yes
https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video
16. Workload test: video decoding, Intel Compute Stick (Atom)
• HW-accelerated video decoding is 1.7~3.6x faster, which allows playing higher-resolution videos (2K, 4K).
• Zero-copy texture upload also helps with software decoding by removing a copy operation between the CPU and the GPU.
Test page | software fallback | hardware accelerated (video decoding, zero-copy texture upload) | speedup
H.264 video 2K (Vimeo video) | 13.6 fps | 24 fps | 1.7x higher
H.264 video 4K (Vimeo video) | 5.7 fps | 20.7 fps | 3.6x higher
VP9 1K (YouTube) | 30 fps | 30 fps | -
VP9 2K (YouTube) | video keeps pausing | 25 fps, slightly pausing | -
* Intel Compute Stick X5-Z8300 (Cherry Trail, Gen8 graphics, GPU RAM 512MB)
* VP9 uses software decoding
Disclaimer: Estimate only. Not official benchmark results.
17. Workload test: video decoding, Skylake (Core 6th gen)
• HW-accelerated video decoding is 1.1~1.7x faster, which allows playing higher-resolution videos (4K).
• Zero-copy texture upload also helps with software decoding by removing a copy operation between the CPU and the GPU.
Test page | software fallback | hardware accelerated (video decoding, zero-copy texture upload) | speedup
H.264 video 2K (Vimeo video) | 25 fps | 39.2 fps | 1.5x higher
H.264 video 4K (Vimeo video) | 24 fps | 27.3 fps | 1.1x higher
VP9 1K (YouTube) | 58.4 fps | 58.4 fps | -
VP9 2K (YouTube) | 23.8 fps, frequently pausing | 42.6 fps, slightly pausing | 1.7x
* Intel Compute Stick STK2m364CC (Skylake, Gen9 graphics, GPU RAM 512MB)
* VP9 uses software decoding
Disclaimer: Estimate only. Not official benchmark results.
18. Workload test: video decoding, Kaby Lake (Core 7th gen)
• Only HW-accelerated video decoding can play 4K 60fps YouTube videos.
• The software fallback uses about 3 times as much memory as hardware decoding.
Test page | software fallback | hardware accelerated
VP9 4K 60fps (YouTube) | video keeps pausing | 50 fps
VP9 2K 60fps (YouTube) | 60 fps (14.9 MB) | 60 fps (4.5 MB)
Intel Kaby Lake i7 NUC NUC7i7BNH (Gen9 graphics, GPU RAM 512MB)
Disclaimer: Estimate only. Not official benchmark results.
19. Intel graphics features: hardware overlay
The display controller can scale and composite overlay planes.
Source: pictures from this doc.
20. Legacy: GPU Composition
• GPU composites all planes
• Composition bandwidth: 2.028GB/s @60fps
[Diagram: the Video Layer, Web contents, and Browser UI buffers (RGBA, read) go through OpenGL GPU Composition into the Render Target (RGBA, write), which is then read as the Primary plane.]
Picture by Dongseong Hwang
21. Brand-new: Hardware Overlay
• Saves bandwidth (2.028 GB/s ⇒ 0.949 GB/s @ 60fps) and power (75 mW)
• GPU is idle, which saves more power.
[Diagram labels: Video Layer, Web contents, Browser UI, OpenGL GPU Composition, Render Target, Primary plane, Overlay Plane; buffer accesses are marked RGBA / R (read) and RGBA / W (write).]
Picture by Dongseong Hwang
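To put the bandwidth figures in perspective, the sketch below computes two things: the memory traffic of one full read or write pass over a single plane, and the percentage saved by the overlay path. The slides do not state the plane sizes behind 2.028 GB/s, so the 1920x1080 RGBA plane is an assumption for illustration only, and the helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: traffic in MB/s for one full pass (one read or one
# write) over a plane. Usage: plane_mb_per_s WIDTH HEIGHT BYTES_PER_PIXEL FPS
plane_mb_per_s() {
    LC_ALL=C awk -v w="$1" -v h="$2" -v bpp="$3" -v fps="$4" \
        'BEGIN { printf "%.1f\n", w * h * bpp * fps / 1000000 }'
}

# Hypothetical helper: percentage saved when going from BEFORE to AFTER.
percent_saved() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

plane_mb_per_s 1920 1080 4 60   # assumed 1080p RGBA plane at 60 fps: 497.7
percent_saved 2.028 0.949       # figures quoted on the slide: 53.2
```

Every extra read or write of a full-screen plane costs roughly the first number again, which is why letting the display controller scan out a layer directly, instead of having the GPU read and rewrite it, cuts the total by about half in the slide's measurement.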
22. Hardware overlay support in CrOS
● Video overlay
○ Enabled in M61.
● ARC++ overlay
○ Enabled in M61.
○ Runs Android applications as Wayland clients in CrOS
● WebGL overlay
○ It will be supported in Gen10 (bug).
23. Benefits of Ozone-gbm
● Performance
○ Zero-copy texture upload is 30~40% faster.
● Memory consumption
○ The GPU process uses 65% less memory.
○ The render process also uses 20% less memory.
● UI Responsiveness
○ Ozone-gbm provides a better frame rate by aligning events from evdev and painting with the vsync signal from the GPU.
○ It significantly reduces touch and stylus latency.
● Power Saving
○ Using hardware features such as hardware overlay, hardware encode/decode, and zero-copy texture upload saves battery.
For more details, see https://software.intel.com/en-us/articles/zero-copy-texture-uploads-in-chrome-os
24. Upstream: Allow to run ozone_demo on a Linux/Intel desktop (Landed)
For more details, see
https://chromium-review.googlesource.com/886836
commit 63612d32bdf79162b3aadf5b80ee61e36624fd14
Author: Joone Hur <joone.hur@intel.com>
Date: Fri Feb 2 22:09:35 2018 +0000
Allow to run ozone_demo on a Linux/Intel desktop
First, we need to enable Intel driver by building
Mesa 17.0.2 with --with-egl-platform=surfaceless
--with-dri-drivers=i965
BUG=733450
TEST=ozone_demo
$ cd ~/git/chromium/src
$ gn gen out/Release "--args=use_ozone=true ozone_platform_gbm=true
use_intel_minigbm=true"
$ ninja -C out/Release ozone_demo
$ export EGL_PLATFORM=surfaceless
$ out/Release/ozone_demo
Change-Id: Ic560b2b4f36701f3c159fd35e771d04c2e1ec97e
Reviewed-on: https://chromium-review.googlesource.com/886836
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Robert Kroeger <rjkroege@chromium.org>
Commit-Queue: Joone Hur <joone.hur@intel.com>
Cr-Commit-Position: refs/heads/master@{#534168}
25. Upstream: Separate display configurator from CrOS build (Landed o/)
For more details, see
https://chromium-review.googlesource.com/c/chromium/src/+/969416
commit d3ae8737f7186154d2fdc3ccc5fe43bf290f91b5
Author: Joone Hur <joone.hur@intel.com>
Date: Tue Apr 17 18:05:09 2018 +0000
Separate display configurator from CrOS build
This CL moves all files in ui/display/manager/chromeos to
ui/display/manager and adds build_display_configuration GN arg
so that the display configurator could be used in Linux
desktop.
BUG=733450
Change-Id: Idd790cfe6f9e5daf6ccad23353573028ebe5d7ee
Reviewed-on: https://chromium-review.googlesource.com/969416
Reviewed-by: Yusuke Sato <yusukes@chromium.org>
Reviewed-by: James Cook <jamescook@chromium.org>
Reviewed-by: Devlin <rdevlin.cronin@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Reviewed-by: Ahmed Fakhry <afakhry@chromium.org>
Reviewed-by: kylechar <kylechar@chromium.org>
Reviewed-by: Malay Keshav <malaykeshav@chromium.org>
26. Upstream: [WIP] Make mus_demo work on a desktop Linux
For more details, see
https://chromium-review.googlesource.com/c/chromium/src/+/1015224
[WIP] Make mus_demo work on a desktop Linux
BUG=733450
TEST=mash --service=mus_demo --enable-features=Mash
$ cd ~/git/chromium/src
$ gn gen out/Release "--args=use_ozone=true ozone_platform_gbm=true
use_intel_minigbm=true"
$ ninja -C out/Release mash:all mus_demo
$ export EGL_PLATFORM=surfaceless
$ out/Release/mash --service=mus_demo --enable-features=Mash
Change-Id: I7ab7db87d74f1e1440f63a8a10118c8394597ad4
27. Upstream: Enable VAVDA, VAVEA and VAJDA on linux with VAAPI only
For more details, see
● https://chromium-review.googlesource.com/c/chromium/src/+/532294
● https://www.phoronix.com/scan.php?page=news_item&px=Chrome-VA-API-Nears
Enable VAVDA, VAVEA and VAJDA on linux with VAAPI only
This patch contains all the changes necessary to use VA-API along with vaapi-driver to run all media use cases supported with hardware acceleration.
It is intended to remain experimental, accessible from chrome://flags on Linux.
It requires libva/intel-vaapi-driver to be installed on the system path where chrome is executed. Other drivers could be tested if available. Flags are kept independent for Linux, where this feature has to be enabled before actually using it. This should not change how other OSes use the flags already; the new flags will show at the bottom, in the section of unavailable experiments.
The changes cover a range of compiler pre-processor flags to enable the stack.
It moves the presandbox operations to the vaapi_wrapper class, as the hook function is available there. vaInit will open the driver in the correct installed folder.
Chrome flags are consolidated into only two flags for Linux: Mjpeg and accelerated video. The other flags are kept for ChromeOS and other OSes.
Developer testing was done on Skylake hardware, ChromeOS, and Ubuntu.
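Since the patch depends on a working VA-API driver, it can help to check which codecs the installed driver can decode before enabling the flags. The sketch below is not from the slides: it assumes the vainfo tool (from libva-utils) is installed, and the helper name decode_profiles is hypothetical. It keeps only the profiles that list the VAEntrypointVLD entrypoint, which is the VA-API decode entrypoint.

```shell
#!/bin/sh
# Hypothetical helper: read vainfo output on stdin and print each profile
# that offers the decode entrypoint (VAEntrypointVLD), one per line.
decode_profiles() {
    awk '/VAEntrypointVLD/ { print $1 }' | LC_ALL=C sort -u
}

# Intended use on a machine with libva-utils installed (assumption):
#   vainfo 2>/dev/null | decode_profiles
```

If the list is empty or vainfo fails, the driver path is probably not set up, and Chromium would be expected to fall back to software decoding.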
28. How to enable ozone-gbm on Yocto Linux
● Prepare the same Linux kernel and Mesa, with their patches, as used for CrOS.
○ https://github.com/joone/poky/tree/pyro_gbm
● Several Chromium patches (link) are needed on M58:
○ Enable the i915 driver in DRM for mini-gbm.
○ Add the CrOS display configurator to Chromium, etc.
● The Chromium browser and the mash demo work.
○ Zero-copy texture upload and video decoding are supported.
● Build Chromium ozone-gbm, the Linux kernel, and Mesa using Yocto recipes:
○ Instructions: https://github.com/joone/meta-crosswalk
29. How to enable ozone-gbm on Arch Linux
• ozone_demo and mus_demo work fine in upstream (r550918_0415_2018)
$ cd ~/git/chromium/src
$ gn gen out/Release "--args=use_ozone=true ozone_platform_gbm=true
use_intel_minigbm=true"
$ ninja -C out/Release ozone_demo mash:all mus_demo
$ export EGL_PLATFORM=surfaceless
$ out/Release/ozone_demo --enable-drm-atomic --enable-overlay
$ out/Release/mash --service=mus_demo --enable-features=Mash
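The commands above can be wrapped so that a failed step stops the sequence instead of running a stale binary. This is only a sketch built from the slide's own commands and flags; the function names gbm_gn_args and build_and_run_ozone_demo are hypothetical, and it assumes gn and ninja are on PATH and that it is run from the Chromium src directory.

```shell
#!/bin/sh
# Hypothetical helper: print the GN args used on the slides as one string.
gbm_gn_args() {
    echo "use_ozone=true ozone_platform_gbm=true use_intel_minigbm=true"
}

# Hypothetical wrapper: generate, build, then run ozone_demo.
# Usage: build_and_run_ozone_demo [OUT_DIR]   (defaults to out/Release)
build_and_run_ozone_demo() {
    out_dir=${1:-out/Release}
    gn gen "$out_dir" "--args=$(gbm_gn_args)" || return 1
    ninja -C "$out_dir" ozone_demo || return 1
    # Surfaceless EGL platform, as exported on the slides.
    EGL_PLATFORM=surfaceless "$out_dir/ozone_demo" \
        --enable-drm-atomic --enable-overlay
}
```

Nothing runs until build_and_run_ozone_demo is called, so the file can be sourced safely on a machine without a Chromium checkout.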
30. Demo
• Demo on Intel Atom processor (Cherry Trail)
31. Future Plan
• Run the Chromium browser or content_shell with ozone-gbm on Intel SoCs.
• Update the Yocto recipe.
• Keep it working at ToT.
• Enable more graphics acceleration features.