Decoding Complexity-Rate-Quality Pareto-Front for Adaptive VVC Streaming

Angeliki Katsenou¹, Vignesh V Menon², Adam Wieckowski², Benjamin Bross², Detlev Marpe²
¹Bristol Digital Futures Institute, University of Bristol, UK
²Video Communication and Applications Dept., Fraunhofer HHI, Germany

MHV’24
Introduction
8-11 Dec 2024 © Fraunhofer
Slide 2
● High-Quality Video Demand
○ Growing demand for video streaming drives advances in video
compression.
○ Versatile Video Coding (VVC) [1] improves compression over High
Efficiency Video Coding (HEVC) [2] but increases computational
complexity.
● HTTP Adaptive Streaming (HAS)
○ Dynamically adjusts video quality based on network conditions [3].
○ Uses bitrate ladders to select optimal resolutions and bitrates.
[1] B. Bross, Y.-K. Wang, Y. Ye, S. Liu, J. Chen, G. J. Sullivan, and J.-R. Ohm, “Overview of the Versatile Video Coding (VVC) Standard and its Applications,” in IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, Oct. 2021, pp. 3736–3764.
[2] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, “Overview of the High Efficiency Video Coding (HEVC) Standard,” in IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, Dec. 2012, pp. 1649–1668.
[3] A. Bentaleb, B. Taani, A. C. Begen, C. Timmerer, and R. Zimmermann, “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP,” in IEEE Communications Surveys & Tutorials, vol. 21, no. 1, 2019, pp. 562–585.
[4] M. Uhrina, L. Sevcik, J. Bienik, and L. Smatanova, “Performance Comparison of VVC, AV1, HEVC, and AVC for High Resolutions,” in Electronics, vol. 13, no. 5, 2024, 953. https://doi.org/10.3390/electronics13050953
Table: Averaged BD-BR savings depending on codec and resolution [4].
Table: Averaged BD-PSNR savings depending on codec and resolution [4].

Introduction
Slide 3
● Challenges
○ Constructing efficient bitrate ladders is crucial for
balancing video quality and decoding complexity.
○ Increased complexity impacts energy-constrained
devices like smartphones.
● Objective
○ Use Pareto-front optimization to balance bitrate,
quality, and decoding complexity.
○ Enhance streaming efficiency while prioritizing quality,
latency, or energy savings.

Objectives
Slide 4
● Goal
○ Develop an optimized approach for bitrate ladder construction in adaptive
streaming.
● Trade-Off Optimization
○ Focus on balancing three key metrics:
■ Bitrate: Minimize network usage.
■ Video Quality: Maximize viewer experience.
■ Decoding Complexity: Reduce computational load and energy
consumption.
● Joint Rate-Quality-Decoding Time Pareto-Front (RQT-PF)
○ Introduce a novel optimization approach to find the best trade-offs.
○ Provide flexibility to prioritize different streaming aspects (e.g., quality,
latency, energy efficiency).
● Outcome
○ Enhance streaming efficiency for diverse scenarios.
○ Improve user experience by selecting the optimal balance for given
constraints.

Overview of Pareto-Front Optimization
Slide 5
● What is Pareto-Front Optimization?
○ A method used to identify optimal trade-offs
between multiple conflicting objectives.
○ Solutions lie on the Pareto front if improving one
metric results in degrading another.
● Application in Video Streaming
○ Used to balance conflicting objectives like bitrate,
video quality, and decoding complexity [1,2].
○ Enables the construction of bitrate ladders that
adapt based on specific priorities (e.g., quality vs.
complexity) [3].
● Why Pareto-Front Matters for Streaming
○ Identifies optimal trade-offs that traditional fixed [4]
or heuristic approaches [5] miss.
○ Supports energy-efficient and adaptive streaming to
enhance QoE.
[1] A. V. Katsenou, F. Zhang, K. Swanson, M. Afonso, J. Sole, and D. R. Bull, “VMAF-based Bitrate Ladder Estimation for Adaptive Streaming,” in 2021 Picture Coding Symposium (PCS), 2021, pp. 1–5.
[2] J. De Cock, Z. Li, M. Manohara, and A. Aaron, “Complexity-based consistent-quality encoding in the cloud,” in 2016 IEEE International Conference on Image Processing (ICIP), 2016, pp. 1484–1488.
[3] A. V. Katsenou, J. Sole, and D. R. Bull, “Efficient Bitrate Ladder Construction for Content-Optimized Adaptive Video Streaming,” in IEEE Open Journal of Signal Processing, vol. 2, 2021, pp. 496–511.
[4] Apple Inc., “HLS Authoring Specification for Apple Devices.” [Online]. Available: https://developer.apple.com/documentation/http_live_streaming/hls_authoring_specification_for_apple_devices
[5] C. Chen, Y.-C. Lin, S. Benting, and A. Kokaram, “Optimized Transcoding for Large Scale Adaptive Streaming Using Playback Statistics,” in 25th IEEE International Conference on Image Processing (ICIP), Oct 2018, pp.
3269–3273.
Rate-XPSNR curves and decoding times of example Inter4K sequences using VVenC /
VVdeC.
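The Pareto-front definition on this slide can be illustrated with a minimal sketch, assuming each encode is summarized by three measured values (bitrate, quality, decoding time). The `Encode` type, its field names, and all numbers below are hypothetical and chosen only for illustration; they are not results from the evaluation.

```python
from typing import NamedTuple


class Encode(NamedTuple):
    """One encoded representation (hypothetical container for illustration)."""
    resolution: int     # frame height in pixels
    qp: int             # quantization parameter
    bitrate: float      # kbps, lower is better
    quality: float      # e.g. XPSNR in dB, higher is better
    decode_time: float  # seconds, lower is better


def dominates(a: Encode, b: Encode) -> bool:
    """True if `a` is no worse than `b` in every objective and strictly better in one."""
    no_worse = (a.bitrate <= b.bitrate
                and a.quality >= b.quality
                and a.decode_time <= b.decode_time)
    strictly_better = (a.bitrate < b.bitrate
                       or a.quality > b.quality
                       or a.decode_time < b.decode_time)
    return no_worse and strictly_better


def pareto_front(points: list[Encode]) -> list[Encode]:
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up measurements for one sequence.
encodes = [
    Encode(2160, 27, 12000.0, 44.0, 9.0),
    Encode(1080, 27, 5000.0, 41.0, 4.0),
    Encode(1080, 22, 9000.0, 42.0, 5.0),
    Encode(720, 22, 6000.0, 40.0, 4.5),  # dominated by the 1080p/QP27 point
]
front = pareto_front(encodes)
```

In this toy data the 720p point is worse than the 1080p/QP27 point in all three objectives, so it drops out; the other three each win on at least one objective and stay on the front.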

Illustration of RQT-PF Optimization
Slide 6
● Exploring the Parameter Space
○ RQT-PF optimization identifies the optimal
encoding configurations by exploring different
parameter combinations.
○ Parameters considered: resolution and
quantization parameter (QP).
● Trade-Off Visualization
○ Higher resolutions and lower QP values provide
better video quality but increase bitrate and
decoding time.
○ Lower resolutions and higher QP values reduce
bitrate and decoding time but compromise video
quality.
● Rate-Quality-Decoding Time Space
○ Visualize the Rate-Quality-Decoding Time (RQT)
parameter space across different encoding
settings.
○ Different points in this space represent various
trade-offs, such as higher quality vs. increased
complexity.
Quality-Rate-Time points for Inter4K across six spatial resolutions.

Algorithm for RQT-PF Estimation
Slide 7
● Multi-Objective Optimization
○ Objective: Minimize bitrate and decoding time while
maximizing quality.
○ Optimization is performed in the Rate-Quality-Time
(RQT) space to identify Pareto-optimal points.
● Composite Metric
○ Define a composite metric M as a combination of
decoding time and bitrate:
■ $M = \alpha \log(\tau_D) + (1 - \alpha) \log(b)$, where $\tau_D$ is the
decoding time, $b$ is the bitrate, and $\alpha$ controls the importance of
decoding time vs. bitrate.
○ Goal: Minimize M and maximize quality simultaneously.
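A minimal sketch of how the composite metric could drive the selection, assuming natural logarithms (the slide does not state the base; a different base only rescales M and leaves the ordering unchanged) and a simple sweep to find the points that are Pareto-optimal in the (M, quality) plane. The function names, labels, and numbers are hypothetical and not taken from the paper.

```python
import math


def composite_metric(decode_time: float, bitrate: float, alpha: float) -> float:
    """M = alpha * log(decode_time) + (1 - alpha) * log(bitrate)."""
    return alpha * math.log(decode_time) + (1.0 - alpha) * math.log(bitrate)


def mq_pareto_front(candidates, alpha):
    """Return (label, M, quality) for points that minimize M and maximize quality.

    `candidates` is a list of (label, bitrate, quality, decode_time) tuples.
    """
    scored = sorted(
        ((composite_metric(t, b, alpha), q, label) for label, b, q, t in candidates),
        key=lambda s: (s[0], -s[1]),  # ascending M, best quality first on ties
    )
    front, best_quality = [], -math.inf
    for m, q, label in scored:
        # A point survives only if it beats the quality of every cheaper point.
        if q > best_quality:
            front.append((label, m, q))
            best_quality = q
    return front


# Made-up candidates: (label, bitrate in kbps, quality in dB, decoding time in s).
candidates = [
    ("A", 4000.0, 41.0, 6.0),   # low bitrate, slow to decode
    ("B", 6000.0, 40.5, 3.0),   # higher bitrate, fast to decode
    ("C", 12000.0, 44.0, 9.0),  # highest quality, most expensive
]
front_low_alpha = [label for label, _, _ in mq_pareto_front(candidates, 0.25)]
front_high_alpha = [label for label, _, _ in mq_pareto_front(candidates, 0.75)]
```

With α = 0.25 the bitrate term dominates M, so B (higher bitrate, lower quality than A) is dropped. With α = 0.75 decoding time dominates, B becomes the cheapest point, and all three stay on the front.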

Evaluation Setup
Slide 8
● Dataset: Inter4K [1]
○ 1,000 UHD clips at 60 frames per second.
○ Videos cover a range of content types: urban environments, nature,
sports, music videos, etc.
● Target Codecs
○ Encoder: VVenC (Versatile Video Encoder) [2].
○ Decoder: VVdeC (Versatile Video Decoder) [3].
● Performance Metrics
○ Quality Metrics: PSNR (Peak Signal-to-Noise Ratio), XPSNR [4].
○ Bjøntegaard Delta Metrics: BDR (Rate), BD-PSNR, and BD-XPSNR,
used to compare average performance across methods.
○ Decoding Time Difference (ΔT_D): Measures the efficiency of each
method in terms of decoding time compared to a reference.
[1] A. Stergiou and R. Poppe, “AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling,” in IEEE Transactions on Image Processing, vol. 32, 2023, pp. 251–266.
[2] A. Wieckowski, J. Brandenburg, T. Hinz, C. Bartnik, V. George, G. Hege, C. Helmrich, A. Henkel, C. Lehmann, C. Stoffers, I. Zupancic, B. Bross, and D. Marpe, “VVenC: An Open and Optimized VVC Encoder Implementation,” in 2021 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), Jul. 2021, pp. 1–2.
[3] A. Wieckowski, G. Hege, C. Bartnik, C. Lehmann, C. Stoffers, B. Bross, and D. Marpe, “Towards a Live Software Decoder Implementation for the Upcoming Versatile Video Coding (VVC) Codec,” in Proc. IEEE International Conference on Image Processing (ICIP), pp. 3124–3128.
[4] C. R. Helmrich, M. Siekmann, S. Becker, S. Bosse, D. Marpe, and T. Wiegand, “XPSNR: A Low-Complexity Extension of the Perceptually Weighted Peak Signal-to-Noise Ratio for High-Resolution Video Quality Assessment,” in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2020, pp. 2727–2731.

Benchmarks
Slide 9
● FixedLadder [1]
○ Uses Apple's HLS (HTTP Live Streaming) bitrate ladder specification.
○ Fixed set of bitrate-resolution pairs, lacks adaptation to different
content.
● Default
○ Encodes only at native resolution (2160p) for all bitrates.
○ Demonstrates limitations in covering the whole bitrate range, especially
at lower bitrates.
● DynResXPSNR [2]
○ Dynamic approach that selects the resolution with the highest XPSNR
for each target bitrate.
○ Optimizes video quality for specific bitrates without considering
complexity.
[1] Apple Inc., “HLS Authoring Specification for Apple Devices.” [Online]. Available: https://developer.apple.com/documentation/http_live_streaming/hls_authoring_specification_for_apple_devices
[2] V. V. Menon, C. R. Helmrich, A. Wieckowski, B. Bross, and D. Marpe, “Convex-hull Estimation using XPSNR for Versatile Video Coding,” in 2024 IEEE International Conference on Image Processing (ICIP), 2024. [Online].
Available: https://arxiv.org/abs/2406.13712
Table: FixedLadder used in this paper [1]
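The DynResXPSNR selection rule described above (for each target bitrate, pick the resolution with the highest XPSNR) can be sketched as follows. This is a minimal illustration, not the benchmark's implementation: it assumes each resolution has a measured rate-XPSNR curve and that quality at a target bitrate is obtained by linear interpolation, which the slides do not specify. All names and numbers are hypothetical.

```python
from bisect import bisect_left


def interpolate(curve, bitrate):
    """Linearly interpolate quality at `bitrate` on a curve sorted by bitrate.

    Returns None when the target lies outside the measured bitrate range.
    """
    rates = [r for r, _ in curve]
    if bitrate < rates[0] or bitrate > rates[-1]:
        return None
    i = bisect_left(rates, bitrate)
    if rates[i] == bitrate:
        return curve[i][1]
    (r0, q0), (r1, q1) = curve[i - 1], curve[i]
    return q0 + (q1 - q0) * (bitrate - r0) / (r1 - r0)


def build_ladder(curves, targets):
    """For each target bitrate, pick the resolution with the highest quality."""
    ladder = []
    for target in targets:
        best = None  # (resolution, quality) of the best candidate so far
        for resolution, curve in curves.items():
            quality = interpolate(curve, target)
            if quality is not None and (best is None or quality > best[1]):
                best = (resolution, quality)
        if best is not None:
            ladder.append((target, best[0], best[1]))
    return ladder


# Made-up rate-XPSNR curves: resolution -> [(bitrate in kbps, XPSNR in dB), ...].
curves = {
    1080: [(1000, 34.0), (4000, 40.0), (8000, 42.0)],
    2160: [(3000, 36.0), (8000, 43.0), (16000, 46.0)],
}
ladder = build_ladder(curves, [2000, 4000, 8000, 16000])
chosen_resolutions = [resolution for _, resolution, _ in ladder]
```

In this toy data the lower resolution wins at the two lowest targets and the native resolution takes over from 8000 kbps, which mirrors the crossover behaviour that content-adaptive ladders exploit. Decoding time plays no role in the choice, which is the limitation the slide points out.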

Example Results
Slide 10
● Sequence Analysis
○ Selected video sequences with varying spatio-temporal
features (e.g., texture energy, motion, luminance) [1].
○ RQT-PF with α = 0.25 closely matched DynResXPSNR for
rate-quality, while significantly reducing decoding time.
● Quality and Decoding Time
○ Default method provides highest quality at high bitrates but
has the longest decoding times.
○ RQT-PF achieves better balance between quality and
decoding time compared to other methods.
● Observation
○ RQT-PF adapts well to different sequences, providing the
best trade-offs in terms of quality and complexity.
○ Highlights the flexibility of RQT-PF to fit specific streaming
requirements (e.g., low latency, high efficiency).
[1] V. V. Menon, C. Feldmann, K. Schoeffmann, M. Ghanbari, and C. Timmerer, “Green Video Complexity Analysis for Efficient Encoding in Adaptive Video Streaming,” in Proceedings of the First International Workshop on Green Multimedia Systems, 2023, pp. 16–18.

Performance Comparison
Slide 11
● Benchmark Comparisons
○ Default
■ High bitrate savings (-25.53% to -42.41%) but
high decoding time (+159.39%).
○ DynResXPSNR
■ Excellent bitrate savings (-28.97% to -46.43%)
but at the cost of increased decoding time
(+72.68%).
● RQT-PF Results
○ α = 0.75
■ Moderate bitrate reduction (-4.60%) but
significant decoding time reduction (-14.86%).
○ α = 0.25
■ Largest bitrate savings (-26.25% to -37.40%) with
lower emphasis on reducing decoding time.
● Trade-Off Flexibility
○ Different values of α provide flexibility in balancing
between bitrate savings, quality, and decoding time.
○ Higher α values prioritize decoding efficiency, while
lower α values focus on quality and bitrate.

PDF Analysis of Results
Slide 12
● Probability Density Functions (PDFs)
○ Compared decoding time, bitrate, XPSNR, and VMAF across
different methods using PDFs.
○ PDFs illustrate how RQT-PF distributes performance across these
metrics.
● Decoding Time Analysis
○ RQT-PF skews towards lower decoding times, making it ideal for
low-latency streaming.
○ DynResXPSNR results in higher decoding times, limiting its use for
energy-constrained devices.
● Bitrate Distributions
○ Bitrate PDFs show consistency across methods, indicating similar
efficiency in bitrate usage.

PDF Analysis of Results
Slide 13
● Quality Distributions
○ RQT-PF achieves competitive quality (XPSNR, VMAF), especially
with α = 0.25, closely matching DynResXPSNR.
● Trade-Offs in Quality Metrics
○ PDFs for VMAF indicate that DynResXPSNR and RQT-PF (α = 0.25)
provide higher perceptual quality as predicted by VMAF.
○ RQT-PF effectively balances bitrate and quality based on the chosen
value of α.

Key Findings
Slide 14
● Flexibility of RQT-PF
○ RQT-PF effectively balances bitrate, quality, and decoding complexity.
○ Allows streaming services to prioritize specific needs, such as low latency, high quality, or bandwidth savings.
● Impact on Streaming Quality
○ Achieved significant decoding time reduction (up to 14.86%) compared to FixedLadder.
○ Provided bitrate savings while maintaining or improving video quality (PSNR, VMAF, XPSNR).
● Tailored Streaming Solutions
○ By adjusting α, RQT-PF can adapt to a variety of streaming requirements:
■ Low Latency (α = 0.75): Prioritizes reduced decoding complexity.
■ High Quality (α = 0.25): Focuses on maximizing video quality for given bitrates.

Conclusion
Slide 15
● Summary of Contributions
○ Introduced RQT-PF, a novel optimization approach for adaptive bitrate ladder construction in VVC streaming.
○ Balanced bitrate, video quality, and decoding complexity to improve Quality of Experience (QoE).
● Benefits of RQT-PF
○ Achieved significant decoding time reduction (up to 14.86%).
○ Provided flexible trade-offs to suit different streaming scenarios, whether prioritizing quality, latency, or energy
efficiency.
○ Improved streaming efficiency, especially for energy-constrained devices and bandwidth-limited environments.
● Practical Impact
○ Streaming platforms can leverage RQT-PF for more efficient video encoding strategies.
○ Supports energy-efficient playback, enhancing viewing experiences for mobile users.
● Future Work
○ Explore subjective evaluations of streaming quality for RQT-PF configurations.
○ Investigate energy optimization by using decoding time as a proxy for energy consumption.
○ Expand RQT-PF to consider additional metrics like end-to-end latency.

Thank you for your attention
● Angeliki Katsenou (angeliki.katsenou@bristol.ac.uk)
● Vignesh V Menon (vignesh.menon@hhi.fraunhofer.de)
● Adam Wieckowski (adam.wieckowski@hhi.fraunhofer.de)
● Benjamin Bross (benjamin.bross@hhi.fraunhofer.de)
● Detlev Marpe (detlev.marpe@hhi.fraunhofer.de)
